Bug #22127
closedevent loop slowdowns during normal data table page loads
Added by Peter Amstutz 6 months ago. Updated 12 days ago.
Description
The red bars represent a stalled event loop, this means the browser is basically unresponsive during these times. With a debug build, total time spent is about 8s.
The release build is better, but event loop is still blocked for about 2s:
One thing that is suspicious is that the stacked blue bars represent "requestContainerStatusCount" and it is only when all those requests are completed that there is a break in the event loop being blocked.
Files
wb2-debug-jank.png (325 KB) wb2-debug-jank.png | Peter Amstutz, 09/17/2024 05:55 PM | ||
wb2-release-jank.png (342 KB) wb2-release-jank.png | Peter Amstutz, 09/17/2024 05:58 PM | ||
profile_w_stagger.png (279 KB) profile_w_stagger.png | Lisa Knox, 12/18/2024 09:50 PM | ||
projectpanelatarefresh.png (344 KB) projectpanelatarefresh.png | Lisa Knox, 12/18/2024 09:50 PM | ||
actionCount_base.png (98.7 KB) actionCount_base.png | Lisa Knox, 12/18/2024 09:50 PM | ||
actionCount_no_setCount.png (94.8 KB) actionCount_no_setCount.png | Lisa Knox, 12/18/2024 09:50 PM | ||
actionCount_no_midware.png (89.3 KB) actionCount_no_midware.png | Lisa Knox, 12/18/2024 09:50 PM | ||
actionCount_modified.png (95.1 KB) actionCount_modified.png | Lisa Knox, 12/18/2024 09:58 PM | ||
arvados-workbench2-3.1.0~dev20250210170035-1.x86_64.rpm (4.76 MB) arvados-workbench2-3.1.0~dev20250210170035-1.x86_64.rpm | Peter Amstutz, 02/17/2025 04:27 PM |
Updated by Peter Amstutz 6 months ago
- Subject changed from filterResources is inefficient to event loop slowdowns during normal data table page loads
Updated by Peter Amstutz 6 months ago
- Assigned To changed from Peter Amstutz to Stephen Smith
Updated by Peter Amstutz 6 months ago
- Description updated (diff)
- File wb2-release-jank.png wb2-release-jank.png added
- File wb2-debug-jank.png wb2-debug-jank.png added
Updated by Peter Amstutz 6 months ago
From the "flame" chart, it also looks like it might be doing multiple expensive re-renders in that 'blocked event loop' period. Perhaps this is happening each time the redux state changes? What is happening here is that actions trigger other actions which independently mutate the state in several steps, which might trigger intermediate renders that are never seen by the user but are still slows things down.
Updated by Peter Amstutz 6 months ago
- Target version changed from Development 2024-09-25 sprint to Development 2024-10-09 sprint
Updated by Peter Amstutz 5 months ago
- Target version changed from Development 2024-10-09 sprint to Development 2024-10-23 sprint
Updated by Peter Amstutz 5 months ago
- Target version changed from Development 2024-10-23 sprint to Development 2024-11-06 sprint
Updated by Peter Amstutz 4 months ago
- Target version changed from Development 2024-11-06 sprint to Development 2024-11-20
Updated by Peter Amstutz 4 months ago
- Target version changed from Development 2024-11-20 to Development 2024-12-04
Updated by Peter Amstutz 3 months ago
- Target version changed from Development 2024-12-04 to Future
Updated by Peter Amstutz 3 months ago
- Target version changed from Future to Development 2025-01-29
Updated by Peter Amstutz 3 months ago
- Target version changed from Development 2025-01-29 to Development 2025-01-08
- Assigned To set to Lisa Knox
Updated by Lisa Knox 3 months ago
- File actionCount_modified.png actionCount_modified.png added
- File actionCount_no_midware.png actionCount_no_midware.png added
- File actionCount_no_setCount.png actionCount_no_setCount.png added
- File actionCount_base.png actionCount_base.png added
- File projectpanelatarefresh.png projectpanelatarefresh.png added
- File profile_w_stagger.png profile_w_stagger.png added
What I believe is the primary source of lag is the way Data Explorer middlewares are initiated combined with the fact that all each new middleware triggers its own SET_COLUMNS call in each Data Explorer. There are 19 Data Explorers and 1 middleware for each, leading to an n^2 situation (19 * 19 = 361). Each Data Explorer has its own set of actions that are dispatched as well.
Here are all the actions dispatched when Workbench, viewing the Project Panel, is refreshed on the main branch (note that many things happen exactly 19 times, some happen 38 times (19 * 2), and SET_COLUMNS happens 380 times (19^2 + 19):
If we remove the calls to SET_COLUMNS, we see a marked improvement:
If we remove the middlwares that are not directly affecting the Project Panel:
In this state, workbench works 100% as expected as long as the only Data Explorer viewed is the Project Panel, including switching between projects. There is still some lag when switching between projects. This might be addressed by using web workers to offload some of the work to another JS thread and/or using scheduler.yield() where appropriate, which is where I will focus my efforts after this initial phase.
The long-term solution to the SET_COLUMNS problem is a substantial rewrite with the aim of lazily-loading each Data Explorer when needed, instead of loading all 19 on the initial page load/refresh.
I am still investigating alternatives to our current middleware implementation. These cannot be done lazily, but if we can offload their side effects until they are needed, we can still save a thousands of unnecessary actions.
I also investigated the "requestContainerStatusCount" issue, and implemented a wrapper that just yields control momentarily after each of the requests resolves. This doesn't speed up the page load time (it actually increases it by a tiny but nonzero amount), but it does break up the jank at several points, meaning UI input can be handled more often.
I have also applied some memoization to reduce some of the more problematic actions. The Data Table now renders twice instead of 16 times per page load, and a slight refactor means the side panel tree item renderer gets called 98 times vs 512 times. The reason for this is that it used a calculation that was located within the mapStateToDispatch for that component. This means it was re-calculated every time Redux state updated, whether or not the calculation was relevant to the component. Best prectice is that mapStateToProps should only select and map state properties directly, not compute derived data. Going through and removing any similar instances of this will go a long way to removing excess re-renders.
So far, my work on 22127-optimization @ 1b8d65a0bf940d5a7fe4a13bafc377ecdd3d0dad has yielded the following results:
The gross jank time, beginning to end, in the above picture is just under 3 seconds with the longest individual jank lasting 835ms. By comparison, Instagram, possibly the most optimized React application there is, averages 1.5-2 seconds of jank time on (logged-in) refresh.
Updated by Lisa Knox 2 months ago
We should also change it so that as soon as the user clicks to navigate to a new url, a loading screen is shown until the next resource is loaded. This will help mitigate visual confusion, as it is possible now that the Details Card of the new resource will be displayed before the data table changes.
Updated by Peter Amstutz 2 months ago
Lisa Knox wrote in #note-16:
We should also change it so that as soon as the user clicks to navigate to a new url, a loading screen is shown until the next resource is loaded. This will help mitigate visual confusion, as it is possible now that the Details Card of the new resource will be displayed before the data table changes.
+1
We discussed a few weeks ago using "skeleton" loading indicators (#22360), which are a bit less jarring & produce less flicker (compared to wiping the whole panel and showing a spinner) when transitioning between contents. I agree that when the old visual content stays on screen too long, it's very confusing.
Updated by Peter Amstutz about 2 months ago
- Target version changed from Development 2025-01-08 to Development 2025-01-29
Updated by Lisa Knox about 2 months ago
22127-optimization @ ef59cf0e2bd8eb09df15f2d3e6951078df8caba4
developer-run-tests-services-workbench2: #1396
- - All agreed upon points are implemented / addressed.
- Points implemented/addressed are listed below
- ✅ Anything not implemented (discovered or discussed during work) has a follow-up story.
- The biggest improvement that could still be made is lazily loading the Data Explorers, but this is a major refactor that is probably not worth the effort
- ✅ Code is tested and passing, both automated and manual, what manual testing was done is described
- Manually tested by clicking around madly and trying to trigger UI changes while the page was loading
- ✅ New or changed UX/UX and has gotten feedback from stakeholders.
- - Documentation has been updated.
- n/a
- ✅ Behaves appropriately at the intended scale (describe intended scale).
- Intended scale is a single user navigating through the appplication
- ✅ Considered backwards and forwards compatibility issues between client and server.
- No changes to server interaction
- ✅ Follows our coding standards and GUI style guidelines.
- The main improvements here:
- `fetchProcessProgressBarStatus` now resolves promises sequentially instead of using `Promise.all()`
- Context Menus are now rendered lazily instead of on Data Explorer initialization
- moved several calculations that were done inside `mapStateToProps` to inside their components
- memoized lots of calculations and components, focusing on those that rerender excessively
- added `shouldComponentUpdate` to the larger class components so they only update when necessary
- removed several unused props in various components
- updated tests because some components now take different props
- moved contextMenu test from component to e2e testing
Updated by Lisa Knox about 2 months ago
22127-wb-optimization @ 4fb545539ef6c7453aab4058f0cc415ca2a34b63
developer-run-tests-services-workbench2: #1398
The changes in #22159 are incompatible with the changes in this branch. After some discussion, the decision was made to revert those changes in favor of these changes. The 22127-wb-optimization branch is the new branch rebased onto main after reverting #22159.
Updated by Peter Amstutz about 1 month ago
- Target version changed from Development 2025-01-29 to Development 2025-02-12
Updated by Stephen Smith about 1 month ago
- I like the use of
preventRerender
to suppress extra renderings, but I think it would be easier to understand if it was inverted to the positive eg. shouldRender, so that understanding it doesn't involve double negatives like preventRerender=false means rendering - In the subprocess progress bar, the
dataExplorerId
is intended to be used when the progress bar should respect the type filter of an associated data explorer. Since the subprocess panel doesn't have a type filter, havingdataExplorerId
default to the subprocess panel doesn't really do anything, and could cause confusion when it's left blank (and defaults to subprocess panel), so I would recommend leaving that without a default - On the project panel runs tab, it seems that changing the type filter takes effect, but the filter selection UI doesn't update, and the progress bar also doesn't re-render to update to the new selection if the new filter changes the number of failed runs
- Also in the subprocess progress bar, the
typeFilter
prop is no longer used since it's calculated inside the component, so it can be removed fromProgressBarDataProps
- I noticed that
progressIndicatorActions
is nowprogressIndicatorsActions
, since it's really 1 shared progress indicator I think keeping it singular is better since it's easier to type and also matches the filename - In
collection-panel.tsx:291
, we can get rid of the type cast if we wrap the contents inif(this.state.item)
, should make it a bit safer - I noticed the change from dispatching
getResource
to indexing intostate.resources
, if the removal of the dispatch is necessary for performance, I would recommend abstracting that into a helper (maybe getResourceFromState or something) with type paramT extends Resource = Resource
that returnsT | undefined
, as it is, indexing into the resource array causes TS to think it's guaranteed to get a Resource back, so this would make sure the dispatch-less resource fetch has the same type safety - I also noticed that getProperty is replaced by indexing into
state.properties
, since getProperty is synchronous and not dispatched but a closure that just indexes into the state.properties that it's passed, I think the performance should be the same so I think for readability we should stick togetProperty
- Similar thing with
getUserUuid
incollection-panel.tsx
which was replaced withstate.auth.user?.uuid
, getUserUuid is basically doing the same thing so I think it shouldn't hurt performance to stick to using it
Updated by Peter Amstutz about 1 month ago
Stephen Smith wrote in #note-25:
- I like the use of
preventRerender
to suppress extra renderings, but I think it would be easier to understand if it was inverted to the positive eg. shouldRender, so that understanding it doesn't involve double negatives like preventRerender=false means rendering
I had glanced at the code before and had the same thought, but I think what's going on here is that the semantics of react.memo is testing for equality of old/new values. So probably we just want to give the methods a less confusing name, like hasSameContent()
.
Updated by Lisa Knox 29 days ago
22127-wb-optimization @ 1674e22384022cd705a01819ff234718600883b5
developer-run-tests-services-workbench2: #1408
- I like the use of
preventRerender
to suppress extra renderings, but I think...
I'm actually going to push back on this one, as I went back and forth several times on the name of this function and decided that prevertRerender
is best. The callback is defined by react to only allow rerender if it returns false. Peter's suggestion of hasSameContent() doesn't tell you what the function does, and someone who is not familiar with React.memo wouldn't know what the function does at first glance. There is a comment at each instance explaining to return true to prevent re-render, false to allow re-render
. Not willing to die on this hill, but I feel strongly that preventRerender is the best name for this function.
- In the subprocess progress bar, the
dataExplorerId
is intended...
I put the default in during an intermediate stage and forgot to remove it.
- On the project panel runs tab, it seems that changing the type filter...
Filter selection and progress bar update properly now
- Also in the subprocess progress bar, the
typeFilter
prop is no longer used since it's calculated inside the component, so it can be removed fromProgressBarDataProps
done
- I noticed that
progressIndicatorActions
is nowprogressIndicatorsActions
, since it's really 1 shared progress indicator I think keeping it singular is better since it's easier to type and also matches the filename
Again, at an intermediate stage, that rename made sense.
- In
collection-panel.tsx:291
, we can get rid of the type cast if we wrap the contents inif(this.state.item)
, should make it a bit safer
done
- I noticed the change from dispatching
getResource
to indexing...
Fixed, and getResourceFromState was applied a few other placed to take advantage of its memoization. There were some places where getResourceFromState would not work, possibly due to the memoization of the child component? In these places, I added explicit type annotations and null checks instead.
- I also noticed that getProperty is replaced by indexing into
state.properties
...
done
I also memoized breadcrumbs, which got skipped in the first iteration of the ticket
Updated by Stephen Smith 29 days ago
- I've just realized that
getResource
is not a dispatchable thunk so performance-wise it should be about as performant as:
const getResource = <T extends Resource = Resource>(id: string, state: ResourcesState): T | undefined => {
return state[id] as T;
}
And at any rate the getResourceFromState
is functionally identical to getResource which is what clued me in lol, so the original use of getResource
doesn't actually involve dispatch and just adds a plan JS function call overhead on top of indexing into the resource state which shouldn't need any more optimization beyond memoization and doesn't need a separate helper - sorry for the runaround on that one
- The project panel runs tab filters now work when clicked, but the progress bar doesn't update when the type filter is changed - the existing behavior is that when the type filter is set to run/step/run+step then the progress bar reflects the progress of that type (status filters don't affect it since any status filter would result in a single segmented progress bar)
- I also saw there are still a few remaining instances of directly indexing into resources / properties that can be reverted:
- project-panel-run.tsx
getResource<ProjectResource>(projectUuid)(state.resources);
getProjectPanelCurrentUuid(state) || "";
- details-card-root.tsx -
getResource(properties.currentRouteUuid)(resources);
- workflow-panel.tsx -
getProperty<string>(WORKFLOW_PANEL_DETAILS_UUID)(state.properties);
- breadcrumbs.ts -
(getProperty<Breadcrumb[]>(BREADCRUMBS)(properties) || [])
- project-panel-run.tsx
Updated by Lisa Knox 27 days ago
22127-wb-optimization @ 5ff94d2dfe2fa28db530c0bfb2c7768b525f23b3
developer-run-tests-services-workbench2: #1416
The Details Card reversion caused a series of other bugs that it was decided in discussion were not worth chasing down. That reversion was skipped and the indexing into the state object remains.
The Project details card had a similar intermittent bug with the current item being unavailable, so I added a null check.
Everything else was updated per feedback
Updated by Stephen Smith 26 days ago
Passing this back for now after some discussion about the progress bar not updating and reversion of the removal of getState/Property
Updated by Lisa Knox 21 days ago
22127-wb2-optimization @ fffd7eada66be02cf4f3eb1ad452222a2175af0d
developer-run-tests-services-workbench2: #1430
(all workbench tests pass)
All points addressed, plus I discovered a bug where items in the sidePanelTree wouldn't update their frozen status, which is fixed now.
Updated by Peter Amstutz 20 days ago
- Target version changed from Development 2025-02-12 to Development 2025-02-26
Updated by Stephen Smith 19 days ago
- It looks like there are a bunch of merge conflicts now with main, so I would recommend pulling in main again
- In
tree.tsx
, the TreeItemWeight enum can be auto incrementing for simplicity by removing= N
from each line main-panel-root.tsx
andprogress-indicator-actions.ts
, there are 2 WORKBENCH_LOADING_SCREEN const definitionscollection-panel.tsx
I noticed that componentDidMount checks ifthis.state.item
is set, but sincestate.item
is initialized as null and the only places that setState is insidecomponentDidMount
andcomponentDidUpdate
, which I think is only called aftercomponentDidMount
, I think that check will always return false preventing that block from running. It looks like the purpose is to initialize the state if there is an item, so should it be checkingif(!this.state.item)
or something else? I'm also wondering if checking the state is necessary since you might want to always overwrite the state with the collection fetched usingmatch.params.id
1 line above, regardless if there's already state. I'm not sure but just want to double check that- In the same file it also doesn't look like
isFrozen
is being set anywhere aside from the state initialization to null, but it's checked in 1 place
- In the same file it also doesn't look like
- In
resources.ts
, the getResource nested memoize doesn't seem to work if the resource changes but the ID is the same - the outer memoize will return the same value without evaluating whether the resource store in the inner memoize changed. Recommend removing the outer memoize so only memoizing on the inner function that accepts resource state - In
group-details-panel
we should stick togetUserUuid(state)
for the user uuid - Also we can use
getCurrentGroupDetailsPanelUuid(state.properties)
since we already use that helper function elsewhere - In
project-panel-run
andproject-panel
we should usegetProjectPanelCurrentUuid(state)
to get theprojectUuid
/currentItemId
to keep things consistent - In
project-input
line 59 we should usegetUserUuid(state)
since it does the same thing and is more consistent - I noticed a lot of guards for getResource for undefined UUIDs, what if we change
getResource
to acceptstring | undefined
and then doid ? state[id] as T : undefined
, then all consumers can just call getResource without worry (it already returns undefined when resource isn't found by indexing into resources with an invalid key so callers shouldn't need any changes to handle that)
Updated by Peter Amstutz 15 days ago
- File arvados-workbench2-3.1.0~dev20250210170035-1.x86_64.rpm arvados-workbench2-3.1.0~dev20250210170035-1.x86_64.rpm added
Attached in-development rocky8 package with Workbench 2 optimizations, for testing.
Updated by Lisa Knox 14 days ago
22127-wb2-optimization @ 0f277306342386aba05c8c6569740ed6381f9be0
developer-run-tests-services-workbench2: #1436
All bullet points addressed and main merged into this branch.
Updated by Lisa Knox 13 days ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|6e4553995eb45424e666b38f13948e3d04b949aa.