Project

General

Profile

Actions

Idea #9319

closed

[Workbench] All processes index

Added by Brett Smith almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Radhika Chippada
Category:
Workbench
Target version:
Start date:
05/31/2016
Due date:
Story points:
1.0

Description

Implement an index page that displays all processes (containers/pipeline instances/jobs).

The index page should use the same partial to render processes that we use to render child processes for pipeline instances and jobs.

The index page should include a search box to live filter results.

Remove the current "All pipeline instances" and "All jobs" buttons from the Dashboard. On the Recently Finished Processes pane, add a "All processes" button that links to this page.


Subtasks 1 (0 open1 closed)

Task #9554: Review branch 9319-all-processes-indexResolvedRadhika Chippada05/31/2016Actions

Related issues

Related to Arvados - Idea #9318: [Workbench] Update Dashboard to show processesResolvedRadhika Chippada06/01/2016Actions
Actions #1

Updated by Radhika Chippada almost 8 years ago

  • Status changed from New to In Progress
  • Assigned To set to Radhika Chippada
  • Target version set to 2016-07-06 sprint
  • Story points set to 2.0
Actions #2

Updated by Radhika Chippada almost 8 years ago

  • Target version changed from 2016-07-06 sprint to 2016-07-20 sprint
Actions #3

Updated by Radhika Chippada almost 8 years ago

  • % Done changed from 0 to 60
  • Story points changed from 2.0 to 1.0
Actions #4

Updated by Tom Clegg almost 8 years ago

First impression: Loading is really slow -- it takes >30 seconds to load one page of infinite scroll. There are tons of API calls, including
  • fetching the same pipeline template multiple times per row (even though no pipeline template info is shown here)
  • fetching links for multiple collections per row (fetching the links could be existing cruft from back when that's how collection name lookups worked... but why multiple collections per row, when only the output is shown?)

Can we fix some of this?

Seems like the controller should be called "WorkUnitsController" rather than AllProcessesController...
  • "process" is just a user-facing term for a work unit, we should still call them work units in the code
  • this controller isn't "all x" any more than the jobs controller is "all jobs"

These integration tests look like they spend a lot of time testing infinite scroll, which is already tested elsewhere. How about just a controller test that tests the behavior of this controller specifically?

(TBC)

Actions #5

Updated by Tom Clegg almost 8 years ago

It looks like find_objects_for_index has a bunch of code copied from projects_controller. Can we find some way to re-use code here instead of adding debt?

When merging results from multiple queries with limit=N, I think we need to truncate the merged set to N entries before displaying or generating the next-page filter. Otherwise, we'll get something like this:
  • receive pipeline instances with created_at={25,24,23,22,21}
  • receive jobs with created_at={20,19,18,17,16}
  • receive container requests with created_at={15,14,13,12,11}
  • sort all together by created_at, take created_at of last entry → 11
  • next page query filter is "created_at <= 11" ... so we'd never see a pipeline instance with created_at=20.

A more concise way to merge without the intermediate "procs" hash:

@objects = (jobs.to_a + pipelines.to_a + crs.to_a).sort_by(&:created_at).reverse

Re. "lots of 'links' queries" issue mentioned above: the preload_links_for_objects call in load_contents_objects is why the project contents tabs don't have this problem.

Instead of [%w(uuid is_a) + [%w(arvados#pipelineInstance)]] how about just [["uuid", "is_a", ["arvados#pipelineInstance"]]] ...? Same result, but easier to read...

Actions #6

Updated by Radhika Chippada almost 8 years ago

Loading is really slow ... fetching the same pipeline template multiple times per row (even though no pipeline template info is shown here)

pipeline_instance.rb -> friendly_link_name fetches the pipeline_template when the pipeline_instance has no name. This method was not using cache and hence fetching it again and again. Enhanced this method to use cache

fetching links for multiple collections per row

Added preload_links_for_objects

Seems like the controller should be called "WorkUnitsController" rather than AllProcessesController...

Changed this. This required a view/work_units folder. Hence, I moved all the files from the views/work_unit folder into the work_units folder and updated all references to make sure we do not have confusion with the two folders

These integration tests look like they spend a lot of time testing infinite scroll, which is already tested elsewhere. How about just a controller test that tests the behavior of this controller specifically?

Left one test in integration tests to ensure the scrolling behavior is working correctly (which in fact helped find a bug - explained below) and moved the rest into controller tests

It looks like find_objects_for_index has a bunch of code copied from projects_controller. Can we find some way to re-use code here instead of adding debt?

Much improved now. The index method was being called twice due to infinite scrolling. Updated the find_objects_for_index to not fetch objects when the partial is not present. Do you see any issues with this that I missed? Also, luckily the integration test showed that the previous last_created_at time filter was using seconds precision due to which objects matching this time with additional nanoseconds (and hence greater than created_at being sought) were not getting pulled. This is now using ns precision for the created_at filter. (May be we have this issue at other places in the code ...). I did not see much benefit in terms of code readability and maintenance to refactor between this and projects_controller. Also, removed the copied over render_index implementation and added one more param handling to application_controller -> render_index to allow using in this context

When merging results from multiple queries with limit=N, I think we need to truncate the merged set to N entries before displaying or generating the next-page filter. Otherwise, we'll get something like this:

This is correct. I had it right and broke it and now fixed!

@objects = (jobs.to_a + pipelines.to_a + crs.to_a).sort_by(&:created_at).reverse

Done

Instead of [%w(uuid is_a) + [%w(arvados#pipelineInstance)]] how about just "uuid", "is_a", ["arvados] ...? Same result, but easier to read...

Done

Actions #7

Updated by Tom Clegg almost 8 years ago

  • Category set to Workbench

Much better, thanks. Performance vastly improved!

The "truncate to N items of any type" fix seems to have revealed a different bug, though: if the first set of infinite-scroll results doesn't fill the browser window (i.e., it's already time to load the next results, even though there's no scroll/resize event) the second page doesn't start loading until I resize my window. It's possible this bug exists with all of our infinite scroll pages, but we don't usually see it because we get enough results in the first page to fill a window.

Do we need to call ping_all_scrollers() at the end of $.ajax(...).done() in maybe_load_more_content, in case one page of results wasn't enough to fill the window?

I did not see much benefit in terms of code readability and maintenance to refactor between this and projects_controller.

Hmmmm. There were two copies of that timestamp precision bug, but you only fixed one.

Instead of fixing precision for one or two specific cases, can we do what we did in API server and fix it across the board? See source:services/api/config/initializers/time_format.rb

Actions #8

Updated by Radhika Chippada almost 8 years ago

  • The "truncate to N items of any type" fix seems to have revealed a different bug, though: if the first set of infinite-scroll results doesn't fill the browser window ... Do we need to call ping_all_scrollers() at the end of $.ajax(...).done() in maybe_load_more_content, in case one page of results wasn't enough to fill the window?

Thanks for this tip. This resolved this issue and now two pages are loaded when my browser window is big and first page contents are too few

I did not see much benefit in terms of code readability and maintenance to refactor between this and projects_controller ... Hmmmm. There were two copies of that timestamp precision bug, but you only fixed one.

I refactored the next_page_filters logic into application_controller and updated both the project_controller and work_unit_controller to use the refactored method.

Instead of fixing precision for one or two specific cases, can we do what we did in API server and fix it across the board? See source:services/api/config/initializers/time_format.rb

Please elaborate on this. I refactored the next_page_filters logic, but this might help further, especially in other areas. But I am not sure what / where this applies. Thanks.

Actions #9

Updated by Tom Clegg almost 8 years ago

Yes, infinite scroll bug is fixed now, thanks!

Thanks for unifying the "next page for multiple types" code.

I don't think it makes sense to call next_page_filters with custom rejections, though. The idea is to remove the filters that were added by next_page_filters itself, and those are always [['created_at', nextpage_operator, ...], ['uuid', 'not in', ...]] -- the caller shouldn't have to know/provide that information. Just passing the nextpage_operator should be enough.

What's the purpose of #{params[:partial_path] || ''}? The projects code didn't need this, and the "all processes" page seems to work if I take it out. {partial: "show_foo"} defaults to using the current controller, and isn't that always what we want?

Re time_format: in API server, we monkey-patch as_json, so the default representation of a timestamp in JSON responses uses the high-precision format. Seems to me we should do exactly the same thing in Workbench. Then, instead of last_created_at.strftime("%Y-%m-%dT%H:%M:%S.%N%z") we can just say last_created_at, and the right format will be applied automatically.

Actions #10

Updated by Radhika Chippada almost 8 years ago

I don't think it makes sense to call next_page_filters with custom rejections, though. The idea is to remove the filters that were added by next_page_filters itself, and those are always 'created_at', nextpage_operator, ...], ['uuid', 'not in', ... -- the caller shouldn't have to know/provide that information. Just passing the nextpage_operator should be enough.

Updated accordingly

What's the purpose of #{params[:partial_path] || ''}? The projects code didn't need this, and the "all processes" page seems to work if I take it out. {partial: "show_foo"} defaults to using the current controller, and isn't that always what we want?

You are right. I needed to specify the path prefix "work_unit/" earlier when I was using the all_processes_controller name. Now that we have standardized names for the controller and views folder, I no longer need this. Removed it.

Re time_format: in API server, we monkey-patch as_json, so the default representation of a timestamp in JSON responses uses the high-precision format. Seems to me we should do exactly the same thing in Workbench. Then, instead of last_created_at.strftime("%Y-%m-%dT%H:%M:%S.%N%z") we can just say last_created_at, and the right format will be applied automatically.

Yes, I copied the time_format.rb initializer to work_bench and the work_units integration test passing as expected. Thanks.

Actions #11

Updated by Tom Clegg almost 8 years ago

LGTM @ f974126, thanks!

Actions #12

Updated by Radhika Chippada almost 8 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:e6cd7c31ee28851d3ead992437fc93f2fc73ef92.

Actions

Also available in: Atom PDF