Feature #3605

[Workbench] improved dashboard page

Added by Ward Vandewege almost 3 years ago. Updated 3 months ago.

Status:ClosedStart date:09/15/2014
Priority:NormalDue date:
Assignee:-% Done:

60%

Category:-
Target version:-
Story points2.0Remaining (hours)0.00 hour
Velocity based estimate-

Description

High level overview (counts) of:
  • pipelines currently running
  • jobs currently running
  • pipelines recently finished
  • jobs recently finished

All of this subject to current permissions. No changes to the permission model required.

Give the user a way to display only their stuff.

All of the items that have been counted in the high-level summary appear in a table below the high-level overview, with more detail (again, subject to user permission). It should say what project things are in, and if permissions allow, whose they are.

Add a warning triangle or red exclamation mark if anything is wrong, for example no log output in the last 3 minutes for jobs/pipelines that are supposed to be running. Add a column in the table with status: if normal, put "queued since ...". If something is wrong, put the best error message we can come up with (e.g. "no recent log output, dead?").

Dashboard-1.tiff - Compute nodes panel (379 KB) Radhika Chippada, 09/19/2014 08:01 pm

Dashboard-2.tiff - Recently finished pipelines panel (450 KB) Radhika Chippada, 09/19/2014 08:01 pm


Subtasks

Task #3863: Review 3605-node-info-fieldResolvedPeter Amstutz

Task #3723: Implement dashboard layout at https://arvados.org/attachm...ResolvedPeter Amstutz

Task #3887: Fix testsClosedPeter Amstutz

Task #3886: Review 3605-queue-position-sizeResolvedPeter Amstutz

Task #3884: Review 3605-improved-dashboardClosedPeter Amstutz


Related issues

Related to Arvados - Feature #4456: [Workbench] Provide more feedback about when a queued job... New 11/06/2014

Associated revisions

Revision 1f1963df
Added by Peter Amstutz over 2 years ago

Merge branch '3605-node-info-field' refs #3605

Revision 274ca7a6
Added by Peter Amstutz over 2 years ago

Merge branch '3605-node-info-field' refs #3605

Conflicts:
services/api/db/structure.sql

Revision 403d9f6b
Added by Peter Amstutz over 2 years ago

Merge branch '3605-queue-position-size' refs #3605

Revision b96a5d94
Added by Peter Amstutz over 2 years ago

Merge branch '3605-improved-dashboard' refs #3605

Revision 3e3df723
Added by Tom Clegg over 2 years ago

Report crunch_worker_state=down for nodes not occupying a worker slot. refs #3605

Revision eec77d22
Added by Peter Amstutz over 2 years ago

Merge branch '3605-improved-dashboard' refs #3605

History

#1 Updated by Ward Vandewege almost 3 years ago

  • Subject changed from [Workbench] dashboard page to [Workbench] improved dashboard page
  • Story points set to 2.0

#2 Updated by Ward Vandewege almost 3 years ago

  • Description updated (diff)

#3 Updated by Ward Vandewege over 2 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint

#4 Updated by Peter Amstutz over 2 years ago

  • Assignee set to Peter Amstutz

#5 Updated by Peter Amstutz over 2 years ago

give some indication of how busy the cluster is, such as number of nodes busy/idle

#6 Updated by Radhika Chippada over 2 years ago

Review feedback for branch 3605-node-info-field:

Just a couple very minor comments:

  • apps/workbench/app/views/application/_arvados_attr_value.html.erb
    Just an observation. I tried to fix this in the past and Tom suggested that the user's of this should not send null attr parameter :).
  • Is it possible to test the newly added api accessible info attribute? May be an integration test?
  • Several api server tests are failing for me; but they are also failing in master branch. So, may not be anything to do with your code updates.

So, if api server tests are not failing for you, LGTM. Thanks.

#7 Updated by Peter Amstutz over 2 years ago

  • Status changed from New to In Progress

#8 Updated by Peter Amstutz over 2 years ago

  • Target version changed from 2014-09-17 sprint to 2014-10-08 sprint

#9 Updated by Brett Smith over 2 years ago

  • Target version changed from 2014-10-08 sprint to 2014-09-17 sprint

Reviewing fc2096f2.

So, we just had an issue where adding a new test fixture along with a new test broke an older test, because it was expecting a set of items to match an exact list. In general, I think it's worth trying to write tests to avoid these kinds of situations, and save our future selves some headache. For the job queue test, could we check for the presence of absence of a particular job UUID based on the user making the request? I think that would catch the same functionality while leaving more future flexibility.

The job queue size test is admittedly a tougher nut to crack. I'm not aware of a Railsy way to introspect all the test fixtures, and you'd have to reimplement some of the filtering yourself to figure out the "right" number, which is a slightly risky proposition. If you can think of a way to keep it flexible, that'd be awesome, but I understand if it's just not doable with our current infrastructure.

A couple of method FYIs, not necessary for merging, just in case they're helpful in the future: Ruby Enumerables have each_with_index to yield both the item and its index simultaneously. Our API server tests have a json_response method to save you incantation of decoding @response.body yourself.

Thanks.

#10 Updated by Brett Smith over 2 years ago

  • Target version changed from 2014-09-17 sprint to 2014-10-08 sprint

#11 Updated by Tom Clegg over 2 years ago

(As discussed in IRC) Please add a new "properties" hash on the node model, leaving info as an admin-only hash, rather than combining the fields in one hash and pruning it at runtime.

#12 Updated by Tom Clegg over 2 years ago

Looking at 4cf16efd on 3605-node-info-field

The .to_i in the tests reveal that a .to_i should have happened during node#ping or even way back at arvados/v1/nodes_controller.rb line 29 when they were accepted from the client using a type-unaware encoding. (But I know this isn't a new bug, so I'll understand if you punt.)

Everything else looks great. Thanks.

#13 Updated by Peter Amstutz over 2 years ago

3605-node-info-field changed #ping to do to_i on total_cpu_cores, total_ram_mb, and total_scratch_mb.

#14 Updated by Radhika Chippada over 2 years ago

Peter,
As we discussed, I will provide my feedback in two steps.

First, testing UI feedback:

  • The new dashboard has much better information than before. Thanks.
  • None of my feedback items below are critical, so you can either ignore them or address them later, whatever works for you.
  • Compute nodes panel has a drop-down arrow, but is not really showing any drop-down menu.
  • Clicking on the Panel title (Compute node status) or any of the items (Queue, Nodes, Busy, Idle) displays a new row (please see the attachment 1). Is it possible to have a separate icon in this panel or panel title to show this new row rather than showing it when any one of them is clicked. It is not obvious otherwise.
  • The information in the Compute node status panel needs to be reversed. I think it is desirable that we have a title Row (Queue Nodes Busy Idle) on top, and the number (info) row below it.
  • In the "Recently finished pipelines" panel, it might be helpful if the output file names are listed to the right (pull-right), so that they are displayed right below the Output button.
  • Alternatively, you might want to display the "Active for ..." in the right (under the "created_at" timestamp) and move the Outputs button to the left under the pipeline name.

Thanks.

#15 Updated by Radhika Chippada over 2 years ago

My second review feedback:

  • apps/workbench/app/assets/javascripts/dates.js

The timestamp display in the dashboard (2:41 PM 09/19/2014) is much different than in other areas in workbench such as Pipeline instance Advanced tab ( 2014-09-19 18:43:43 UTC). The timestamps in all workbench pages should use the same display strategy to be consistent.

I also think (2:41 PM 09/19/2014) is better to be displayed as (09/19/2014 2:41 PM)

  • Compute nodes panel

Unless there is a serious performance concern about displaying all the details during dashboard loading, I think we should display the details that appear when individual items in that panel are clicked (compute0, compute1 etc) right when Queue, Nodes. We also should consider replacing the clicks on Queue, Nodes etc with a single clickable item (such as the drop down in the panel title)

  • workbench/app/controllers/application_controller.rb

I think the helper_method :running_pipelines should use desc order.

PipelineInstance.order(["started_at asc", "created_at asc"

Currently, the individual day’s items are displayed in reverse chronological order ( 4xphq-d1hrv-vng31w4sz1j7f6c Started at 2:31 PM 07/08/2014, 4xphq-d1hrv-tq6scqzdm61fnub Started at 2:38 PM 07/08/2014 etc)

  • apps/workbench/app/models/job.rb

After merging in work from 3899, this should go away? (“def self.state job”)

  • UserProfileTest < ActionDispatch::IntegrationTest
    “assert page.has_text?('Active pipelines'), 'No text - My projects' “ change to say “No text - Active pipelines”

#16 Updated by Peter Amstutz over 2 years ago

Radhika Chippada wrote:

Peter,
As we discussed, I will provide my feedback in two steps.

  • Compute nodes panel has a drop-down arrow, but is not really showing any drop-down menu.

It does toggle collapsing of the nodes and provides an affordance for clicking. I don't think a separate icon would be any clearer.

  • Clicking on the Panel title (Compute node status) or any of the items (Queue, Nodes, Busy, Idle) displays a new row (please see the attachment 1). Is it possible to have a separate icon in this panel or panel title to show this new row rather than showing it when any one of them is clicked. It is not obvious otherwise.

The table is no longer clickable to avoid confusion thinking that different that the table cells are buttons.

  • The information in the Compute node status panel needs to be reversed. I think it is desirable that we have a title Row (Queue Nodes Busy Idle) on top, and the number (info) row below it.

Since this is one of the few parts that's based on an actual design suggestion from our graphic designer, I'd prefer to keep in this way.

  • In the "Recently finished pipelines" panel, it might be helpful if the output file names are listed to the right (pull-right), so that they are displayed right below the Output button.

Fixed.

#17 Updated by Tom Clegg over 2 years ago

Minor thing - in _show_advanced.html.erb, the spelling of "curl" is deliberately not "Curl", please don't fix. http://curl.haxx.se/

In ApplicationController.running_pipelines, it looks like pipelines that are running in the user's terminal will not show up here; they'll be hidden until they fail/succeed, at which point they will show up in the "recently finished" section. It would be better to show both kinds, and show a distinction between the two Running states. (The sentence "The new pipeline instance will also show up on the Workbench Dashboard." at http://doc.arvados.org/user/topics/running-pipeline-command-line.html should continue to be correct.)

default_show.html.erb is an unhelpful filename in an area where it's already hard to follow which file does what. Could we use a filename more suggestive of what it does? (It looks like it renders the page title, name, description, and copy/move buttons.)
  • This is probably a good opportunity to fix the "content_for :tab_line_buttons inside content_for :content_top" weirdness. Presumably those should be two separate content_for sections?
  • This is probably a good opportunity to remove the :page_content functionality that (AFAIK) has never been used or tested. It looks like it has already drifted far, far away from any likelihood of functioning correctly.

#18 Updated by Peter Amstutz over 2 years ago

Radhika Chippada wrote:

My second review feedback:

  • apps/workbench/app/assets/javascripts/dates.js

The timestamp display in the dashboard (2:41 PM 09/19/2014) is much different than in other areas in workbench such as Pipeline instance Advanced tab ( 2014-09-19 18:43:43 UTC). The timestamps in all workbench pages should use the same display strategy to be consistent.

#3629

I also think (2:41 PM 09/19/2014) is better to be displayed as (09/19/2014 2:41 PM)

  • Compute nodes panel

Unless there is a serious performance concern about displaying all the details during dashboard loading, I think we should display the details that appear when individual items in that panel are clicked (compute0, compute1 etc) right when Queue, Nodes. We also should consider replacing the clicks on Queue, Nodes etc with a single clickable item (such as the drop down in the panel title)

I'm not entirely sure I understand the comment. The compute nodes list is hidden by default because if the number of compute nodes is very large (having 20 or 40 nodes in the future seems perfectly reasonable) it takes up a lot of space.

  • workbench/app/controllers/application_controller.rb

I think the helper_method :running_pipelines should use desc order.

PipelineInstance.order(["started_at asc", "created_at asc"

Currently, the individual day’s items are displayed in reverse chronological order ( 4xphq-d1hrv-vng31w4sz1j7f6c Started at 2:31 PM 07/08/2014, 4xphq-d1hrv-tq6scqzdm61fnub Started at 2:38 PM 07/08/2014 etc)

That's intentional, the idea is to show the running pipelines at the top.

  • apps/workbench/app/models/job.rb

After merging in work from 3899, this should go away? (“def self.state job”)

Probably, that code is also able to make sense of the jobs that are copied into the pipeline components, but I have to avoid using the job record copy because it's sometimes stale.

  • UserProfileTest < ActionDispatch::IntegrationTest
    “assert page.has_text?('Active pipelines'), 'No text - My projects' “ change to say “No text - Active pipelines”

Fixed.

#19 Updated by Tom Clegg over 2 years ago

There seem to be several items in the story description that aren't addressed yet (or maybe I'm missing them):
  • Display only my stuff
  • Show project & owner for each pipeline
  • Warning signal for error conditions (e.g., no log output)
  • Error message
The "Compute node status" area is a little confusing in that the first figure is a number of jobs (not compute nodes), and nodes=idle+busy. I think it would be clearer to
  • drop the "nodes" total
  • relabel "idle" to "idle nodes"
  • relabel "busy" to "busy nodes"
  • relabel "queue" to "queued jobs"
  • relabel "Compute node status" panel title to "Cluster status" (or "Crunch job and worker status", or...?)

The expand/collapse feature on "Compute node status" panel is weird in that it doesn't expand/collapse the panel, but affects just a portion of the panel, far away from the caret. It's also a bit disappointing to click "expand" and see such a tiny amount of information that it might as well have been shown up front. Perhaps the easiest solution here is to show all nodes all the time, but put them in a div with a max-height that gets scrollbars when the number of nodes is big.

#20 Updated by Peter Amstutz over 2 years ago

Tom Clegg wrote:

Minor thing - in _show_advanced.html.erb, the spelling of "curl" is deliberately not "Curl", please don't fix. http://curl.haxx.se/

Reverted.

In ApplicationController.running_pipelines, it looks like pipelines that are running in the user's terminal will not show up here; they'll be hidden until they fail/succeed, at which point they will show up in the "recently finished" section. It would be better to show both kinds, and show a distinction between the two Running states. (The sentence "The new pipeline instance will also show up on the Workbench Dashboard." at http://doc.arvados.org/user/topics/running-pipeline-command-line.html should continue to be correct.)

Put it back, didn't add new UI to distinguish them. There are stale client-run pipelines on 4xphq that need to be stopped.

default_show.html.erb is an unhelpful filename in an area where it's already hard to follow which file does what. Could we use a filename more suggestive of what it does? (It looks like it renders the page title, name, description, and copy/move buttons.)

Changed to "title_and_buttons".

  • This is probably a good opportunity to fix the "content_for :tab_line_buttons inside content_for :content_top" weirdness. Presumably those should be two separate content_for sections?

Fixed.

  • This is probably a good opportunity to remove the :page_content functionality that (AFAIK) has never been used or tested. It looks like it has already drifted far, far away from any likelihood of functioning correctly.

Took that out.

#21 Updated by Peter Amstutz over 2 years ago

  • Target version changed from 2014-10-08 sprint to Arvados Future Sprints

#22 Updated by Peter Amstutz 12 months ago

  • Assignee deleted (Peter Amstutz)

#23 Updated by Peter Amstutz 3 months ago

  • Status changed from In Progress to Closed

Will dashboard design in future tickets.

#24 Updated by Tom Clegg 3 months ago

  • Target version deleted (Arvados Future Sprints)

Also available in: Atom PDF