Story #8647


[Workbench] "Work unit" user interface

Added by Tom Clegg over 6 years ago. Updated over 3 years ago.


A work unit is a "view model". It corresponds to an Arvados object like a Job or a Pipeline Instance. Its purpose is to provide the information needed by views. As much as possible, we want our views to not know/care exactly which type of object the work unit represents.

For example, a dashboard view might show a heterogeneous list of pipeline instances, jobs, and (crunch2) containers. The view code is simple: it says "show progress bar at X%"; it doesn't say "if showing a pipeline then add all components' progress and divide by # components". The latter is the kind of code that belongs in the work unit.

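class Job
  # Wraps this job in a work unit carrying the given label.
  def work_unit(label)
    JobWorkUnit.new(self, label)
  end
end

class WorkUnit
  # This is just an abstract class that documents the WorkUnit interface; a
  # class can implement the interface without being a subclass of WorkUnit.

  # Returns the label that was assigned when creating the work unit.
  def label
  end

  # Returns the Arvados UUID of the underlying object.
  def uuid
  end

  # Returns an array of component (child) work units.
  def components
  end
end

class ProxyWorkUnit < WorkUnit
  attr_accessor :label, :proxied

  def initialize(proxied, label)
    self.label = label
    self.proxied = proxied
  end

  # Assumed: the UUID is delegated to the proxied object.
  def uuid
    proxied.uuid
  end
end

class JobWorkUnit < ProxyWorkUnit
  # Assumed: the proxied job exposes its tasks as `tasks`, and each task
  # also responds to `work_unit`.
  def components
    proxied.tasks.each_with_index.map do |t, n|
      t.work_unit("task #{n}")
    end
  end
end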
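
The dashboard example from the description could look roughly like the sketch below. This is an illustration only, not Workbench code: the `progress` method, the `SimpleWorkUnit` and `CompositeWorkUnit` classes, and the `progress_bar` helper are hypothetical names that are not part of the interface documented above.

```ruby
# Hypothetical sketch: `progress` is assumed to return a fraction
# between 0.0 and 1.0. It is not part of the documented interface.
class SimpleWorkUnit
  attr_reader :label, :progress

  def initialize(label, progress)
    @label = label
    @progress = progress
  end

  def components
    []
  end
end

# A work unit with children (e.g. a pipeline instance) keeps the
# "add all components' progress and divide" logic out of the view.
class CompositeWorkUnit
  attr_reader :label, :components

  def initialize(label, components)
    @label = label
    @components = components
  end

  def progress
    return 0.0 if components.empty?
    components.sum(&:progress) / components.size
  end
end

# View code: the same for every kind of work unit.
def progress_bar(work_unit, width = 20)
  filled = (work_unit.progress * width).round
  percent = (work_unit.progress * 100).round
  format("%-12s [%s%s] %3d%%", work_unit.label,
         "#" * filled, "." * (width - filled), percent)
end
```

The view only ever calls `label` and `progress`, so adding a new kind of work unit would not require touching `progress_bar`.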

Related issues

Related to Arvados - Story #8651: [Workbench] Add "work unit" Ruby interface definition (Closed, 03/07/2016)

Actions #1

Updated by Brett Smith over 6 years ago

  • Subject changed from [Workbench] "Work unit" user interface and model abstraction to [Workbench] "Work unit" user interface
  • Category set to Workbench

Splitting this up into multiple stories.

Actions #2

Updated by Brett Smith over 6 years ago

Three different views:

  1. Dashboard
  2. Project listing
  3. Detailed pages

First two could potentially be merged with a little reconsideration.

Things users want to know about work units:

  • Progress
    • Current state
    • How long it's taken
  • Compute utilization
    • Graph view of utilization over time
    • Aggregate utilization statistics: average, max utilization
    • What compute node(s) it's running on
  • Logs
  • Ownership
    • Who requested the work
    • What project does it live in
    • What is its parent work unit
  • Component names
  • Similar work units
    • Previous runs
  • Node type, including price
    • This has to be associated with the work unit, because node prices change over time
    • Estimated price on other clouds
  • Requested runtime constraints
    • Note that work units will use fractional nodes under Crunch v2
  • What it actually ran: command, versions of all the things, parameters
  • Raw API record

Listings should show topmost parent work units as much as possible. Any other listing should be part of an explicit search, or an admin-y type interface.

Actions #3

Updated by Brett Smith over 6 years ago

Would be good to separate crunchstat logs vs. job stderr.

Nuisances with the current "finished job log" view:

  • NOBODY ACTUALLY USES THIS. Everybody just downloads the full log.
  • Awkward when the text you're interested in spans pages.
  • Doesn't copy+paste correctly.
  • Usually the last megabyte or so is more interesting than the first megabyte (although some of the pre-task 0 information is useful for diagnosis). Really what Brett wants is to be able to collapse all the successful tasks (including uninteresting output from, e.g., loading the Docker image).

Filtering seems potentially useful. Why isn't it?

  • crunchstat vs. stderr separation is priority #1
  • The thing you want to filter is often beyond the first megabyte

Would be cool to have a view to grep the logs, and then provide a sharable link with those results.

Two common use cases for logs:
  • Show me why my job failed.
  • It's not obvious, so give me the full log.
    Jenkins seems to do this fairly well.

We can download logs starting from the end and only render stderr until we hit a designated rendering threshold. If we render it as plain text, we can render a lot more. We can stop downloading once we've shown enough of the stderr tail.

Could give keep-web a grepping capability.

The first log viewer we have will be Jenkins-like. Functional requirements:
  • Render logs from the bottom.
  • Filter out crunchstat output.
  • Stop rendering after a designated threshold.
  • Offer a link to download the full log past that threshold, just like jobs have now.
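
The requirements above might be sketched roughly as follows. The sketch assumes the log is already available as an array of lines and that crunchstat lines can be recognized by the substring `crunchstat`. Both the `render_log_tail` name and that matching rule are illustrative assumptions, not the actual log format or Workbench code.

```ruby
# Hypothetical helper: walk the log from the bottom, skip crunchstat
# lines, and stop once adding another line would exceed
# `threshold_bytes`. Returns [lines_to_render, truncated].
def render_log_tail(lines, threshold_bytes)
  shown = []
  size = 0
  truncated = false
  lines.reverse_each do |line|
    # Assumption: crunchstat output is identifiable by this substring.
    next if line.include?("crunchstat")
    if size + line.bytesize > threshold_bytes
      truncated = true
      break
    end
    shown.unshift(line) # keep the original top-to-bottom order
    size += line.bytesize
  end
  [shown, truncated]
end
```

When `truncated` comes back true, the view would show the link to download the full log.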

Actions #4

Updated by Brett Smith about 6 years ago

We have user interface improvements that we know we want to make, e.g., showing GATK Queue child jobs. But having one branch change both the user interface and the programming implementation makes for a hairier review. Seems nice to avoid that.

A nice side effect would be for the job page and the pipeline instance page to use the same code.

Let's just show one level of child work units at a time for now. Showing multiple levels is a bigger UI change.

Stories this enables: cwl-runner can stop making pipeline instances.

Actions #5

Updated by Tom Clegg about 6 years ago

  • Description updated

Actions #6

Updated by Peter Amstutz about 5 years ago

  • Status changed from New to Resolved

Actions #7

Updated by Tom Morris over 3 years ago

  • Release deleted (12)
