Story #8647

[Workbench] "Work unit" user interface

Added by Tom Clegg over 1 year ago. Updated 2 months ago.

Status: Resolved
Start date: 03/07/2016
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Workbench
Target version: -
Story points: -
Velocity based estimate: -
ReleaseRFCs

Description

A work unit is a "view model". It corresponds to an Arvados object like a Job or a Pipeline Instance. Its purpose is to provide the information needed by views. As much as possible, we want our views to not know/care exactly which type of object the work unit represents.

For example, a dashboard view might show a heterogeneous list of pipeline instances, jobs, and (crunch2) containers. The view code is simple: it says "show progress bar at X%", it doesn't say "if showing a pipeline then add all components' progress and divide by # components". The latter is the kind of code that belongs in the work unit.

class Job
  def work_unit label
    JobWorkUnit.new(self, label)
  end
end

class WorkUnit
  # This is just an abstract class that documents the WorkUnit interface; a
  # class can implement the interface without being a subclass of WorkUnit.
  def label
    # returns the label that was assigned when creating the work unit
  end
  def uuid
    # returns the arvados UUID of the underlying object
  end
  def components
    # returns an array of component (child) work units
  end
end

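class ProxyWorkUnit < WorkUnit
  # Readers replace the self-referencing accessors, which would recurse forever.
  attr_reader :label, :proxied

  def initialize proxied, label
    @label = label
    @proxied = proxied
  end

  def uuid
    # delegates to the underlying Arvados object
    proxied.uuid
  end
end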

class JobWorkUnit < ProxyWorkUnit
  def components
    # Label each task by its zero-based position ("task 0", "task 1", ...).
    self.proxied.job_tasks.each_with_index.map do |t, n|
      t.work_unit("task #{n}")
    end
  end
end
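
The progress-bar example from the description could be sketched roughly as follows. This is only an illustration: the `progress` method, the `LeafWorkUnit` and `CompositeWorkUnit` classes, and the `progress_bar_text` helper are hypothetical names, not part of the interface above. It assumes progress is a fraction between 0.0 and 1.0.

```ruby
# Hypothetical sketch: `progress` is an assumed addition to the work unit
# interface, returning a fraction between 0.0 and 1.0.
class LeafWorkUnit
  attr_reader :label, :progress

  def initialize(label, progress)
    @label = label
    @progress = progress
  end

  def components
    []
  end
end

class CompositeWorkUnit
  attr_reader :label, :components

  def initialize(label, components)
    @label = label
    @components = components
  end

  # Add all components' progress and divide by the number of components.
  # Returns 0.0 when there are no components yet.
  def progress
    return 0.0 if components.empty?
    components.sum { |c| c.progress.to_f } / components.size
  end
end

# View code: it never asks which kind of work unit it was given.
def progress_bar_text(work_unit)
  "#{work_unit.label}: #{(work_unit.progress * 100).round}%"
end
```

The aggregation lives in the work unit, so a dashboard can call `progress_bar_text` on a mixed list of jobs and pipeline instances without branching on type.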

Related issues

Related to Arvados - Story #8651: [Workbench] Add "work unit" Ruby interface definition Closed 03/07/2016

History

#1 Updated by Brett Smith over 1 year ago

  • Subject changed from [Workbench] "Work unit" user interface and model abstraction to [Workbench] "Work unit" user interface
  • Category set to Workbench

Splitting this up into multiple stories.

#2 Updated by Brett Smith over 1 year ago

Three different views:

  1. Dashboard
  2. Project listing
  3. Detailed pages

First two could potentially be merged with a little reconsideration.

Things users want to know about work units:

  • Progress
    • Current state
    • How long it's taken
  • Compute utilization
    • Graph view of utilization over time
    • Aggregate utilization statistics: average, max utilization
    • What compute node(s) it's running on
  • Logs
  • Ownership
    • Who requested the work
    • What project does it live in
    • What is its parent work unit
  • Component names
  • Similar work units
    • Previous runs
  • Node type, including price
    • This has to be associated with the work unit, because node prices change over time
    • Estimated price on other clouds
  • Requested runtime constraints
    • Note that work units will use fractional nodes under Crunch v2
  • What it actually ran: command, versions of all the things, parameters
  • Raw API record

Listings should show topmost parent work units as much as possible. Any other listing should be part of an explicit search, or an admin-y type interface.

#3 Updated by Brett Smith over 1 year ago

Would be good to separate crunchstat logs vs. job stderr.

Nuisances with the current "finished job log" view:
  • NOBODY ACTUALLY USES THIS. Everybody just downloads the full log.
  • When the text you're interested in spans pages.
  • Doesn't copy+paste correctly.
  • Usually the last megabyte or so is more interesting than the first megabyte (although some of the pre-task 0 information is useful to diagnose). Really what Brett wants is to be able to collapse all the successful tasks (including uninteresting output from, e.g., loading the Docker image).
Filtering seems potentially useful. Why isn't it?
  • crunchstat vs. stderr separation is job #1
  • The thing you want to filter is often beyond the first megabyte

Would be cool to have a view to grep the logs, and then provide a sharable link with those results.

Two common use cases for logs:
  • Show me why my job failed.
  • It's not obvious, so give me the full log.
    Jenkins seems to do this fairly well.

We can download logs from the back and only render stderr until we hit a designated rendering threshold. If we render it as plain text, we can render a lot more. We can stop downloading once we've shown enough of the stderr tail.

Could give keep-web a grepping capability.

The first log viewer we have will be Jenkins-like. Functional requirements:
  • Render logs from the bottom.
  • Filter out crunchstat output.
  • Stop rendering after a designated threshold.
  • Offer a link to download the full log past that threshold, just like jobs have now.
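
A minimal sketch of those four requirements, under some assumptions: the log is already available as an array of lines, crunchstat lines can be recognized by the substring "crunchstat:", and the threshold is a byte count. The `render_log_tail` name and its return shape are hypothetical.

```ruby
# Hypothetical helper: walks the log from the bottom, skips crunchstat
# lines, and stops once adding another line would exceed max_bytes.
# Returns the kept lines in their original order, plus a flag telling the
# view whether to offer the "download the full log" link.
def render_log_tail(lines, max_bytes)
  shown = []
  used = 0
  truncated = false

  lines.reverse_each do |line|
    # Assumption: crunchstat output is marked by this substring.
    next if line.include?("crunchstat:")

    if used + line.bytesize > max_bytes
      truncated = true
      break
    end

    shown.unshift(line)
    used += line.bytesize
  end

  { lines: shown, truncated: truncated }
end
```

Because the loop stops at the threshold, a real implementation could also stop downloading at that point instead of fetching the whole log first.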

#4 Updated by Brett Smith over 1 year ago

We have user interface improvements that we know we want to make, e.g., showing GATK Queue child jobs. But having one branch change both the user interface and the programming implementation makes for a hairier review. Seems nice to avoid that.

Nice impact would be for job page and pipeline instance page to use the same code.

Let's just show one level of child work units at a time for now. Showing multiple levels is a bigger UI change.

Stories this enables: cwl-runner can stop making pipeline instances.

#5 Updated by Tom Clegg over 1 year ago

  • Description updated (diff)

#6 Updated by Peter Amstutz 2 months ago

  • Status changed from New to Resolved
