Story New In Progress Resolved Feedback Closed
Sprint Impediments
9433
[OPS] use official repos for docker. Stop packaging docker.io
Nico César
288
3
impediments
-c-a
1
10766
[Docs] [arvados-ws] make the arvados-ws documentation official, remove all mentions of the old puma websockets setup
Tom Clegg
3
3
impediments
-c-a
2
11373
[OPS] Make qr1hi crunchV2
Nico César
288
3
impediments
-c-a
1
10981
[OPS] migrate c97qk to ubuntu1604
Javier Bértoli
398
3
impediments
-c-a
2
11017
[API Server] Implement Docker version compatibility fallback support
Tom Clegg
3
3
impediments
-c-a
6
10757
[OPS] [arvados-ws] deploy arvados-ws on our test clusters: 4xphq, 9tee4, tb05z, c97qk
Ward Vandewege
1
3
impediments
-c-a
2
9632
[OPS] Upgrade docker to 1.9.1 in all clusters
Nico César
288
3
impediments
-c-a
1
8465
[Crunch2] Support stdin/stderr redirection
Radhika Chippada
72
3
impediments
-c-a
2
11262
deploy bdaa9de9882fee122fd2274d92ea500113df8195 or later in c97qk
Nico César
288
3
impediments
-c-a
1
Subject: [SDK] Support writes to offsets beyond end of file
Tracker ID: Bug
Status: In Progress
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Peter Amstutz
Project: Arvados
Release:

Two distinct but closely related issues encountered by a customer:

  1. Using s3.download_fileobj() with an ArvadosFileWriter target: files over some threshold (likely 8MB) are arbitrarily truncated.
  2. Using "aws s3 cp" to download to a writable keep mount: a 15 GB file results in only an 8 MB file appearing in keep(!)

I believe the common theme is that both rely on the Python SDK and use multipart download (requesting and committing chunks 8 MB at a time), which suggests an issue with the way the SDK writes chunks non-sequentially (possibly related to seek() or ftruncate()).
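
For reference, a minimal sketch of the behavior a multipart downloader expects from its target, shown with an ordinary local file rather than ArvadosFileWriter (whose internals are not covered here). On a regular file, seeking past the end and writing leaves a gap that reads back as zero bytes, and chunks written out of order still yield a file of the full length. The helper name and the tiny 8-byte chunks are illustrative assumptions.

```python
import os
import tempfile


def roundtrip(chunks):
    """Write (offset, data) pairs in the given order, then return the file content.

    Uses a plain temporary file; this only illustrates the expected
    semantics, it does not exercise the Arvados SDK.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            for offset, data in chunks:
                f.seek(offset)  # may land beyond the current end of file
                f.write(data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

A writer that only supports appending at the current end of file would fail the out-of-order case, which matches the truncation symptoms described above.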

11510 Peter Amstutz (0 hours)
[SDK] Support writes to offsets beyond end of file
11528
Review 11510-sdk-extend-files
Peter Amstutz
47
11510
3
36
-c-a
5
Subject: [API] Upgrade API server to Rails 4.2
Tracker ID: Feature
Status: In Progress
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

Among other things, this will provide:
- 4.0 - encrypted session cookies, cf. http://api.rubyonrails.org/classes/ActionDispatch/Session/CookieStore.html
- 4.2 - jsonb column type (requires PostgreSQL 9.4)

7709 Tom Clegg (0 hours)
[API] Upgrade API server to Rails 4.2
11319
Review 7709-sdk-cli-active_support
Lucas Di Pentima
375
7709
3
36
-c-a
5
11298
make tests pass in rails4
Tom Clegg
3
7709
3
36
-c-a
5
11297
update bundle
Tom Clegg
3
7709
3
36
-c-a
5
11316
make other components' integration tests pass
Tom Clegg
3
7709
3
36
-c-a
5
11337
Check ruby warnings and Rails upgrade notes
Tom Clegg
3
7709
3
36
-c-a
5
11264
Review 7709-api-rails4
Lucas Di Pentima
375
7709
3
36
-c-a
5
Subject: Support Python3 for arvados-python-client & command line utilities
Tracker ID: Story
Status: In Progress
Category: SDKs
Points: 0.5
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

Python 3 has been available for many years and the Python 2/3 migration is reaching its final stages. As a first step, the Arvados Python SDK and command line utilities need to run on both Python 3 and Python 2.

run-tests.sh needs to run tests using both Python 2.7 and Python 3.4+

Dependencies must support Python 3. May require updating dependencies and related side effects (API/behavior changes).

Use "__future__" to adopt Python 3 behavior on Python 2.7:

from __future__ import division, absolute_import, print_function, unicode_literals

Clean up usage of strings to conform to Python 3 unicode/bytes distinction.
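
A minimal sketch of what the cleanup can look like at an I/O boundary: two small helpers that convert explicitly between text and bytes and behave the same on Python 2.7 and Python 3. The helper names and the UTF-8 default are assumptions for illustration, not part of the SDK.

```python
# text_type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u"")


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value  # assumed to be bytes already


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # assumed to be text already
```

Calling these only where data enters or leaves the program (files, sockets, subprocess pipes) keeps the rest of the code working with text consistently.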

Existing Arvados Python packages including FUSE, Node manager, arvados-cwl-runner (+ its dependencies, e.g. cwltool), and crunchstat-summary must continue to work on Python 2.7 (they will be ported in future stories). Ideally this can be resolved without divergent behavior between Py2 and Py3; however, if there is a conflict between Py2 and Py3 behavior, it needs to be resolved in favor of maintaining compatibility with existing dependencies on Py2.

Packages should be published to PyPI advertising compatibility with both Py2 and Py3.

Will need to build and publish separate Py2 and Py3 deb/rpm packages (due to hard dependency on Python interpreter).

11308 Tom Clegg (0 hours)
Support Python3 for arvados-python-client & command line utilities
0.5
11419
support text-mode open() in Python 3
Tom Clegg
3
11308
1
36
-c-a
5
11379
Review 11308-python3
Tom Clegg
3
11308
2
36
-c-a
5
11418
pass tests with python3
Tom Clegg
3
11308
3
36
-c-a
5
Subject: stuck keep fuse mounts not cleared by crunch-job
Tracker ID: Bug
Status: In Progress
Category: FUSE
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

crunch-job attempts to unmount any fuse filesystems that are mounted under $CRUNCH_TMP but it attempts to do so only using fusermount. Often on our system, this fails and a "umount -f <mount_point>" is required to make the node work again.

In addition, this often happens on multiple nodes at the same time - and by the time we have three nodes with wedged fuse mounts, they will rapidly fail all pending jobs. There seems to be no mechanism by which crunch dispatch can decide to stop trying to dispatch to a node that is broken.
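
The manual workaround described above can be sketched as follows. This is an illustration of the reporter's procedure (lazy fusermount first, then "umount -f"), not the fix adopted in crunch-job. The function names and the injectable `run` parameter are assumptions; "umount -f" normally requires root, and mount points containing spaces are not handled.

```python
import subprocess


def mounts_under(mount_output, prefix):
    """Return mount points from `mount`-style output that start with prefix.

    `mount` prints lines of the form:
        <source> on <mount_point> type <fstype> (<options>)
    so the mount point is the third whitespace-separated field.
    """
    points = []
    for line in mount_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2].startswith(prefix):
            points.append(fields[2])
    return points


def force_unmount(mount_point, run=subprocess.call):
    """Try `fusermount -u -z`; if it fails, fall back to `umount -f`.

    Returns the name of the method that succeeded, or None if both failed.
    """
    if run(["fusermount", "-u", "-z", mount_point]) == 0:
        return "fusermount"
    if run(["umount", "-f", mount_point]) == 0:
        return "umount -f"
    return None
```

A None result is the case where a dispatcher would want to stop sending work to the node.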

Here is the log from a job that suffered from this issue.

dispatching job z8ta6-8i9sb-8mp2qww92moa644 {"docker_image"=>"mercury/gatk-3.5", "min_nodes"=>1, "max_tasks_per_node"=>10, "keep_cache_mb_per_task"=>1280} to humgen-05-07 z8ta6-7ekkf-sa1q59632vhxov6 {"total_cpu_cores":32,"total_ram_mb":257867,"total_scratch_mb":788561}
2017-02-28_17:23:33 salloc: Granted job allocation 17536
2017-02-28_17:23:33 58397  Sanity check is `/usr/bin/docker ps -q`
2017-02-28_17:23:33 58397  sanity check: start
2017-02-28_17:23:33 58397  stderr starting: ['srun','--nodes=1','--ntasks-per-node=1','/usr/bin/docker','ps','-q']
2017-02-28_17:23:33 58397  sanity check: exit 0
2017-02-28_17:23:33 58397  Sanity check OK
2017-02-28_17:23:33 z8ta6-8i9sb-8mp2qww92moa644 58397  running from /var/www/arvados-api/shared/vendor_bundle/ruby/2.1.0/gems/arvados-cli-0.1.20170217221854/bin/crunch-job with arvados-cli Gem version(s) 0.1.20170217221854, 0.1.20161017193526, 0.1.20160503204200, 0.1.20151207150126, 0.1.20151023190001
2017-02-28_17:23:33 z8ta6-8i9sb-8mp2qww92moa644 58397  check slurm allocation
2017-02-28_17:23:33 z8ta6-8i9sb-8mp2qww92moa644 58397  node humgen-05-07 - 10 slots
2017-02-28_17:23:33 z8ta6-8i9sb-8mp2qww92moa644 58397  start
2017-02-28_17:23:34 z8ta6-8i9sb-8mp2qww92moa644 58397  clean work dirs: start
2017-02-28_17:23:34 z8ta6-8i9sb-8mp2qww92moa644 58397  stderr starting: ['srun','--nodelist=humgen-05-07','-D','/data/crunch-tmp','bash','-ec','-o','pipefail','mount -t fuse,fuse.keep | awk "(index(\\$3, \\"$CRUNCH_TMP\\") == 1){print \\$3}" | xargs -r -n 1 fusermount -u -z; sleep 1; rm -rf $JOB_WORK $CRUNCH_INSTALL $CRUNCH_TMP/task $CRUNCH_TMP/src* $CRUNCH_TMP/*.cid']
2017-02-28_17:23:34 z8ta6-8i9sb-8mp2qww92moa644 58397  stderr fusermount: failed to unmount /data/crunch-tmp/crunch-job/task/humgen-05-07.10.keep: Invalid argument
2017-02-28_17:23:34 z8ta6-8i9sb-8mp2qww92moa644 58397  stderr srun: error: humgen-05-07: task 0: Exited with exit code 123
2017-02-28_17:23:34 z8ta6-8i9sb-8mp2qww92moa644 58397  clean work dirs: exit 123
2017-02-28_17:23:34 salloc: Relinquishing job allocation 17536
dispatching job z8ta6-8i9sb-8mp2qww92moa644 {"docker_image"=>"mercury/gatk-3.5", "min_nodes"=>1, "max_tasks_per_node"=>10, "keep_cache_mb_per_task"=>1280} to humgen-04-02 z8ta6-7ekkf-ekzlxvozts92sqm {"total_cpu_cores":40,"total_ram_mb":193289,"total_scratch_mb":68302106}
2017-02-28_17:23:35 salloc: error: Unable to allocate resources: Requested nodes are busy
2017-02-28_17:23:35 salloc: Job allocation 17539 has been revoked.
dispatching job z8ta6-8i9sb-8mp2qww92moa644 {"docker_image"=>"mercury/gatk-3.5", "min_nodes"=>1, "max_tasks_per_node"=>10, "keep_cache_mb_per_task"=>1280} to humgen-05-03 z8ta6-7ekkf-1i1v5zotflg26jn {"total_cpu_cores":32,"total_ram_mb":257867,"total_scratch_mb":788561}
2017-02-28_17:23:36 salloc: Granted job allocation 17540
2017-02-28_17:23:36 58715  Sanity check is `/usr/bin/docker ps -q`
2017-02-28_17:23:36 58715  sanity check: start
2017-02-28_17:23:36 58715  stderr starting: ['srun','--nodes=1','--ntasks-per-node=1','/usr/bin/docker','ps','-q']
2017-02-28_17:23:36 58715  sanity check: exit 0
2017-02-28_17:23:36 58715  Sanity check OK
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  running from /var/www/arvados-api/shared/vendor_bundle/ruby/2.1.0/gems/arvados-cli-0.1.20170217221854/bin/crunch-job with arvados-cli Gem version(s) 0.1.20170217221854, 0.1.20161017193526, 0.1.20160503204200, 0.1.20151207150126, 0.1.20151023190001
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  check slurm allocation
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  node humgen-05-03 - 10 slots
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  start
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  clean work dirs: start
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  stderr starting: ['srun','--nodelist=humgen-05-03','-D','/data/crunch-tmp','bash','-ec','-o','pipefail','mount -t fuse,fuse.keep | awk "(index(\\$3, \\"$CRUNCH_TMP\\") == 1){print \\$3}" | xargs -r -n 1 fusermount -u -z; sleep 1; rm -rf $JOB_WORK $CRUNCH_INSTALL $CRUNCH_TMP/task $CRUNCH_TMP/src* $CRUNCH_TMP/*.cid']
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  stderr fusermount: failed to unmount /data/crunch-tmp/crunch-job/task/humgen-05-03.4.keep: Invalid argument
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  stderr srun: error: humgen-05-03: task 0: Exited with exit code 123
2017-02-28_17:23:38 z8ta6-8i9sb-8mp2qww92moa644 58715  clean work dirs: exit 123
2017-02-28_17:23:38 salloc: Relinquishing job allocation 17540
2017-02-28_17:23:38 close failed in file object destructor:
2017-02-28_17:23:38 sys.excepthook is missing
2017-02-28_17:23:38 lost sys.stderr
dispatching job z8ta6-8i9sb-8mp2qww92moa644 {"docker_image"=>"mercury/gatk-3.5", "min_nodes"=>1, "max_tasks_per_node"=>10, "keep_cache_mb_per_task"=>1280} to humgen-04-02 z8ta6-7ekkf-ekzlxvozts92sqm {"total_cpu_cores":40,"total_ram_mb":193289,"total_scratch_mb":68302106}
2017-02-28_17:23:40 salloc: Granted job allocation 17544
2017-02-28_17:23:40 58985  Sanity check is `/usr/bin/docker ps -q`
2017-02-28_17:23:40 58985  sanity check: start
2017-02-28_17:23:40 58985  stderr starting: ['srun','--nodes=1','--ntasks-per-node=1','/usr/bin/docker','ps','-q']
2017-02-28_17:23:40 58985  sanity check: exit 0
2017-02-28_17:23:40 58985  Sanity check OK
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  running from /var/www/arvados-api/shared/vendor_bundle/ruby/2.1.0/gems/arvados-cli-0.1.20170217221854/bin/crunch-job with arvados-cli Gem version(s) 0.1.20170217221854, 0.1.20161017193526, 0.1.20160503204200, 0.1.20151207150126, 0.1.20151023190001
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  check slurm allocation
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  node humgen-04-02 - 10 slots
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  start
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  clean work dirs: start
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  stderr starting: ['srun','--nodelist=humgen-04-02','-D','/data/crunch-tmp','bash','-ec','-o','pipefail','mount -t fuse,fuse.keep | awk "(index(\\$3, \\"$CRUNCH_TMP\\") == 1){print \\$3}" | xargs -r -n 1 fusermount -u -z; sleep 1; rm -rf $JOB_WORK $CRUNCH_INSTALL $CRUNCH_TMP/task $CRUNCH_TMP/src* $CRUNCH_TMP/*.cid']
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  stderr fusermount: failed to unmount /data/crunch-tmp/crunch-job/task/humgen-04-02.9.keep: Invalid argument
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  stderr srun: error: humgen-04-02: task 0: Exited with exit code 123
2017-02-28_17:23:41 z8ta6-8i9sb-8mp2qww92moa644 58985  clean work dirs: exit 123
2017-02-28_17:23:41 salloc: Relinquishing job allocation 17544
2017-02-28_17:23:41 close failed in file object destructor:
2017-02-28_17:23:41 sys.excepthook is missing
2017-02-28_17:23:41 lost sys.stderr

11209 Tom Clegg (0 hours)
stuck keep fuse mounts not cleared by crunch-job
11377
Honor subtype arg
Tom Clegg
3
11209
3
36
-c-a
5
11378
Warn that most users don't want --unmount-all
Tom Clegg
3
11209
3
36
-c-a
5
11292
Review 11209-unmount-replace
Lucas Di Pentima
375
11209
3
36
-c-a
5
11376
Review 11209-unmount-subtype
Lucas Di Pentima
375
11209
3
36
-c-a
5
11353
use arv-mount --unmount-all in crunch-job
Tom Clegg
3
11209
3
36
-c-a
5
11504
review 11209-crunch-unmount-all
Lucas Di Pentima
375
11209
3
36
-c-a
5
Subject: Reduce amount of parallelism in crunchstat-summary
Tracker ID: Bug
Status: In Progress
Category:
Points: 0.5
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:

Currently crunchstat-summary processes all components of a pipeline in parallel. This can mean hundreds of threads all competing for memory and cycles at the same time, leading to memory exhaustion in extreme cases.

We should dial this back to a reasonable number of threads for the machine and workload being processed.
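
One way to bound the parallelism, sketched with the standard library thread pool (Python 3). The function name, the `summarize` callable, and the default cap of 8 workers are assumptions for illustration; the actual crunchstat-summary code may be structured differently.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def summarize_all(components, summarize, max_workers=None):
    """Run summarize() on each component using at most max_workers threads.

    Results are returned in the same order as the input components.
    """
    if max_workers is None:
        # Hypothetical default: scale with the machine, but cap the total.
        max_workers = min(8, os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize, components))
```

With a fixed pool, a pipeline with hundreds of components still holds only max_workers summaries in flight at once, which bounds peak memory.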

10359 Tom Morris (0 hours)
Reduce amount of parallelism in crunchstat-summary
0.5
10379
Review 10359-crunchstat-summary-serial
Tom Morris
388
10359
2
36
-c-a
5
Subject: [CWL] Intermediary collection handling can be specified
Tracker ID: Feature
Status: In Progress
Category:
Points: 3.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Peter Amstutz
Project: Arvados
Release:

Background

Workflows produce a lot of intermediate collections. For production workflows that are rarely re-run, the job reuse benefits are minimal; the intermediate collections are just clutter and take up storage space that the user would rather not pay for. Controlling their retention is also necessary to support a roll-in/roll-out use case where a cluster only has sufficient storage to store a few complete runs and input and output data are transferred from/to somewhere else.

Requirements

Should be able to specify default behavior (retain or trash) but override behavior for output of specific steps.

The final output is always retained. Input should be unaffected.

Intermediate collections need to live as long as they are in use by downstream steps. When intermediate collections are no longer needed by downstream steps, they should be trashed.

Design

arvados-cwl-runner submits container requests; when the container completes a collection is created and reported in output_uuid. Arvados-cwl-runner can then set the trash_at field on the collection.

  • API server
    • Add an "output_ttl" field to container request. This value is in seconds. When the output collection is created for the container request, it should have trash_at and delete_at set to now + output_ttl (assume that tokens are issued with expiry times less than trash_at). A value of <= 0 means don't set trash_at.
    • Add tests.
    • Update documentation
  • CWL runner
    • When "intermediate output TTL" is provided, container requests are submitted with output_ttl set
    • By default, output_ttl is None or 0 (consistent with current behavior).
    • When workflow completes successfully, everything marked as intermediate should be trashed immediately. Do not do this on workflow failure.
    • Provide command line option to indicate that things shouldn't be deleted immediately
    • Custom Arvados CWL hint to specify treatment of individual step outputs
    • Update documentation
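
The output_ttl rule in the design above can be sketched as a small pure function. The API server is a Rails application, so this Python version only illustrates the arithmetic; the function name and the injectable `now` argument are assumptions.

```python
from datetime import datetime, timedelta, timezone


def output_expiry(output_ttl, now=None):
    """Return (trash_at, delete_at) for a new output collection.

    output_ttl is in seconds. None or a value <= 0 means the collection
    gets no expiry, so (None, None) is returned.
    """
    if output_ttl is None or output_ttl <= 0:
        return (None, None)
    if now is None:
        now = datetime.now(timezone.utc)
    expiry = now + timedelta(seconds=output_ttl)
    # Per the design, trash_at and delete_at are both now + output_ttl.
    return (expiry, expiry)
```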
11100 Peter Amstutz (0 hours)
[CWL] Intermediary collection handling can be specified
3.0
11389
Review a-c-r changes
Tom Clegg
3
11100
1
36
-c-a
5
11370
Update cwl runner after API feature is merged
Peter Amstutz
47
11100
1
36
-c-a
5
11372
Add output_ttl field
Tom Clegg
3
11100
3
35
-c-a
5
11388
Review 11100-cr-output-ttl
Peter Amstutz
47
11100
3
36
-c-a
5
Subject: [Workbench][Crunch2] Provenance graph for Container Request
Tracker ID: Story
Status: In Progress
Category:
Points: 2.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Lucas Di Pentima
Project: Arvados
Release: Crunch v2

Extract the dependency graph from the Container Request and pass it to the existing code that formats graphs with GraphViz, reusing the rest of the existing infrastructure.

10111 Lucas Di Pentima (0 hours)
[Workbench][Crunch2] Provenance graph for Container Request
2.0
11381
Review 10111-cr-provenance-graph
Lucas Di Pentima
375
10111
2
36
-c-a
5
Subject: [Crunch2] Throttle logs
Tracker ID: Feature
Status: In Progress
Category:
Points: 2.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Radhika Chippada
Project: Arvados
Release:

Implement log throttling in crunch-run. Behavior should be equivalent to services/api/lib/crunch_dispatch.rb#rate_limit

Use same configuration parameters as currently in API server config:

  # These two settings control how frequently log events are flushed to the
  # database.  Log lines are buffered until either crunch_log_bytes_per_event
  # has been reached or crunch_log_seconds_between_events has elapsed since
  # the last flush.
  crunch_log_bytes_per_event: 4096
  crunch_log_seconds_between_events: 1

  # The sample period for throttling logs, in seconds.
  crunch_log_throttle_period: 60

  # Maximum number of bytes that job can log over crunch_log_throttle_period
  # before being silenced until the end of the period.
  crunch_log_throttle_bytes: 65536

  # Maximum number of lines that job can log over crunch_log_throttle_period
  # before being silenced until the end of the period.
  crunch_log_throttle_lines: 1024

  # Maximum bytes that may be logged by a single job.  Log bytes that are
  # silenced by throttling are not counted against this total.
  crunch_limit_log_bytes_per_job: 67108864

The above parameters should be published in the discovery document for use by crunch-run.
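
A minimal sketch of the per-period throttling described by crunch_log_throttle_period, crunch_log_throttle_bytes and crunch_log_throttle_lines. It is not a port of crunch_dispatch.rb#rate_limit: it omits event buffering and the per-job byte limit, and the class name and injectable clock are assumptions. crunch-run itself is written in Go; Python is used here only for illustration.

```python
import time


class LogThrottle(object):
    """Allow log lines until a byte or line budget is used up in a period."""

    def __init__(self, period=60, max_bytes=65536, max_lines=1024,
                 clock=time.time):
        self.period = period
        self.max_bytes = max_bytes
        self.max_lines = max_lines
        self.clock = clock
        self.window_start = clock()
        self.bytes_seen = 0
        self.lines_seen = 0

    def allow(self, line):
        """Return True if this line (a bytes object) may be logged."""
        now = self.clock()
        if now - self.window_start >= self.period:
            # A new sample period begins: reset both budgets.
            self.window_start = now
            self.bytes_seen = 0
            self.lines_seen = 0
        self.bytes_seen += len(line)
        self.lines_seen += 1
        return (self.bytes_seen <= self.max_bytes and
                self.lines_seen <= self.max_lines)
```

Once either budget is exceeded, every further line in the same period is rejected, which gives the "silenced until the end of the period" behavior.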

8019 Radhika Chippada (0 hours)
[Crunch2] Throttle logs
2.0
11478
Review 8019-crunchrun-log-throttle
Peter Amstutz
47
8019
2
36
-c-a
5
Subject: [Workbench] Add ability to disable job reuse when running a workflow
Tracker ID: Story
Status: In Progress
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Lucas Di Pentima
Project: Arvados
Release:
  • Either "re-run container request" button should present a dialog box asking the user if they want job reuse or not, or there should be 2 separate buttons that are clearly labeled
  • Must recognize a container request for "arvados-cwl-runner" and add "--disable-reuse" to the command field.
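
The second requirement can be sketched as a function over the command list. Workbench is a Rails application, so this Python version is only an illustration; the function name is hypothetical, and it assumes the runner appears as the first element of the command by its bare name (not a full path).

```python
def disable_reuse(command, runner="arvados-cwl-runner", flag="--disable-reuse"):
    """Return a copy of command with the flag added after the runner name.

    Commands for other programs, and commands that already carry the
    flag, are returned unchanged.
    """
    if not command or command[0] != runner or flag in command:
        return list(command)
    return [command[0], flag] + list(command[1:])
```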
11185 Lucas Di Pentima (0 hours)
[Workbench] Add ability to disable job reuse when running a workflow
1.0
11471
Review 11185-wb-disable-reuse
Radhika Chippada
72
11185
1
36
-c-a
5
Subject: [VG] Automated download from Azure / upload to Keep
Tracker ID: Story
Status: New
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:
11468 Tom Morris (0 hours)
[VG] Automated download from Azure / upload to Keep
Subject: [API] container_requests#update alternately responds 422 or 404 for no apparent reason
Tracker ID: Bug
Status: New
Category: API
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

Seems like a race condition -- perhaps related to permission lookups or other container request updates that are happening at the ~same time?

11470 Tom Clegg (0 hours)
[API] container_requests#update alternately responds 422 or 404 for no apparent reason
11479
Review
Peter Amstutz
47
11470
1
36
-c-a
5
11526
Review 11470-update-task-fields
Tom Clegg
3
11470
3
36
-c-a
5
Subject: Containers seem to run more than once, which isn't supposed to happen
Tracker ID: Bug
Status: New
Category: Crunch
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

Example: tb05z-dz642-eie1eal1059y9bb

11190 Tom Clegg (0 hours)
Containers seem to run more than once, which isn't supposed to happen
11263
Review
Peter Amstutz
47
11190
1
36
-c-a
5
Subject: [DOC] add cookbook section with code snippets
Tracker ID: Bug
Status: New
Category:
Points: 0.5
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:

Provide more examples of API calls with real use cases (e.g. users/links/groups, or collections/projects) by adding a cookbook section with code snippets to doc.arvados.org.

10349 Tom Morris (0 hours)
[DOC] add cookbook section with code snippets
0.5
10381
Review
Nico César
288
10349
1
36
-c-a
5
Subject: Nginx config should speak JSON when returning its own response for unproxyable API requests
Tracker ID: Bug
Status: New
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Nico César
Project: Arvados
Release:

Also update the install documentation to explain to customers how to do this.

11136 Nico César (0 hours)
Nginx config should speak JSON when returning its own response for unproxyable API requests
1.0
Subject: [Crunch2][Workbench]workflow#show page (similar to pipeline_template#show page)
Tracker ID: Story
Status: In Progress
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Radhika Chippada
Project: Arvados
Release: Crunch v2

Include editable name and description field.
Also should include a "Run this workflow" button.

10112 Radhika Chippada (0 hours)
[Crunch2][Workbench]workflow#show page (similar to pipeline_template#show page)
1.0
11472
Review
Lucas Di Pentima
375
10112
1
36
-c-a
5
Subject: [Documentation] Write a Crunch2 wiki page
Tracker ID: Story
Status: New
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:

It should include:

  • Planned features
  • Planned architecture
  • Planned timeline
  • Note that we'll continue to support Arvados JSON pipelines
  • Links to implementation stories
7543 Tom Morris (0 hours)
[Documentation] Write a Crunch2 wiki page
1.0
11390
Review
Tom Clegg
3
7543
1
36
-c-a
5
Subject: [Documentation] Write a Crunch wiki page
Tracker ID: Story
Status: New
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:

It should provide a useful overview of the entire Crunch system, just like the Keep page.

7542 Tom Morris (0 hours)
[Documentation] Write a Crunch wiki page
1.0
11391
Review
Tom Clegg
3
7542
1
36
-c-a
5
Subject: Update Pipeline Optimization wiki with CWL/Crunchv2
Tracker ID: Story
Status: New
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Morris
Project: Arvados
Release:

update https://dev.arvados.org/projects/arvados/wiki/Pipeline_Optimization with CWL references instead of pipeline templates

11285 Tom Morris (0 hours)
Update Pipeline Optimization wiki with CWL/Crunchv2
11392
Review
Bryan Cosca
189
11285
1
36
-c-a
5
Subject: [DOC] update documentation to list ubuntu1604
Tracker ID: Feature
Status: New
Category:
Points: 0.5
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:
10988 Tom Clegg (0 hours)
[DOC] update documentation to list ubuntu1604
0.5
11482
Review
Peter Amstutz
47
10988
1
36
-c-a
5
Subject: [Nodemanager] Merge cloud_environment support to upstream libcloud
Tracker ID: Story
Status: Resolved
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Peter Amstutz
Project: Arvados
Release:

Fix tests so this pull request can be merged:

https://github.com/apache/libcloud/pull/969

11350 Peter Amstutz (0 hours)
[Nodemanager] Merge cloud_environment support to upstream libcloud
1.0
11484
Pass libcloud test suite
Peter Amstutz
47
11350
3
36
-c-a
5
Subject: [Crunch2][Workbench] Improve formatting of /workflow page
Tracker ID: Story
Status: Resolved
Category:
Points: 1.0
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Radhika Chippada
Project: Arvados
Release:

Remove non-functional checkbox
Add Run button
Remove definition
Make sure that paging or infinite scrolling works (if one already does, no work is required)

Possible future enhancement: # of steps and names of first few steps

11450 Radhika Chippada (0 hours)
[Crunch2][Workbench] Improve formatting of /workflow page
1.0
11477
Review 11450-workflows-page
Radhika Chippada
72
11450
3
36
-c-a
5
Subject: SSO packages for Centos7 don't build
Tracker ID: Bug
Status: Resolved
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

https://ci.curoverse.com/job/build-packages-sso/38/console

Error: /arvados/=/var/www/arvados-sso/current: Unable to figure out package name from fpm results:

{:timestamp=>"2017-04-12T04:21:55.561577+0000", :message=>"Cannot copy file, the destination path is probably a directory and I attempted to write a file.", :path=>"/var/www/arvados-sso/current/./LICENCE", :staging=>"/tmp/package-dir-staging20170412-1012-3839k5", :level=>:error, "method"=>"input"} {:timestamp=>"2017-04-12T04:22:00.044616+0000", :message=>"Process failed: rpmbuild failed (exit code 1). Full command was:[\"rpmbuild\", \"-bb\", \"--define\", \"buildroot /tmp/package-rpm-build20170412-1012-27awtk/BUILD\", \"--define\", \"_topdir /tmp/package-rpm-build20170412-1012-27awtk\", \"--define\", \"_sourcedir /tmp/package-rpm-build20170412-1012-27awtk\", \"--define\", \"_rpmdir /tmp/package-rpm-build20170412-1012-27awtk/RPMS\", \"--define\", \"_tmppath /tmp\", \"/tmp/package-rpm-build20170412-1012-27awtk/SPECS/arvados-sso-server.spec\"]", :level=>:error}

fpm --maintainer=Ward Vandewege <ward@curoverse.com> -s dir -t rpm -n arvados-sso-server --vendor Curoverse, Inc. -v 0.1.20170412042041.5165596 --iteration 1 --depends postgresql-devel --exclude var/www/arvados-sso/current/.bundle/ --exclude var/www/arvados-sso/current/packages/ --exclude var/www/arvados-sso/current/vendor/cache/ --iteration= --after-install /tmp/arvados-sso-server-CddFWG0X.scripts/postinst --before-remove /tmp/arvados-sso-server-CddFWG0X.scripts/prerm --after-remove /tmp/arvados-sso-server-CddFWG0X.scripts/postrm -x var/www/arvados-sso/current/.git -x var/www/arvados-sso/current/packages -x var/www/arvados-sso/current/tmp -x var/www/arvados-sso/current/log -x var/www/arvados-sso/current/coverage -x var/www/arvados-sso/current/Capfile* -x var/www/arvados-sso/current/config/deploy* -x var/www/arvados-sso/current/config/application.yml -x var/www/arvados-sso/current/config/database.yml -x var/www/arvados-sso/current/vendor/bundle --url=https://arvados.org --description=Arvados SSO server - Arvados is a free and open source platform for big data science. --license=Expat license /arvados/LICENCE=/var/www/arvados-sso/current/LICENCE /arvados/=/var/www/arvados-sso/current

ERROR: build packages on arvados/build:centos7 failed with exit status 1
11459 Tom Clegg (0 hours)
SSO packages for Centos7 don't build
11481
Review
Peter Amstutz
47
11459
1
36
-c-a
5
Subject: [Keep-web] Support CORS requests with Range headers
Tracker ID: Bug
Status: Resolved
Category: Keep
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Tom Clegg
Project: Arvados
Release:

Background

The Workbench log viewer uses an ajax request to retrieve log data. It uses the POST method so it can include the api_token in the body. If the log is larger than the configured limit (log_viewer_max_bytes), it also adds a Range header.

Problem

Range is not a "safe" header for CORS, so the browser performs a pre-flight OPTIONS request, to which keep-web responds 405, so the request fails.

Solution

keep-web should respond to OPTIONS requests with 200 status and CORS headers:
  • Access-Control-Allow-Origin: *
  • Access-Control-Max-Age: 86400
  • Access-Control-Allow-Headers: Range
  • Access-Control-Allow-Methods: GET, POST
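
The proposed pre-flight response can be sketched as a function that maps a request method to a status and headers. keep-web is written in Go, so this Python version only illustrates the response shape; the function name is hypothetical, and the header values are the ones listed above.

```python
def preflight_response(method):
    """Return (status, headers) for a CORS pre-flight request.

    Returns None for any method other than OPTIONS, meaning the
    request should be handled by the normal GET/POST code path.
    """
    if method != "OPTIONS":
        return None
    headers = {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Max-Age": "86400",
        "Access-Control-Allow-Headers": "Range",
        "Access-Control-Allow-Methods": "GET, POST",
    }
    return (200, headers)
```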
11509 Tom Clegg (0 hours)
[Keep-web] Support CORS requests with Range headers
11512
Review 11509-keep-web-cors-range
Tom Clegg
3
11509
3
36
-c-a
5
Subject: [Workbench] In collection#show page, display a tooltip on disabled trash icon, disabled pencil icon, and disabled upload tab. 
Tracker ID: Story
Status: Resolved
Category:
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Radhika Chippada
Project: Arvados
Release:
11465 Radhika Chippada (0 hours)
[Workbench] In collection#show page, display a tooltip on disabled trash icon, disabled pencil icon, and disabled upload tab. 
11476
Review 11465-disabled-collection-file-tooltips
Radhika Chippada
72
11465
3
36
-c-a
5
Subject: Flaky test on sdk/python/tests/test_arv_get.py
Tracker ID: Bug
Status: Resolved
Category: SDKs
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Lucas Di Pentima
Project: Arvados
Release:
======================================================================
FAIL: test_get_collection_manifest (tests.test_arv_get.ArvadosGetTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/1/jenkins/workspace/run-tests-remainder/sdk/python/tests/test_arv_get.py", line 81, in test_get_collection_manifest
    self.assertEqual(self.col_manifest, f.read())
AssertionError: u'. 37b51d194a7513e45b56f6524f2d51f2+3+A9a83a0d33abbdec368acb632ec53e8186758bb3f@5903577a acbd18db4cc2f85cedef654fccc4a4d8+3+Ac1a9a2e550cfe57511b2e70c4dfa1d05a23b75e8@5903577a 0:3:bar.txt 3:3:foo.txt\n./subdir 73feffa4b7f6bb68e44cf984c85f6e88+3+A732d02797529f59e880374c7c91d81be527f3307@5903577a 0:3:baz.txt\n' != '. 37b51d194a7513e45b56f6524f2d51f2+3+A0d0f09d1119414a4ae7d56226f7cc79cd84b834f@5903577b acbd18db4cc2f85cedef654fccc4a4d8+3+A0afbb90b13fb7ccfd24ed7ce5923308dcd9863bb@5903577b 0:3:bar.txt 3:3:foo.txt\n./subdir 73feffa4b7f6bb68e44cf984c85f6e88+3+Aeb981cdb15a3ab7330ff42673b5acd6a92e5489a@5903577b 0:3:baz.txt\n'
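
The two manifests in the failure above differ only in their "+A<signature>@<timestamp>" permission hints (the timestamps are 5903577a and 5903577b, one second apart). A minimal sketch of a comparison that ignores those hints follows; the regular expression and function name are assumptions, and this is not necessarily the fix applied in the review branch.

```python
import re

# Matches a permission hint such as "+A9a83...@5903577a" on a block locator.
_PERMISSION_HINT = re.compile(r"\+A[0-9a-f]+@[0-9a-f]+")


def strip_signatures(manifest_text):
    """Return manifest_text with permission hints removed from locators."""
    return _PERMISSION_HINT.sub("", manifest_text)
```

Comparing strip_signatures(expected) with strip_signatures(actual) makes the test independent of when each signature was issued.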
11502 Lucas Di Pentima (0 hours)
Flaky test on sdk/python/tests/test_arv_get.py
11506
Review 11502-unstripped-manifest-fix
Lucas Di Pentima
375
11502
3
36
-c-a
5
Subject: [Crunch] Socket timed out on send/recv operation causes pipeline failure
Tracker ID: Bug
Status: Resolved
Category: Crunch
Points:
Estimation (hours):
Spent Time: 0.0
Remaining (hours):
Assignee: Peter Amstutz
Project: Arvados
Release:

This pipeline uses one_task_per_input_file. There are over 800 input files. Each task takes about 20 minutes to run. Soon after one of the tasks finishes successfully, this error occurs:

...
Tue Oct 7 17:02:42 2014 9tee4-8i9sb-blb48qaou8uatsi 8876 21 stderr srun: error: slurm_receive_msgs: Socket timed out on send/recv operation
Tue Oct 7 17:02:42 2014 9tee4-8i9sb-blb48qaou8uatsi 8876 21 stderr srun: error: Task launch for 98.24 failed on node compute0: Socket timed out on send/recv operation
Tue Oct 7 17:02:42 2014 9tee4-8i9sb-blb48qaou8uatsi 8876 21 stderr srun: error: Application launch failed: Socket timed out on send/recv operation

Seen on 9tee4 on pipeline instance: https://workbench.9tee4.arvadosapi.com/pipeline_instances/9tee4-d1hrv-5zc2cqp2yzhcmd8

4124 Peter Amstutz (0 hours)
[Crunch] Socket timed out on send/recv operation causes pipeline failure
11483
Review
Tom Clegg
3
4124
1
36
-c-a
5
11485
Review
4124
5
36
-c-a
5
April 26, 2017 13:41:46.04767298698425293 +0000