Feature #7751

closed

[Crunch] [SDKs] [FUSE] Convenient way to write job output to Keep via writable arv-mount, as an alternative to staging output on scratch and then copying when finished.

Added by Tom Clegg over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: SDKs
Target version:
Story points: 0.0

Description

It is already possible for a crunch program to do this: start arv-mount in writable mode, write files into a new directory, and use the resulting PDH as the task output.

This story makes it convenient to do this, i.e., the crunch script itself shouldn't need to do anything more complicated than this:

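import os
import arvados.crunch  # also binds the top-level arvados package

# Proposed helper: returns an object describing the task's writable output directory.
outputdir = arvados.crunch.task_output_dir()

with open(os.path.join(outputdir.path, 'foo'), 'w') as f:
    f.write('foo')

arvados.current_task().set_output(outputdir)
# or perhaps just: outputdir.save()
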
Possible implementation approach:
  • crunch-job sets up a writable FUSE mount for every job task. If the job doesn't do anything with it, nothing gets written. The mount does not include any read or write access to existing collections beyond the by-PDH access already needed by jobs.
  • Add SDK functions that figure out (by looking at environment vars, etc.) where the output directory is supposed to go; push arv-mount's magic buttons [1] to get the PDH of the finished collection; and set the task output to that PDH.

[1] Read JSON from {dir}/.arvados#collection
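
The SDK side of this approach could look roughly like the sketch below. It is not the actual SDK code. The environment variable name EXAMPLE_TASK_OUTPUT_DIR and the function names are hypothetical, and the sketch assumes the JSON read from {dir}/.arvados#collection is a collection record with a portable_data_hash key.

```python
import json
import os

# Hypothetical variable name, used only for illustration; the real name
# would be whatever crunch-job exports for the writable mount.
OUTPUT_DIR_ENV = "EXAMPLE_TASK_OUTPUT_DIR"


def find_output_dir(environ=None):
    """Return the task output directory named in the environment."""
    environ = os.environ if environ is None else environ
    try:
        return environ[OUTPUT_DIR_ENV]
    except KeyError:
        raise RuntimeError(
            "%s is not set; no writable output mount available" % OUTPUT_DIR_ENV)


def read_collection_snapshot(output_dir):
    """Read the JSON document exposed at {dir}/.arvados#collection."""
    snapshot_path = os.path.join(output_dir, ".arvados#collection")
    with open(snapshot_path) as f:
        return json.load(f)


def output_pdh(output_dir):
    """Return the PDH of the collection backing output_dir.

    Assumption: the snapshot JSON has a 'portable_data_hash' key.
    """
    return read_collection_snapshot(output_dir)["portable_data_hash"]
```

A crunch script would then call output_pdh(find_output_dir()) after writing its files and pass the result to whatever call sets the task output.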


Subtasks 7 (0 open, 7 closed)

Task #7784: Review 7751-mount-tmp | Resolved | Peter Amstutz | 11/23/2015
Task #7792: CLI argument to add writable tmp collection to magic dir | Resolved | Tom Clegg | 11/16/2015
Task #7827: Tests | Resolved | Tom Clegg | 11/16/2015
Task #7865: Make .arvados#collection work reliably in jenkins | Resolved | Tom Clegg | 11/16/2015
Task #7793: Python SDK helpers | Resolved | Tom Clegg | 11/26/2015
Task #7872: Review 7751-crunch-fuse-output | Resolved | Tom Clegg | 11/16/2015
Task #7873: Try new crunch-job on staging | Resolved | Tom Clegg | 11/16/2015

Related issues

Blocks Arvados - Feature #7847: [SDKs] Update run-command to use arv-mount --mount-tmp instead of staging directory | Rejected