Crunch1-in-Crunch2 (DRAFT)

Details of how Crunch2 runs jobs that were written for Crunch1.

See

Background

In order for Crunch2 to replace Crunch1, Crunch2 must:
  • run jobs that rely on Crunch1's API, like
    • run-command
    • arv-run (via run-command)
    • existing tutorial/example jobs
    • user scripts based on existing tutorials
  • accept job submissions from clients using the Crunch1 API, like
    • arv-run-pipeline-instance
    • user scripts
  • maintain the ability to view progress of Crunch1 jobs using Crunch1 clients, like
    • Workbench
    • arv-run-pipeline-instance
Crunch1 jobs rely on the following pieces:
  • Keep mount available within the container
  • Some environment variables (CRUNCH_SRC, ARVADOS_API_*, etc)
  • jobs and job_tasks APIs for executing work on multiple nodes

Approach

Submitting a job

Translate the incoming Crunch1 job submission to a Crunch2 job request.
  • The container and command given in the job request are determined by the server configuration: the Crunch1 API doesn't specify which crunch-job, or which version of it, is to be used.
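
The translation step could look something like the following. This is only a sketch: the JobRequest fields, the configuration keys, and the image and command names are all hypothetical placeholders, not part of any existing API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical server configuration. The Crunch1 API does not say which
# crunch-job container or command to use, so the server supplies both.
SERVER_CONFIG: Dict[str, Any] = {
    "crunch1_runner_image": "example/crunch1-runner",  # assumed name
    "crunch1_runner_command": ["crunch1-runner"],      # assumed name
}


@dataclass
class JobRequest:
    """Assumed minimal shape of a Crunch2 job request."""
    container_image: str
    command: List[str]
    inputs: Dict[str, Any] = field(default_factory=dict)


def translate_crunch1_job(job: Dict[str, Any],
                          config: Dict[str, Any] = SERVER_CONFIG) -> JobRequest:
    """Build a Crunch2 job request for an incoming Crunch1 job submission."""
    return JobRequest(
        # Container and command come from the server, never from the client.
        container_image=config["crunch1_runner_image"],
        command=list(config["crunch1_runner_command"]),
        # The submitted Crunch1 job is passed through unchanged as an input.
        inputs={"crunch1_job": dict(job)},
    )
```

Here the client-supplied attributes (script, script_parameters, and so on) travel as an input to the runner, so the runner can interpret them the way crunch-job would.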

Create the job request using the JobRequests controller.

Create a job record just as before, but set a flag so crunch-dispatch doesn't try to run it. (This could be implemented as a "Proxy" state.)

Running a job

Once it has been translated to a job request, a Crunch1 job is merely a Crunch2 job (the "parent") which acts as any "workflow runner" would: it submits additional job requests of its own (the "children"). Its notable difference is that it uses an additional communication channel not normally used by Crunch2 jobs:
  • The children perform Arvados API requests (jobs.get, job_tasks.get, job_tasks.update, and job_tasks.create) to get information about themselves and to ask the parent to submit more job requests.
  • The parent performs Arvados API requests (presumably job_tasks.list and job_tasks.get) to get the information submitted by the children.
The Crunch1 runner implements the same algorithm as crunch-job, but with a few simplifying restrictions.
  • It has only one way to run tasks: submit a job request [1].
  • It doesn't construct docker command lines, or run docker itself: instead, it writes Crunch2 job requests.
  • It doesn't retry tasks. Crunch2 is responsible for this.
  • It doesn't look for node failure. Crunch2 is responsible for this too.
  • It doesn't copy stderr to Keep. Crunch2 is responsible for this too.
  • It doesn't know anything about slurm.
With all that stuff removed, the Crunch1 runner algorithm reduces to something like this:
  • Submit a job_request for "task 0".
  • When the assigned job succeeds, look for new job_tasks that it submitted. Add these to a list of "pending" tasks.
  • Take min(sequence) across all pending job_tasks. Remove the job_tasks with that sequence from "pending" and submit them as job_requests.
  • Repeat until all submitted job_requests have been assigned and finished, and "pending" is empty.
  • Collate task outputs into a job output.
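
The loop above can be sketched as follows. This is a simplified, synchronous illustration, not the runner itself: the Task shape and the run_task callable (standing in for "submit a job request, wait for the assigned job to succeed, and list the job_tasks it created") are hypothetical, and collation is reduced to returning the outputs in order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    """Assumed minimal shape of a job_task."""
    sequence: int
    parameters: Dict[str, str] = field(default_factory=dict)


# Hypothetical stand-in for submitting a job request and waiting for it.
# Returns the task's output and any new job_tasks the child created.
RunTask = Callable[[Task], Tuple[str, List[Task]]]


def run_crunch1_job(run_task: RunTask) -> List[str]:
    pending: List[Task] = []
    outputs: List[str] = []
    batch: List[Task] = [Task(sequence=0)]  # "task 0"
    while batch:
        for task in batch:
            output, new_tasks = run_task(task)
            outputs.append(output)
            # New job_tasks submitted by the child go onto the pending list.
            pending.extend(new_tasks)
        if not pending:
            break
        # Move the tasks with the lowest sequence out of "pending".
        lowest = min(t.sequence for t in pending)
        batch = [t for t in pending if t.sequence == lowest]
        pending = [t for t in pending if t.sequence != lowest]
    # Collation is left to the caller in this sketch.
    return outputs
```

A real runner would submit each batch concurrently and handle failed job requests; both are left out here to keep the control flow visible.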
TBD:
  • If a child job (formerly "job_task") sets the parent job's (formerly "job's") output attribute, that child job cannot be reused to fulfill a future job request. Either this should be handled transparently, or this use case should be prohibited (at the cost of breaking some Crunch1 jobs).
  • If a child job reads the parent job record (which is nearly universal among Crunch1 jobs), that child job cannot be reused to fulfill a future job request except where the future job request would return the same values. This could be ensured by copying the parent job's Crunch1 job record into the Crunch2 job request's inputs -- however, this would effectively prevent any Crunch1 job from reusing tasks across non-identical jobs.
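
To illustrate the trade-off in the second item, here is a minimal sketch. It assumes, purely for illustration, that Crunch2 decides reuse by comparing a digest of a job request's command and inputs; the reuse_key function and the "crunch1_job" input name are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def reuse_key(command: List[str], inputs: Dict[str, Any]) -> str:
    """Hypothetical reuse key: a digest of everything the child can observe."""
    canonical = json.dumps({"command": command, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


task_inputs = {"input": "file1"}
parent_a = {"script": "align", "script_parameters": {"threads": 4}}
parent_b = {"script": "align", "script_parameters": {"threads": 8}}

# With the parent job record copied into the inputs, a child is only
# reusable by a job request whose parent record is identical.
key_a = reuse_key(["run-task"], {**task_inputs, "crunch1_job": parent_a})
key_b = reuse_key(["run-task"], {**task_inputs, "crunch1_job": parent_b})
```

Under this assumption key_a and key_b differ even though the two tasks process the same input, which is exactly the loss of cross-job reuse described above.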

[1] This means "local dev jobs" will require a dev/transient install of the Crunch infrastructure. This is probably a good thing overall, but does mean we need to do the work of making the transient infrastructure spring up quickly and easily.

Getting job status

Clients must be able to get the current status of a Crunch1 job (i.e., one that was submitted with the Crunch1 API) by using the Crunch1 "list" and "get" APIs. This is necessary for existing clients (including Crunch1 jobs themselves) to continue working without modifications after Crunch2 has replaced Crunch1 as the execution engine.

Clients must be able to get job status for both Crunch1- and Crunch2-submitted jobs using only the Crunch2 "list" and "get" APIs. This makes it possible to migrate Workbench from Crunch1 to Crunch2 without losing the ability to see old jobs.

However, it is not necessary for Crunch1 clients to see Crunch2 jobs.

When a job is/was executed by Crunch2, the Crunch2 API is the source of truth about its state. Therefore:
  • Crunch1 APIs that modify a job must also modify the corresponding Crunch2 record(s). This might be the empty set, though: crunch-job's replacement will use Crunch2 directly rather than using Crunch1's jobs.update API to update job output/progress, for example.
  • Crunch1 APIs that retrieve a job must read the Crunch2 record.
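
As a rough sketch of the read path, a Crunch1 "get" could overlay the Crunch2 record on the stored Crunch1 job record before responding. The attribute names copied here (state, output, log) and the dictionary representation are assumptions for illustration only.

```python
from typing import Any, Dict

# Assumed names of the attributes for which Crunch2 is the source of truth.
CRUNCH2_OWNED_ATTRS = ("state", "output", "log")


def crunch1_job_view(job_record: Dict[str, Any],
                     crunch2_record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a Crunch1 "get" response whose state comes from Crunch2."""
    view = dict(job_record)  # copy, so the stored Crunch1 record is untouched
    for attr in CRUNCH2_OWNED_ATTRS:
        if attr in crunch2_record:
            view[attr] = crunch2_record[attr]
    return view
```

Attributes that only exist in the Crunch1 record (script, script_parameters, and so on) pass through unchanged, so existing clients see the shape they expect.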

job_tasks APIs

The job_tasks APIs are used by Crunch1 jobs to communicate between crunch-job and the processes it runs on allocated nodes. The API server doesn't need to touch these.