Project

General

Profile

Actions

Upgrading to Containers API

The "containers" API is the recommended way to submit compute work to Arvados. It supersedes the "jobs" API, which is deprecated.

Benefits over the "jobs" API

  • Simpler and more robust execution with fewer points of failure
  • Automatic retry for containers that fail to run to completion due to infrastructure errors
  • Scales to thousands of simultaneous containers
  • Able to support alternate schedulers/dispatchers in addition to slurm
  • Improved logging, different streams logs/metrics stored in different files in the log collection
  • Records more upfront detail about the compute node, and additional metrics (such as available disk space over the course of the container run)
  • Better behavior when deciding whether to reuse past work -- pick the oldest container that matches the criteria
  • Can reuse running containers between workflows, cancelling a workflow will not cancel containers that are shared with other workflows
  • Supports setting time-to-live on intermediate output collections for automatic cleanup
  • Supports "secret" inputs, suitable for passwords or access tokens, which are hidden from the API responses and logs, and forgotten after use
  • Does not require "git" for dispatching work

Differences from the "jobs" API

Containers cannot reuse jobs (but can reuse other containers)

Uses the service crunch-dispatch-slurm instead of crunch-dispatch.rb

Non-CWL Arvados "pipeline templates" are not supported with containers. Pipeline templates should be rewritten in CWL and registered as "Workflows".

The containers APIs is incompatible with the jobs API, code which integrates with the "jobs" API must be updated to work with containers

Containers have network access disabled by default

The keep mount only exposes collections which are explicitly listed as inputs

Migrating to "containers" API

Run your workflows using arvados-cwl-runner --api=containers (only necessary if both the jobs and containers APIs are enabled, if the jobs API is disabled, it will use the containers API automatically)

Register your workflows so they can be run from workbench using arvados-cwl-runner --api=containers --create-workflow

Read Migrating running CWL on jobs API to containers API

Use arv:APIRequirement: {} in the requirements section of your CWL file to enable network access for the container (see Arvados CWL Extensions)

For examples on how to manage container requests with the Python SDK, see https://doc.arvados.org/sdk/python/cookbook.html

Updated by Peter Amstutz about 6 years ago ยท 4 revisions