Project

General

Profile

Upgrading to Containers API » History » Revision 3

Revision 2 (Peter Amstutz, 10/17/2018 06:51 PM) → Revision 3/4 (Peter Amstutz, 10/17/2018 06:52 PM)

h1. Upgrading to Containers API 

 The "containers" API is the recommended way to submit compute work to Arvados.    It supersedes the "jobs" API, which is deprecated. 

 h2. Benefits over the "jobs" API 

 * Simpler and more robust execution with fewer points of failure 
 * Automatic retry for containers that fail to run to completion due to infrastructure errors 
 * Scales to thousands of simultaneous containers 
 * Able to support alternate schedulers/dispatchers in addition to slurm 
 * Improved logging, different streams logs/metrics stored in different files in the log collection 
 * Records more upfront detail about the compute node, and additional metrics (such as available disk space over the course of the container run) 
 * Better behavior when deciding whether to reuse past work -- pick the oldest container that matches the criteria 
 * Can reuse running containers between workflows, cancelling a workflow will not cancel containers that are shared with other workflows 
 * Supports setting time-to-live on intermediate output collections for automatic cleanup 
 * Supports "secret" inputs, suitable for passwords or access tokens, which are hidden from the API responses and logs, and forgotten after use 
 * Does not require "git" for dispatching work 

 h2. Differences from the "jobs" API 

 Containers cannot reuse jobs (but can reuse other containers) 

 Must run "crunch-dispatch-slurm":https://doc.arvados.org/install/crunch2-slurm/install-dispatch.html instead of @crunch-dispatch.rb@ 

 Containers cannot be run from pipeline templates (there is a new "Workflow" type) 

 APIs are incompatible, code which integrates with the "jobs" API must be updated to work with containers 

 Containers have network access disabled by default 

 The keep mount only exposes collections which are explicitly listed as inputs 

 h2. Migrating to "containers" API 

 Run your workflows using @arvados-cwl-runner --api=containers@ (only necessary if both the jobs and containers APIs are enabled, if the jobs API is disabled, it will use the containers API automatically) 

 Register your workflows so they can be run from workbench using @arvados-cwl-runner --api=containers --create-workflow@ 

 Read "Migrating running CWL on jobs API to containers API":https://doc.arvados.org/user/cwl/cwl-style.html#migrate 

 Use @arv:APIRequirement: {}@ in the @requirements@ section of your CWL file to enable network access for the container (see "Arvados CWL Extensions":https://doc.arvados.org/user/cwl/cwl-extensions.html) 

 For examples on how to manage container requests with the Python SDK, see https://doc.arvados.org/sdk/python/cookbook.html