Arvados Update: Crunch, Docker and Go
New Arvados features for provenance, resource management, and writing pipelines in Go.
Dear Arvados users:
As we approach a 1.0 release for Arvados we've been improving key features for data provenance.
Building on our success integrating Docker into Arvados, we've implemented new provenance features around Docker. A new command-line tool uploads Docker images to Keep, and Arvados records the full system image used to run a job. This makes it possible to reproduce the entire operating environment used to produce a particular result -- not just your code for that job, but all of the system libraries and tools that were installed along with it -- which improves reproducibility for complicated pipelines.
We've also delivered on a longstanding goal: informaticians can now specify minimum resource requirements for the jobs they run. When running a computationally intensive job, or one that requires a lot of scratch disk space, it's very frustrating to launch a pipeline only to watch jobs run slowly or fail because of CPU or disk limitations on compute nodes. It's now possible to specify runtime constraints for a job, including minimum amounts of disk space, RAM, or CPU cores per compute node, and Arvados will ensure that your jobs run only on nodes powerful enough to accommodate them.
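To make the idea concrete, here is a minimal sketch in Go of what per-node minimums and the matching check might look like. It is an illustration only, not Arvados's scheduler code: the `RuntimeConstraints` and `Node` types, the `Satisfies` and `EligibleNodes` helpers, and the JSON key names are all assumptions made for this example, so check the Arvados API documentation for the exact field names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RuntimeConstraints holds per-node minimums for a job.
// The JSON key names are assumptions for illustration only.
type RuntimeConstraints struct {
	MinRAMMBPerNode     int `json:"min_ram_mb_per_node"`
	MinScratchMBPerNode int `json:"min_scratch_mb_per_node"`
	MinCoresPerNode     int `json:"min_cores_per_node"`
}

// Node is a hypothetical description of one compute node's capacity.
type Node struct {
	Name      string
	RAMMB     int
	ScratchMB int
	Cores     int
}

// Satisfies reports whether a node meets every minimum in c.
func Satisfies(n Node, c RuntimeConstraints) bool {
	return n.RAMMB >= c.MinRAMMBPerNode &&
		n.ScratchMB >= c.MinScratchMBPerNode &&
		n.Cores >= c.MinCoresPerNode
}

// EligibleNodes returns the nodes that meet the constraints,
// preserving their original order.
func EligibleNodes(nodes []Node, c RuntimeConstraints) []Node {
	var out []Node
	for _, n := range nodes {
		if Satisfies(n, c) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Example minimums: 16 GiB RAM, 100 GiB scratch, 8 cores per node.
	c := RuntimeConstraints{
		MinRAMMBPerNode:     16384,
		MinScratchMBPerNode: 102400,
		MinCoresPerNode:     8,
	}

	// Print the constraints as they might appear in a job description.
	body, err := json.MarshalIndent(c, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// Made-up node inventory for the example.
	nodes := []Node{
		{Name: "small", RAMMB: 4096, ScratchMB: 51200, Cores: 2},
		{Name: "big", RAMMB: 65536, ScratchMB: 512000, Cores: 16},
	}
	for _, n := range EligibleNodes(nodes, c) {
		fmt.Println("eligible:", n.Name)
	}
}
```

In this sketch a node has to meet every minimum at once, so a node with plenty of RAM but too little scratch space is still excluded.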
Here at Curoverse we're very enthusiastic about the Go programming language. We've rewritten the Keep file server in Go and added a Keep proxy server. Now we've added a Go SDK to the Arvados toolkit, so you can write Arvados pipelines in Go as well! We hope you'll give it a try and see how much fun writing Go is!
In addition to those features, we have lots of other goodies for you, including:
- Significantly improved Workbench display performance
- Better interface for picking collections and pipeline templates
- Real-time log display via CLI