CLI client

Arvados exposes lots of functionality via REST/HTTP. Most of it deserves to be exposed through a command line tool that is easy to use interactively and in shell scripts.

Background / current status

Currently there is an "arv" command line tool in source:sdk/cli (packaged as a Ruby gem and deb/rpm). It exposes REST APIs (as found in the discovery document) generically by mapping API names and parameters to command line arguments. It also knows how to invoke some Python programs (e.g., "arv put" passes through to "arv-put") which have their own calling syntax.

Problems with the current tool:
  • Awkward interface: arv job list --filters '[["script","in",["foo","bar"]]]'
  • Too many conventions for the same kind of task: arv collection get --uuid FOO vs. arv get FOO vs. arv-get FOO (download files), arv edit FOO vs. arv collection update --uuid FOO --collection '{...}'
  • Annoying to install (especially on older systems) because of Ruby requirements, and because both Ruby and Python pieces are needed to get the full set of commands
  • Slow to start up (partly because of Bundler?)
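
To make the "awkward interface" point concrete, here is a minimal sketch of how a friendlier filter shorthand could be translated into the JSON that --filters expects today. The ATTR=VALUE[,VALUE...] shorthand and the function names parseFilter and filtersJSON are assumptions made for illustration; nothing here is an existing "arv" feature.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseFilter converts a hypothetical shorthand such as "script=foo,bar"
// into a filter triple of the form ["script","in",["foo","bar"]].
func parseFilter(expr string) ([]interface{}, error) {
	attr, vals, ok := strings.Cut(expr, "=")
	if !ok || attr == "" || vals == "" {
		return nil, fmt.Errorf("invalid filter %q: want ATTR=VALUE[,VALUE...]", expr)
	}
	return []interface{}{attr, "in", strings.Split(vals, ",")}, nil
}

// filtersJSON encodes a list of shorthand expressions as the JSON array
// currently passed to --filters.
func filtersJSON(exprs []string) (string, error) {
	filters := make([][]interface{}, 0, len(exprs))
	for _, e := range exprs {
		f, err := parseFilter(e)
		if err != nil {
			return "", err
		}
		filters = append(filters, f)
	}
	buf, err := json.Marshal(filters)
	return string(buf), err
}

func main() {
	out, err := filtersJSON([]string{"script=foo,bar"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints [["script","in",["foo","bar"]]]
}
```

This sketch only covers the "in" operator. A real design would also need a way to express other operators and non-string values.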

Desired scope

The command line tool should be able to:
  • Display/list Arvados objects in human- and machine-readable form
  • Create/edit all Arvados objects in a generic way (current "arv edit" is pretty good)
  • Perform common actions in a natural way ("arv cancel JOB_UUID")
  • Watch job logs in real time ("arv tail -f JOB_OR_PIPELINE_UUID"?)
  • Transfer files/dirs between the local filesystem and Arvados collections (Keep)
  • Expose collection data (and other Arvados objects) as a local filesystem through a FUSE mount
  • Search/get objects across multiple sites, similar to workbench multi-site search
  • ...more features go here

Desired interface (patterns)

As with git, everything should be accessible through an "arv" front door. Some commands might be implemented by executing other binaries like "arv-get", but this is an implementation detail that can change over time, not part of the public API.
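
The front-door pattern could be sketched roughly as follows, assuming a git-like lookup: built-in subcommands are handled in-process, and anything else falls back to an "arv-NAME" executable found on PATH. The builtins table, the resolve and run helpers, and the "version" subcommand are hypothetical and exist only to illustrate the dispatch.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// builtins maps subcommand names to in-process handlers (hypothetical).
var builtins = map[string]func(args []string) error{
	"version": func([]string) error { fmt.Println("arv (sketch)"); return nil },
}

// resolve decides how a subcommand would run: a built-in handler first,
// then an external "arv-NAME" binary found on PATH.
func resolve(name string) (kind, path string, err error) {
	if _, ok := builtins[name]; ok {
		return "builtin", "", nil
	}
	path, err = exec.LookPath("arv-" + name)
	if err != nil {
		return "", "", fmt.Errorf("arv: unknown command %q", name)
	}
	return "external", path, nil
}

// run dispatches "arv COMMAND [ARGS...]" to a handler or external binary.
func run(args []string) error {
	if len(args) == 0 {
		return errors.New("usage: arv COMMAND [ARGS...]")
	}
	kind, path, err := resolve(args[0])
	if err != nil {
		return err
	}
	if kind == "builtin" {
		return builtins[args[0]](args[1:])
	}
	cmd := exec.Command(path, args[1:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	if err := run(os.Args[1:]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because callers only ever type "arv get", a subcommand can move from an external binary to a built-in handler (or back) without changing the public interface.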

Calling syntax should look familiar/normal to people who use related tools like Docker. In many cases one should be able to take a non-Arvados command, prefix it with the word "arv", and have it do the analogous thing with Arvados: e.g., "arv less PDH/file.txt" or "arv rm UUID".

Server components should also be accessible the same way. This could be implemented by having "arv" invoke a separately packaged "arvados-server" binary at runtime, or by shipping an alternate "arv" binary that also has the server components embedded. The server-included binary, in addition to being larger, will be less distributable because of the AGPL.

(More rules/patterns go here)

Desired interface (specifics)

(Specific commands go here)


The client should be written in Go. Considering portability, performance, stability, and development speed, Python and Ruby just don't stand a chance.

Updated by Tom Clegg over 6 years ago · 4 revisions