Hacking API server

Source tree layout

Everything is in /services/api.

Key pieces to know about before going much further:

/                                            Usual Rails project layout
/app/controllers/application_controller.rb   Controller superclass with most of the generic API features, like CRUD and authentication
/app/controllers/arvados/v1/                 API methods other than generic CRUD (users#current, jobs#queue, ...)
/app/models/arvados_model.rb                 Default Arvados model behavior: permissions, etag, uuid

Unlike a typical Rails project...

  • Most responses are JSON. Very few HTML views. We don't normally talk to browsers, except during authentication.
  • We assign UUID strings (see lib/assign_uuid.rb and app/models/arvados_model.rb)
  • The Links table emulates a graph database a la RDF. Much of the interesting information in Arvados is recorded as a Link between two other entities.
  • For the most part, relations among objects are not expressed with the usual ActiveRelation features like belongs_to and has_many.
  • Permissions: see below.
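
To make the "Links as graph edges" idea concrete, here is a minimal in-memory sketch. It is not the real Link model: the Struct, the sample uuids, and every field name other than tail_uuid and properties are assumptions made for illustration.

```ruby
# Toy stand-in for the Links table (not the real ActiveRecord model).
# Field names other than tail_uuid and properties are assumptions.
Link = Struct.new(:tail_uuid, :link_class, :name, :head_uuid, :properties)

LINKS = [
  Link.new('user-1', 'permission', 'can_read', 'group-1', {}),
  Link.new('group-1', 'permission', 'can_read', 'collection-1', {}),
  Link.new('user-1', 'tag', 'favorite', 'collection-1', { 'note' => 'demo' })
]

# Follow edges of one class outward from a starting node (one hop only).
def heads_of(links, tail_uuid, link_class)
  links.select { |l| l.tail_uuid == tail_uuid && l.link_class == link_class }
       .map(&:head_uuid)
end

puts heads_of(LINKS, 'user-1', 'permission').inspect  # => ["group-1"]
```

Each row is an edge from a tail to a head, so questions like "what does this user have a permission edge to?" become filters over one table rather than belongs_to/has_many associations.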

Running in development mode

First, take care of the dependencies documented at http://doc.arvados.org/install/install-api-server.html.

Save something like this as ~/bin/apiserver, make it executable, and make sure ~/bin is in your PATH:

#!/bin/sh
set -e
cd ~/arvados/services/api
if ! [ -e self-signed.key ]
then
  # Generate a self-signed SSL key
  openssl req -new -x509 -nodes -out ./self-signed.pem -keyout ./self-signed.key -days 3650 -subj /CN=localhost
fi
if [ -e /usr/local/rvm/bin/rvm ]
then
  rvmexec="rvm-exec 2.1.1" 
else
  rvmexec="" 
fi
export ARVADOS_WEBSOCKETS=true
export RAILS_ENV=development
$rvmexec bundle install
exec $rvmexec bundle exec passenger start -p3030 --ssl --ssl-certificate self-signed.pem --ssl-certificate-key self-signed.key
Notes:
  • Here we use passenger instead of webrick (which is what we'd get with "rails server") in order to serve websockets from the same process as https. (You also have the option of serving https and wss from separate processes -- see services/api/config/application.default.yml -- but the simplest way is to run both in the same process by setting ARVADOS_WEBSOCKETS=true.)
  • Webrick can make its own self-signed SSL certificate, but passenger expects you to provide a certificate & key yourself. The above script generates a key/certificate pair the first time it runs, and leaves it in services/api/self-signed.* to reuse next time.
  • bundle install ensures your installed gems satisfy the requirements in Gemfile and (if you have changed it locally) updates Gemfile.lock appropriately. This should do the right thing after you change Gemfile yourself, or change Gemfile/Gemfile.lock by updating/merging master.
  • If you're relying on rvm to provide a suitable version of Ruby, "rvm-exec" should do the right thing here. You can change the 2.1.1 argument to the version you want to use (hopefully >= 2.1.1). If your system Ruby is >= 2.1.1, rvm is unnecessary.
  • You can kill the server by running
passenger stop --pid-file ~/arvados/services/api/tmp/pids/passenger.3030.pid

Headaches to avoid

If you make a change that affects the discovery document, you need to clear a few caches before your client will see the change.
  • Restart API server or: touch tmp/restart.txt
  • Clear API server disk cache: rake tmp:cache:clear
  • Clear SDK discovery doc cache on client side: rm -r ~/.cache/arvados/
Do not store symbol keys (or values) in serialized attributes.
  • Rails supplies params as a HashWithIndifferentAccess so params['foo'] and params[:foo] are equivalent. This is usually convenient. However, here we often copy arrays and hashes from params to the database, and from there to API responses. JSON does not have HashWithIndifferentAccess (or symbols) and we want these serialized attributes to behave predictably everywhere.
  • API server's policy is that serialized attributes (like properties on a link) always have strings instead of symbols: these attributes look the same in the database, in the API server Rails application, in the JSON response sent to clients, and in the JSON objects received from clients.
  • There is no validation (yet!) to check for this.
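
The effect of symbols in serialized data can be seen with plain Ruby and the standard json library. This is only a sketch: deep_stringify is a hypothetical helper name, not something the API server provides.

```ruby
require 'json'

# Recursively convert symbol keys and symbol values to strings.
# deep_stringify is a hypothetical helper, not part of the API server.
def deep_stringify(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_s] = deep_stringify(v) }
  when Array
    value.map { |v| deep_stringify(v) }
  when Symbol
    value.to_s
  else
    value
  end
end

WITH_SYMBOLS = { foo: :bar, 'list' => [{ baz: 1 }] }

# A JSON round trip turns symbols into strings, so what comes back
# no longer equals what was stored.
ROUND_TRIPPED = JSON.parse(JSON.generate(WITH_SYMBOLS))

puts ROUND_TRIPPED == WITH_SYMBOLS                  # false
puts ROUND_TRIPPED == deep_stringify(WITH_SYMBOLS)  # true
```

Storing only string keys and values means the data compares equal before and after it passes through the database or a JSON response.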

When script/crunch-dispatch.rb invokes arv-run-pipeline-instance and crunch-job, it uses the version of arvados-cli specified in Gemfile.lock. Use bundle update arvados-cli to update Gemfile.lock to the latest version.

Features

Authentication

Involves
  • UserSessionsController (in app/controllers/, not .../arvados/v1): this is an exceptional case where we actually talk to a browser.

Permissions

Object-level permissions, aka ownership and sharing
  • Writing
    • Models have their own idea of create/update permissions. Controllers don't worry about this.
    • ArvadosModel updates/enforces modified_by_* and owner_uuid
  • Reading
    • Lookups are not (yet) permission-restricted in the default scope, i.e., when calling Model.where(foo: 'bar').
    • Controllers need to use Model.readable_by(current_user) when appropriate.
    • The other most important permission method is User#groups_i_can(verb). For example, user_object.groups_i_can(:write) returns an array of UUIDs of groups (including projects and other kinds of groups) where user_object has write permission. (For example, readable_by uses this to determine which values of owner_uuid and permission link tail_uuid could establish permission on a given database record for the user in question.)
  • ApplicationController uses an around_filter that verifies the supplied api_token and makes current_user available everywhere. If you need to override create/update permissions, use act_as_system_user do ... end.
  • Unusual cases: KeepDisks and Collections can be looked up by inactive users (otherwise they wouldn't be able to read and click through user agreements).
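
The read-permission idea described above (owner_uuid or a permission link's tail_uuid matching something the user can act on) can be sketched with plain Ruby objects. This is a toy under stated assumptions, not the real readable_by scope: the Structs, the head_uuid field, the sample uuids, and the readable_records helper are all made up for illustration.

```ruby
# Toy records; real models are ActiveRecord classes derived from ArvadosModel.
Record = Struct.new(:uuid, :owner_uuid)
PermLink = Struct.new(:tail_uuid, :head_uuid) # head_uuid is an assumed name

# Hypothetical stand-in for readable_by: keep records whose owner_uuid, or the
# tail_uuid of a permission link pointing at them, is in allowed_uuids.
def readable_records(records, perm_links, allowed_uuids)
  records.select do |r|
    allowed_uuids.include?(r.owner_uuid) ||
      perm_links.any? do |l|
        l.head_uuid == r.uuid && allowed_uuids.include?(l.tail_uuid)
      end
  end
end

RECORDS = [
  Record.new('rec-owned', 'user-a'),
  Record.new('rec-shared', 'user-b'),
  Record.new('rec-hidden', 'user-b')
]
PERM_LINKS = [PermLink.new('group-x', 'rec-shared')]

# Pretend this list came from groups_i_can(:read), plus the user's own uuid.
ALLOWED = ['user-a', 'group-x']

puts readable_records(RECORDS, PERM_LINKS, ALLOWED).map(&:uuid).inspect
# => ["rec-owned", "rec-shared"]
```

The real implementation works in SQL on the database side; the point here is only the shape of the check.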
Controller-level permissions
  • ApplicationController#require_auth_scope_all checks token scopes: currently, unless otherwise specified by a subclass controller, nothing is allowed unless scopes includes "all".
  • ApplicationController has an admin_required filter available (not used by default)

Error handling

  • "Look up object by uuid, and send 404 if not found" is enabled by default, except for index/create actions.

Routing

  • API routes are in the :arvados:v1 namespace.
  • Routes like /jobs/queue have to come before resources :jobs (otherwise /jobs/queue will match jobs#get(id=queue) first). (Better, we should rearrange these to use resources :jobs do ... like in Workbench.)
  • We use the standard Rails routes like /jobs/:id but then we move params[:id] to params[:uuid] in our before_filters.
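
Why route order matters can be shown with a toy first-match router. This is a sketch of the general idea, not how Rails implements routing; the route and id_to_uuid helpers are hypothetical names, and the action is called jobs#show here because that is the Rails name for it.

```ruby
# Toy first-match router: returns [action, captured params] or nil.
def route(routes, path)
  routes.each do |pattern, action|
    m = pattern.match(path)
    return [action, m.names.zip(m.captures).to_h] if m
  end
  nil
end

QUEUE = [%r{\A/jobs/queue\z}, 'jobs#queue']
SHOW  = [%r{\A/jobs/(?<id>[^/]+)\z}, 'jobs#show']

# Specific route first: /jobs/queue reaches the queue action.
p route([QUEUE, SHOW], '/jobs/queue')
# Generic route first: "queue" is captured as an id and sent to show.
p route([SHOW, QUEUE], '/jobs/queue')

# Sketch of moving params[:id] to params[:uuid], as a before_filter might.
def id_to_uuid(params)
  params = params.dup
  params['uuid'] = params.delete('id') if params.key?('id')
  params
end
```

With the generic pattern listed first, the queue route is never reached, which is the failure mode described above.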

Tests

  • Run tests with rvm-exec 2.1.1 bundle exec rake test
  • If prompted, migrate your test database by running RAILS_ENV=test rvm-exec 2.1.1 bundle exec rake db:migrate
  • As above, you can leave out rvm-exec 2.1.1 if your system Ruby version is suitable. But don't leave out bundle exec.
  • Run just the unit tests with [...] rake test:units (or test:functionals or test:integration).
  • Run just a single test class (file) by specifying the file, like [...] rake TEST=test/unit/owner_test.rb (save time in your "did that fix the failing test?" phase!)
  • Functional tests need to authenticate themselves with authorize_with :active (where :active refers to an ApiClientAuthorization fixture)
  • There is a deficit of tests, especially unit tests. This is a bug! It doesn't mean we don't want to test things.

Discovery document

  • Mostly, but not yet completely, generated by introspection (descendants of ArvadosModel are inspected at run time). But some controllers/actions are skipped, and some actions are renamed (e.g., Rails calls it "show" but everyone else calls it "get").
  • Handled by Arvados::V1::SchemaController#index (used to be in #discovery_document before #1750). See config/routes.rb
  • Must be available to anonymous clients.
  • Has no tests! We test it by trying all of our SDKs against it.
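
Generation by introspection can be illustrated with a small standalone sketch. Everything here is a stand-in: ToyModel plays the role of ArvadosModel, the output structure is simplified, and the action list, the SKIPPED list, and the crude pluralization are assumptions. Only the show-to-get renaming comes from the notes above.

```ruby
# Toy base class that remembers its subclasses; stands in for ArvadosModel.
class ToyModel
  @descendants = []
  class << self
    attr_reader :descendants
  end

  def self.inherited(subclass)
    super
    ToyModel.descendants << subclass
  end
end

class Job < ToyModel; end
class Collection < ToyModel; end

# Rails action names mapped to the names clients see.
RENAMED_ACTIONS = { 'show' => 'get' }
SKIPPED = [] # hypothetical list of models to leave out

def toy_discovery_document(actions = %w[index show create update])
  resources = {}
  ToyModel.descendants.each do |klass|
    next if SKIPPED.include?(klass)
    name = klass.name.downcase + 's' # crude pluralization for the demo
    resources[name] = actions.map { |a| RENAMED_ACTIONS.fetch(a, a) }
  end
  { 'resources' => resources }
end

p toy_discovery_document
```

Because the document is built by walking the loaded classes at run time, adding a model changes the output without editing any schema file, which is also why stale caches can hide a change.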

Development patterns

Add a model

In shell:
  • rails g model FizzBuzz
In app/models/fizz_buzz.rb:
  • Change base class from ActiveRecord::Base to ArvadosModel.
  • Add some more standard behavior.
include HasUuid
include KindAndEtag
include CommonApiTemplate
In db/migrate/{timestamp}_create_fizz_buzzes.rb:
  • Add the generic attribute columns.
  • Run t.timestamps and add (at least!) a :uuid index.
class CreateFizzBuzzes < ActiveRecord::Migration
  def change
    create_table :fizz_buzzes do |t|
      t.string :uuid, :null => false
      t.string :owner_uuid, :null => false
      t.string :modified_by_client_uuid
      t.string :modified_by_user_uuid
      t.datetime :modified_at
      t.text :properties

      t.timestamps
    end
    # The index must be on the table created above.
    add_index :fizz_buzzes, :uuid, :unique => true
  end
end
Apply the migration:
  • rake db:migrate
  • RAILS_ENV=test rake db:migrate (to migrate your test database too)
  • Inspect the resulting db/schema.rb and include it in your commit.
  • Don't forget to git add the new migration and model files.

Add an attribute to a model

  • Generate migration as usual
    rails g migration AddBazQuxToFooBar baz_qux:column_type_goes_here
    
  • Consider adding null constraints and a default value to the add_column statement in the migration in db/migrate/timestamp_add_baz_qux_to_foo_bar.rb:
    , null: false, default: false
  • Consider adding an index
  • You probably want to add it to the API response template(s) so clients can see it: in app/models/model_name.rb, api_accessible :user ...
  • Sometimes it's only visible to privileged users; see ping_secret in app/models/keep_disk.rb
  • If it's a serialized attribute, add serialize :the_attribute_name, Hash to the model. Always specify Hash or Array!
  • Run rake db:migrate, inspect your db/schema.rb, and include the new schema.rb in the same commit as your db/migrate/*.rb migration script.
  • Run rake tmp:cache:clear and touch tmp/restart.txt in your dev apiserver, to force it to generate a new REST discovery document.

Add a controller

  • rails g controller Arvados::V1::FizzBuzzes
  • Avoid adding top-level controllers like app/controllers/fizz_buzzes_controller.rb.
  • Avoid adding top-level routes. Everything should be inside namespace :arvados and namespace :v1, except oddballs like login/logout actions.

Add a controller action

Add a route in config/routes.rb.
  • Choose an appropriate HTTP method: GET has no side effects. POST creates something. PUT replaces/updates something.
  • Use the block form:
    resources :fizz_buzzes do
      # If the action operates on an object, i.e., a uuid is required,
      # this generates a route /arvados/v1/fizz_buzzes/{uuid}/blurfl
      post 'blurfl', on: :member
      # If not, this generates a route /arvados/v1/fizz_buzzes/flurbl
      get 'flurbl', on: :collection
    end
    

In app/controllers/arvados/v1/fizz_buzzes_controller.rb:

  • Add a method to the controller class.
  • Skip the "find_object" before_filters if it's a collection action.
  • Specify required/optional parameters using a class method _action_requires_parameters.
    skip_before_filter :find_object_by_uuid, only: [:flurbl]
    skip_before_filter :render_404_if_no_object, only: [:flurbl]
    
    def blurfl
      @object.do_whatever_blurfl_does!
      show
    end
    
    def self._flurbl_requires_parameters
      {
        qux: { type: 'integer', required: true, description: 'First flurbl qux must match this qux.' }
      }
    end
    def flurbl
      @object = model_class.where('qux = ?', params[:qux]).first
      show
    end
    

Add a configuration parameter

  • Add it to config/application.default.yml with a sensible default value. (Don't fall back to default values at time of use, or define defaults in other places!)
  • If there is no sensible default value, like secret_token, specify ~ (i.e., nil) in application.default.yml and put a default value in the test section of config/application.yml.example that will make tests pass.
  • If there is a sensible default value for development/test but not for production, like the return address for notification email messages, specify the test/dev default in the common section of application.default.yml but specify ~ (nil) in the production section. This prevents someone from installing or updating a production server with defaults that don't make sense in production!
  • Use Rails.configuration.config_setting_name to retrieve the configured value. There is no need to check whether it is nil or missing: in those cases, "rake config:check" would have failed and the application would have refused to start.
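
The interplay of common defaults, per-environment overrides, and nil placeholders can be sketched in plain Ruby. This is an illustration under assumptions, not the actual loading code: a single made-up YAML document stands in for the real config files, notification_from_address is a hypothetical key, and settings_for and missing_settings are hypothetical helpers.

```ruby
require 'yaml'

# Made-up YAML in the spirit of application.default.yml.
# notification_from_address is a hypothetical key name.
CONFIG = YAML.load(<<YML)
common:
  secret_token: ~
  notification_from_address: dev@example.com
production:
  notification_from_address: ~
test:
  secret_token: not-a-real-secret
YML

# Merge the common section with the section for one environment.
def settings_for(config, env)
  config.fetch('common', {}).merge(config.fetch(env, {}))
end

# Names of settings still nil after merging, as a config check might report.
def missing_settings(settings)
  settings.select { |_, v| v.nil? }.keys
end

%w[test production].each do |env|
  missing = missing_settings(settings_for(CONFIG, env))
  puts "#{env}: missing #{missing.inspect}"
end
# test: missing []
# production: missing ["secret_token", "notification_from_address"]
```

Leaving production values as nil turns a forgotten setting into a startup failure instead of a silently wrong default.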

Add a test fixture

Generate last part of uuid from command line:

ruby -e 'puts rand(2**512).to_s(36)[0..14]'
j0wqrlny07k1u12

Generate uuid from rails console:
Group.generate_uuid
=> "xyzzy-j7d0g-8nw4r6gnnkixw1i" 

Database notes

UUIDs are made up of three parts separated by `-` characters: a system prefix (defined in the configuration as uuid_prefix), a class prefix (generated by digesting the Ruby class: https://github.com/curoverse/arvados/blob/master/services/api/lib/has_uuid.rb#L16), and a random string. From the rails console, you can get the current value of the class prefix (second part of the uuid) via `.uuid_prefix`. For example:

Group.uuid_prefix
=> "j7d0g"
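
The three-part structure can be sketched outside Rails as follows. Treat this as a guess at the shape of the scheme rather than the code in has_uuid.rb: the use of MD5, the slicing, and the helper names split_uuid, toy_class_prefix, and toy_uuid are all assumptions, so the prefixes it prints are not guaranteed to match the real ones.

```ruby
require 'digest/md5'

# Split a uuid into the three parts described above.
def split_uuid(uuid)
  system_prefix, class_prefix, random_part = uuid.split('-')
  { system_prefix: system_prefix,
    class_prefix: class_prefix,
    random_part: random_part }
end

# Assumed scheme: digest the class name and keep five base-36 characters.
# MD5 and the exact slicing are assumptions, not confirmed by the text.
def toy_class_prefix(class_name)
  Digest::MD5.hexdigest(class_name).to_i(16).to_s(36).rjust(5, '0')[-5..-1]
end

# Build a uuid-shaped string: system prefix, class prefix, 15 random chars.
def toy_uuid(system_prefix, class_name)
  random_part = rand(2**512).to_s(36).rjust(15, '0')[0..14]
  [system_prefix, toy_class_prefix(class_name), random_part].join('-')
end

p split_uuid('xyzzy-j7d0g-8nw4r6gnnkixw1i')
puts toy_uuid('xyzzy', 'Group')
```

Because the class prefix depends only on the class name, it is stable across runs, so the middle part of a uuid is enough to tell what kind of object it refers to.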