Hacking Workbench

Source tree layout

Everything is in /apps/workbench.

Key pieces to know about before going much further:

  • /: Usual Rails project layout
  • /app/controllers/application_controller.rb: Controller superclass with authentication setup, error handling, and generic CRUD actions
  • /app/controllers/*.rb: Actions other than generic CRUD (users#activity, jobs#generate_provenance, ...)
  • /app/models/arvados_base.rb: Default Arvados model behavior and ActiveRecord-like accessors and introspection features
  • /app/models/arvados_resource_list.rb: ActiveRelation-like class (what you get from Model.where() etc.)

Background resources

Workbench is a Rails 4 application. See also: JavaScript readings, Angular readings.

Unlike a typical Rails project...

  • ActiveRecord in Workbench doesn't talk to the database directly. Instead, it queries the Arvados API as a REST client.
  • The Arvados query API is somewhat limited and doesn't accept SQL statements, so Workbench has to work harder to get what it needs.
  • Workbench itself only has the privileges of the Workbench user: when making Arvados API calls, it uses the API token provided by the user.
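
To illustrate the second point, here is a minimal sketch, in plain Ruby, of how a where()-style condition could be turned into request parameters instead of SQL. The helper build_list_params and the sample values are hypothetical and not part of Workbench, and the [attribute, operator, operand] filter shape is an assumption about the Arvados list API.

```ruby
require 'json'

# Hypothetical helper (not Workbench code): turn a hash of conditions into
# list-request parameters for a REST API.
# Assumption: the API accepts filters as [attribute, operator, operand] triples.
def build_list_params(conditions, limit: 100)
  filters = conditions.map do |attr, value|
    # An array of candidates becomes an "in" filter; anything else is equality.
    operator = value.is_a?(Array) ? 'in' : '='
    [attr.to_s, operator, value]
  end
  { filters: JSON.generate(filters), limit: limit }
end

params = build_list_params({ owner_uuid: 'example-owner-uuid', state: %w[Queued Running] })
puts params[:filters]
```

Anything that cannot be expressed as such a filter has to be done on the Workbench side after fetching, which is where the extra work comes from.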

Unlike what you might expect...

  • Workbench doesn't use the Ruby SDK. It uses a sort of baked-in Rails SDK.
    • TODO: move it out of Workbench into a gem.
    • TODO: use the Ruby SDK under the hood.

Running in development mode

SSL certificates

You can get started quickly with SSL by generating a self-signed certificate:

openssl req -new -x509 -nodes -out ~/self-signed.pem -keyout ~/self-signed.key -days 3650 -subj '/CN=arvados.example.com'

Alternatively, download a set from the bottom of the API server page.

Download and configure

Follow these instructions to download the source and configure your workbench instance.

Start the server

Save something like the following at ~/bin/workbench, make it executable (see note 1), and make sure ~/bin is in your path (see note 2):

#!/bin/bash
set -e
cd ~/arvados/apps/workbench
export RAILS_ENV=development
bundle install --path=vendor/bundle
exec bundle exec passenger start -p 3031 --ssl --ssl-certificate ~/self-signed.pem --ssl-certificate-key ~/self-signed.key

The first time you run the above, it will take a while to install all the Ruby gems. In particular, installing nokogiri takes a while.

Once you see:

=============== Phusion Passenger Standalone web server started ===============

You can visit your server over HTTPS on port 3031 (the port passed to passenger start above).


You can kill your server with Ctrl-C, but if you get disconnected from the terminal, it will continue running. In that case, find the nginx master process by running:

ps x |grep nginx |grep master

And then

kill ####

Replace #### with the process ID shown in the left column of the ps output.

1 chmod +x ~/bin/workbench

2 In Debian systems, the default .profile adds ~/bin to your path, but only if it exists when you log in. If you just created ~/bin, doing exec bash -login or source .profile should make ~/bin appear in your path.

Running tests

The test suite brings up an API server in test mode, and runs browser tests with Firefox.

Make sure the API server has its dependencies in place and its database schema up to date.

 set -e
 cd ../../services/api
 RAILS_ENV=test bundle install --path=vendor/bundle
 RAILS_ENV=test bundle exec rake db:migrate

Install headless testing tools.

sudo apt-get install xvfb iceweasel

(Install firefox instead of iceweasel if you're not using Debian.)

Install phantomjs. (See http://phantomjs.org/download.html for latest version.)

wget -P /tmp https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.8-linux-x86_64.tar.bz2
sudo tar -C /usr/local -xjf /tmp/phantomjs-1.9.8-linux-x86_64.tar.bz2
sudo ln -s ../phantomjs-1.9.8-linux-x86_64/bin/phantomjs /usr/local/bin/

Run the test suite.

RAILS_ENV=test bundle exec rake test

When tests fail...

When an integration test fails (or skips), a screenshot is automatically saved in arvados/apps/workbench/tmp/workbench-fail-1.png, etc.

By default, rake test just shows F when a test fails (and E when a test crashes) and doesn't tell you which tests had problems until the entire test suite is done. During development it makes more sense to use TESTOPTS=-v. This reports after each test the test class and name, outcome, and elapsed time:
    $ RAILS_ENV=test bundle exec rake test TESTOPTS=-v
    ApplicationControllerTest#test_links_for_object = 0.10 s = .
    Saved ./tmp/workbench-fail-2.png
    CollectionsTest#test_combine_selected_collection_files_into_new_collection = 10.89 s = F
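
If you capture the verbose output in a file, a short script can list just the tests that had problems. The following is a minimal sketch that assumes the "Class#test = seconds s = outcome" line format shown above; the failed_tests helper is hypothetical and not part of Workbench.

```ruby
# Matches lines such as "CollectionsTest#test_name = 10.89 s = F".
# Assumption: the outcome is a single character, with F for failure and E for error.
RESULT_LINE = /\A\s*(?<klass>[\w:]+)#(?<name>\S+) = (?<secs>[\d.]+) s = (?<outcome>\S)\s*\z/

# Hypothetical helper: return "Class#test" for every failed or crashed test.
def failed_tests(output)
  output.each_line
        .map { |line| RESULT_LINE.match(line) }
        .compact
        .select { |m| %w[F E].include?(m[:outcome]) }
        .map { |m| "#{m[:klass]}##{m[:name]}" }
end

sample = <<~LOG
  ApplicationControllerTest#test_links_for_object = 0.10 s = .
  Saved ./tmp/workbench-fail-2.png
  CollectionsTest#test_combine_selected_collection_files_into_new_collection = 10.89 s = F
LOG

puts failed_tests(sample)
```

Lines that do not match the pattern, such as the "Saved ..." screenshot notice, are ignored.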

Iterating on a single test

Sometimes you want to poke at the code and re-run a single test to confirm that you made it pass. You don't want to reboot everything just to make Minitest notice that you edited your test.

Since #3781 there is a singletest function for this:

arvados/apps/workbench$ RAILS_ENV=test bundle exec irb -Ilib:test
>>> load 'test/test_helper.rb'
>>> singletest 'integration/pipeline_instances_test.rb', 'Create and run a pipeline'
PipelineInstancesTest#test_Create_and_run_a_pipeline = 38.54 s = .

>>> singletest 'integration/pipeline_instances_test.rb', 'Create and run a pipeline'
PipelineInstancesTest#test_Create_and_run_a_pipeline = 29.58 s = .
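
The second argument is the test description, while the output shows the generated method name. The mapping between the two can be sketched as follows, assuming the tests are declared with the standard ActiveSupport test "..." macro; the helper name test_method_name is made up for illustration.

```ruby
# Mirrors the naming rule of ActiveSupport's `test "..."` macro:
# prefix with "test_" and replace each run of whitespace with "_".
def test_method_name(description)
  "test_#{description.gsub(/\s+/, '_')}"
end

puts test_method_name('Create and run a pipeline')
```

This is handy when you have a method name from a failure report and need the description string to pass to singletest.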

Loading state from API into models

If your model makes an API call that returns the new state of an object, load the new attributes into the local model with private_reload:

  api_response = $arvados_api_client.api(...)
  private_reload api_response
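
The pattern can be sketched in isolation with a stand-in class. SketchModel, its cancel action, and the lambda that plays the API client are all hypothetical; the real ArvadosBase implementation may differ, and this only assumes that private_reload replaces the local attributes with those in the response.

```ruby
# Hypothetical stand-in for a Workbench model; not the real ArvadosBase.
class SketchModel
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes.dup
  end

  # An action that calls the API, then adopts the state the API returned.
  def cancel(api)
    api_response = api.call(:cancel, @attributes['uuid'])
    private_reload api_response
    self
  end

  private

  # Replace the local attributes with the ones in the API response.
  def private_reload(response)
    @attributes = response.transform_keys(&:to_s)
  end
end

# Fake API client that reports the new state of the object.
fake_api = ->(_action, uuid) { { uuid: uuid, state: 'Cancelled' } }
job = SketchModel.new('uuid' => 'example-uuid', 'state' => 'Running').cancel(fake_api)
puts job.attributes['state']
```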



ApplicationController uses an around_filter to make sure the user is logged in. If not, it redirects to Arvados to complete the login procedure. If so, it stores the user's API token in Thread.current[:arvados_api_token].
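
The shape of such an around-style filter can be sketched in plain Ruby. The with_api_token helper is hypothetical, and restoring the previous value afterwards is one reasonable design rather than something confirmed from the Workbench source; the real filter also deals with redirects and session state.

```ruby
# Hypothetical sketch of an around-style filter (not Workbench code).
def with_api_token(token)
  # No token means the user is not logged in.
  return :redirect_to_login if token.nil?

  previous = Thread.current[:arvados_api_token]
  Thread.current[:arvados_api_token] = token
  begin
    yield
  ensure
    # Restore the old value so the token does not leak into the next
    # request served by the same thread.
    Thread.current[:arvados_api_token] = previous
  end
end

seen = with_api_token('example-token') { Thread.current[:arvados_api_token] }
puts seen
puts Thread.current[:arvados_api_token].inspect
```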

The current_user helper returns User.current if the user is logged in, otherwise nil. (Generally, only special pages like "welcome" and "error" get displayed to users who aren't logged in.)

Default filter behavior

before_filter :find_object_by_uuid

  • This is enabled by default for all actions except :index and :create.
  • It renames the :id param to :uuid. (The Rails default routing rules use :id to accept params in path components, but params[:uuid] makes more sense everywhere else in our code.)
  • If you define a collection method (where there's no point looking up an object with the :id supplied in the request), skip this.
  skip_before_filter :find_object_by_uuid, only: [:action_that_takes_no_uuid_param]
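
The parameter renaming can be sketched in plain Ruby as follows. The normalize_params helper is hypothetical and works on an ordinary hash; it only illustrates the behavior described above, not the actual filter code.

```ruby
# Hypothetical sketch: Rails routing puts the path component in params[:id];
# the filter exposes it as params[:uuid], except for collection actions.
COLLECTION_ACTIONS = %i[index create].freeze

def normalize_params(action, params)
  return params if COLLECTION_ACTIONS.include?(action)
  return params unless params.key?(:id)

  renamed = params.dup
  renamed[:uuid] = renamed.delete(:id)
  renamed
end

p normalize_params(:show, { id: 'example-uuid' })
p normalize_params(:index, { limit: 10 })
```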

Error handling

ApplicationController has a render_error method that shows a standard error page. (It's not very good, but it's better than a default Rails stack trace.)

In a controller, you get there like this:

  @errors = ['I could not achieve what you wanted.']
  render_error status: 500

You can also do this anywhere:

  raise 'My spoon is too big.'

The render_error method sends JSON or HTML to the client according to the Accept header in the request (it sends JSON if JavaScript was requested), so reasonable things happen whether or not the request is AJAX.
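
The format selection can be sketched in plain Ruby. The error_body helper and the HTML it produces are hypothetical; this only illustrates choosing between JSON and HTML from the Accept header and is not the render_error implementation.

```ruby
require 'cgi'
require 'json'

# Hypothetical sketch: pick an error format based on the Accept header.
def error_body(errors, accept_header)
  if accept_header.to_s.match?(%r{application/json|javascript})
    # AJAX callers get something they can parse.
    JSON.generate(errors: errors)
  else
    # Browsers get a simple HTML list, with messages escaped.
    items = errors.map { |e| "<li>#{CGI.escapeHTML(e)}</li>" }.join
    "<ul>#{items}</ul>"
  end
end

errors = ['I could not achieve what you wanted.']
puts error_body(errors, 'application/json')
puts error_body(errors, 'text/html')
```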

Development patterns

Add a model

Currently, when the API provides a new model, we need to generate a corresponding model in Workbench: it's not smart enough to pick up the list of models from the API server's discovery document.

(Need to fill in details here)
  1. rails generate model ....
  2. Delete migration
  3. Change base class to ArvadosBase
  4. rails generate controller ...

Model attributes, on the other hand, are populated automatically.

Add a configuration knob

Same situation as API server. See Hacking API Server.

Add an API method

Workbench is not yet smart enough to look in the discovery document for supported API methods. You need to add a method to the appropriate model class before you can use it in the Workbench app.

Writing tests

In integration tests, this makes your tests flaky because the result depends on whether the page has finished loading:

  assert page.has_selector?('a', text: 'foo')  # Danger!

Instead, do this:

  assert_selector('a', text: 'foo')

This lets Capybara wait for the selector to appear.
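
The difference between checking once and retrying until a deadline can be sketched in plain Ruby. The wait_until helper and the simulated link are hypothetical; this shows the general idea of a waiting assertion, not Capybara's implementation.

```ruby
# Hypothetical polling helper: retry the block until it returns true
# or the timeout expires.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Simulate a link that appears 0.2 s after the "page" starts loading.
ready_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.2
link_present = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) >= ready_at }

immediate = link_present.call              # checked once, right away
waited = wait_until { link_present.call }  # retried until it appears or times out
puts "immediate=#{immediate} waited=#{waited}"
```

The one-shot check fails only because it ran too early, which is the kind of timing-dependent result that makes a test flaky.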

AJAX using Rails UJS (remote:true with JavaScript response)

This pattern is the best way to make a button/link that invokes an asynchronous action on the Workbench server side, i.e., before/without navigating away from the current page.

  1. Add remote: true to a link or button. This makes Rails put a data-remote="true" attribute in the HTML element. Say, in app/views/fizz_buzzes/index.html.erb:
    <%= link_to "Blurfl", blurfl_fizz_buzz_url(id: @object.uuid), {class: 'btn btn-primary', remote: true} %>
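  2. Ensure the targeted action responds appropriately to both "js" and "html" requests. At minimum:
    class FizzBuzzesController < ApplicationController
      def blurfl
        @howmany = 1
        respond_to do |format|
          format.js; format.html  # render blurfl.js.erb or blurfl.html.erb
        end
      end
    end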
  3. The html view is used if this is a normal page load (presumably this means the client has turned off JS).
    • app/views/fizz_buzzes/blurfl.html.erb
      <p>I am <%= @howmany %></p>
  4. The js view is used if this is an AJAX request. It renders as JavaScript code which will be executed in the browser. Say, in app/views/fizz_buzzes/blurfl.js.erb:
    window.alert('I am <%= @howmany %>');
  5. The browser opens an alert box:
    I am 1
  6. A common task is to render a partial and use it to update part of the page. Say the partial is in app/views/fizz_buzzes/_latest_news.html.erb; the js view can then contain:
    var new_content = "<%= escape_javascript(render partial: 'latest_news') %>";
    if ($('div#latest-news').html() != new_content)
        $('div#latest-news').html(new_content);

TODO: error handling

AJAX invoked from custom JavaScript (JSON response)

(and error handling)

Add JavaScript triggers and fancy behavior

Some guidelines for implementing stuff nicely in JavaScript:
  • Don't rely on the DOM being loaded before your script is loaded.
    • If you need to inspect/alter the DOM as soon as it's loaded, make a setup function that fires on "document ready" and "ajax:complete".
    • jQuery's delegated event pattern can help keep your code clean. See http://api.jquery.com/on/
      // worse:
      $('table.fizzbuzzer tr').
          on('mouseover', function(e, xhr) {
              console.log("This only works if the table exists when this setup script is executed.");
          });
      // better:
      $(document).
          on('mouseover', 'table.fizzbuzzer tr', function(e, xhr) {
              console.log("This works even if the table appears (or has the fizzbuzzer class added) later.");
          });
  • If your code really only makes sense for a particular view, rather than embedding <script> tags in the middle of the page,
    • use this:
      <% content_for :js do %>
      console.log("hurray, this goes in HEAD");
      <% end %>
    • or, if your code should run after [most of] the DOM is loaded:
      <% content_for :footer_js do %>
      console.log("hurray, this runs at the bottom of the BODY element in the default layout.");
      <% end %>
  • Don't just write JavaScript on the fizz_buzzes/blurfl page and rely on the fact that the only table element on the page is the one you want to attach your special behavior to. Instead, add a class to the table, and use a jQuery selector to attach special behavior to it.
    • In app/views/fizz_buzzes/blurfl.html.erb
      <table class="fizzbuzzer">
    • In app/assets/javascripts/fizz_buzzes.js
      $(document).on('mouseover', 'table.fizzbuzzer tr', function() {
          // special behavior goes here
      });
    • Advantage: You can reuse the special behavior in other tables/pages/classes
    • Advantage: The JavaScript can get compiled, minified, cached in the browser, etc., instead of being rendered with every page view
    • Advantage: The JavaScript code is available regardless of how the content got into the DOM (regular page view, partial update with AJAX)
  • If clicking a link invokes JavaScript that ultimately changes the content of the current page using window.location.href=, add the force-cache-reload CSS class to the link. Then, when a user returns to the original page with the browser's back button, the page is forced to reload from the server and reflects the updated content. (Ref: https://arvados.org/issues/3634)

Invoking chooser

Example from app/views/projects/_show_contents.html.erb:

    <%= link_to(
            title: 'Add data to project:',
            multiple: true,
            action_name: 'Add',
            action_href: actions_path(id: @object.uuid),
            action_method: 'post',
            action_data: {selection_param: 'selection[]', copy_selections_into_project: @object.uuid, success: 'page-refresh'}.to_json),
          { class: "btn btn-primary btn-sm", remote: true, method: 'get', data: {'event-after-select' => 'page-refresh'} }) do %>
      <i class="fa fa-fw fa-plus"></i> Add data...
    <% end %>



Infinite scroll

When showing a list that might be too long to render up front in its entirety, use the infinite-scroll feature.

Links/buttons that flip to page 1, 2, 3, etc. (e.g., render partial: "paging") are deprecated.

The comments at the top of source:apps/workbench/app/assets/javascripts/infinite_scroll.js will tell you how to do it.

Filtering lists

When a list is displayed and the user might want to filter it by selecting a category or typing a search string, use class="filterable". It's easy!

The comments at the top of source:apps/workbench/app/assets/javascripts/filterable.js tell you how to do it.

Tabs/panes on index & show pages


User notifications


Customizing breadcrumbs


Making a page accessible before login


Making a page accessible to non-active users


Developing and Testing the Job Log

To assist with developing and testing the live job log that updates itself via websockets, there is a rake task that will "replay" a log from a file as if it had been generated by a real job. Note that this is done within the API Server context, so you must first switch the current directory appropriately (cd services/api). The task takes up to three arguments:

  • log path and filename
    • The relative path to the log file you want to "replay".
  • time multiplier (optional)
    • The speed factor at which the log replay is simulated. The default is 1.0, or normal speed. Higher numbers proportionately increase the speed of the simulation. For example, "4" means that log entries that originally appeared over the course of four minutes will appear over the course of one minute. Numbers between 0 and 1 slow down the simulation.
  • simulated job uuid (optional)
    • By providing a job UUID to simulate, the rake task will replace the job UUID in the log file with this job UUID. This means that you can be observing the effects on the Log tab of a particular job but use the log file output from another job.

Note that as with all rake tasks, if there are confusing characters in the list of arguments, including spaces separating the arguments, you will need to enclose the rake argument in quotation marks.


  rake "replay_job_log[path/to/your.log, 2.0, qr1hi-8i9sb-nf3qk0xzwwz3lre]"
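
The arithmetic behind the time multiplier and the UUID substitution can be sketched in plain Ruby. Both helpers and the sample log line are hypothetical; the sketch assumes each log entry has a known time offset in seconds and says nothing about how the rake task actually parses the file.

```ruby
# Hypothetical sketch, not the rake task's code.
# offsets: seconds at which each log entry originally appeared.
# Returns how long to wait before emitting each subsequent entry.
def replay_delays(offsets, multiplier = 1.0)
  raise ArgumentError, 'multiplier must be positive' unless multiplier.positive?

  offsets.each_cons(2).map { |earlier, later| (later - earlier) / multiplier.to_f }
end

# Swap the job UUID recorded in a log line for the one being observed.
def retarget(line, original_uuid, simulated_uuid)
  line.gsub(original_uuid, simulated_uuid)
end

p replay_delays([0, 240], 4)    # four minutes of log replayed at speed 4
p replay_delays([0, 240], 0.5)  # the same log replayed at half speed
puts retarget('job-aaa started', 'job-aaa', 'job-bbb')
```

With a multiplier of 4, the four-minute gap shrinks to 60 seconds, matching the example given for the time multiplier argument.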

A typical testing iteration using this task would work as follows:

  1. Delete the entries from the LOGS table for the Job UUID you will be observing.
  2. Refresh the browser page showing the Job's Log to clear the graph and graph contents.
  3. Run the rake task
  4. Enjoy and/or write beautiful code improvements
  5. Repeat