Story #5781

[API] [DRAFT] Provide API methods for manipulating and combining collections

Added by Radhika Chippada over 4 years ago. Updated over 2 years ago.

Status:
In Progress
Priority:
Normal
Assigned To:
-
Category:
API
Target version:
Start date:
04/21/2015
Due date:
% Done:

0%

Estimated time:
Story points:
2.0

Description

Currently Workbench provides “create a new collection by combining selected collections” and “create a new collection by combining selected collection files” functionality. Workbench gets the required collection manifest_text(s), generates the combined manifest_text, saves a new collection with the combined manifest_text, fetches the newly saved collection from the server, and finally displays it to the user. However, this implementation is not scalable. When very large collections are combined, or several files from a very large collection are combined, the Workbench combine operation fails with timeout errors while exchanging these large collection manifest texts with the API server. There are a few bugs reported about this: #4943 and #5614.

A potential solution:

  • Provide a “create by combining” API method that performs the steps currently performed by Workbench. This method will:
    • Take the selections (a list of files selected from a collection, or a list of selected collections)
    • Generate the manifest_text by combining these selections (using the Ruby SDK)
    • Save a new collection with the combined manifest_text and the owner_uuid provided
    • Return the newly created collection to the client
  • This solution requires exchanging only one manifest text between the API server and Workbench (that of the newly created collection), and should therefore perform much better.
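
The steps listed above can be sketched roughly as follows. This is an illustration in Python, not the Ruby SDK the description mentions, and it is not an existing Arvados API. The names Collection, combine_manifests, create_by_combining, and the in-memory store are hypothetical. It also assumes that whole collections can be combined by concatenating their manifest streams (one stream per line); combining selected files would need manifest parsing that is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    # Hypothetical stand-in for a stored collection record.
    uuid: str
    owner_uuid: str
    manifest_text: str


def combine_manifests(manifest_texts):
    """Concatenate manifest texts, keeping one stream per line.

    Assumption: each non-empty line of a manifest is one stream, so
    whole collections can be combined by joining their lines.
    """
    streams = []
    for text in manifest_texts:
        streams.extend(line for line in text.splitlines() if line.strip())
    return "".join(stream + "\n" for stream in streams)


def create_by_combining(store, source_uuids, owner_uuid):
    """Server-side combine: only the new collection goes back to the client.

    `store` is a hypothetical dict mapping uuid -> Collection that stands
    in for the API server's database.
    """
    manifests = [store[uuid].manifest_text for uuid in source_uuids]
    combined = Collection(
        uuid=f"hypothetical-{len(store)}",  # placeholder, not a real uuid format
        owner_uuid=owner_uuid,
        manifest_text=combine_manifests(manifests),
    )
    store[combined.uuid] = combined
    return combined
```

With this shape, the source manifests never leave the server; the client sends only the selected uuids and the owner_uuid, and receives the one new collection.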

Related issues

Blocks Arvados - Bug #4943: [Workbench] [Performance] Combining big collections should start returning a response faster (currently you can get a 502 proxy error even if the collection still combines) - New - 01/08/2015

Blocks Arvados - Story #3821: [Workbench] Delete and rename files in collections - Resolved - 12/10/2014

History

#1 Updated by Tom Clegg about 4 years ago

  • Subject changed from [API] Provide an api method to combine collections to [API] [DRAFT] Provide API methods for manipulating and combining collections
  • Story points set to 2.0

#2 Updated by Tom Clegg about 4 years ago

  • Target version changed from Arvados Future Sprints to 2015-05-20 sprint

#3 Updated by Radhika Chippada about 4 years ago

  • Assigned To set to Radhika Chippada

#4 Updated by Radhika Chippada about 4 years ago

  • Status changed from New to In Progress

#5 Updated by Radhika Chippada about 4 years ago

On IRC:

tom 4:42 definitely look up the flamegraph gem and try putting &pp=flamegraph at the end of a (dev) workbench url to see what you get

tom 4:43 commit 288d22d8a7ff1f9a441d2b8058382e807873d7d5 message has some notes too

tom 4:44 I think it would be very useful to inspect a single too-slow example request and make a table of how many seconds are spent between various checkpoints.

tom 4:45 if you're running with RAILS_ENV=development, you should be able to use the flamegraph feature

tom 4:47 There are so many things we could optimize... but we should be able to figure out what the maximum possible benefit is in each area

And the commit log mentioned above:

commit 288d22d8a7ff1f9a441d2b8058382e807873d7d5
Author: Tom Clegg <tom@curoverse.com>
Date:   Tue Jan 13 10:18:16 2015 -0500

    3021: Add web-inspectable profiling mode.

    * Run Workbench with environment variable ENABLE_PROFILING=yes. Timing
      figures should appear at the top left of each page. Click to get
      more detail.

    * Visit {workbench-uri}?pp=flamegraph to see a profiling graph instead
      of the requested page itself.

    * More: https://github.com/MiniProfiler/rack-mini-profiler
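
The "table of how many seconds are spent between various checkpoints" suggested on IRC could be collected with something like the sketch below. It is a generic Python illustration, unrelated to rack-mini-profiler or the flamegraph gem; the Checkpoints class and its method names are hypothetical.

```python
import time


class Checkpoints:
    """Record named checkpoints and report the seconds between them."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self._marks = [("start", clock())]

    def mark(self, label):
        """Record the current time under the given label."""
        self._marks.append((label, self._clock()))

    def table(self):
        """Return (interval name, elapsed seconds) rows in recorded order."""
        rows = []
        for (prev_label, prev_t), (label, t) in zip(self._marks, self._marks[1:]):
            rows.append((f"{prev_label} -> {label}", t - prev_t))
        return rows
```

Calling mark() after each stage of a slow request (for example after fetching manifests, after combining, after saving) and then printing table() gives one row per stage, which shows where the largest possible gain is.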

#6 Updated by Radhika Chippada about 4 years ago

  • File metrics-show-qr1hi-4zz18-tcnxylwkxg0nfhi.png added

Pointed my workbench in dev to production and accessed:

collections/qr1hi-4zz18-tcnxylwkxg0nfhi?pp=full-backtrace

The attached metrics-show PNG shows the timing metrics for displaying this collection (with ?pp=full-backtrace appended to the URL).

#7 Updated by Radhika Chippada about 4 years ago

  • File deleted (metrics-show-qr1hi-4zz18-tcnxylwkxg0nfhi.png)

#10 Updated by Brett Smith about 4 years ago

  • Target version changed from 2015-05-20 sprint to Arvados Future Sprints

#12 Updated by Radhika Chippada about 3 years ago

  • Assigned To deleted (Radhika Chippada)

#13 Updated by Tom Morris over 2 years ago

  • Target version changed from Arvados Future Sprints to 2017-03-29 sprint

#14 Updated by Tom Morris over 2 years ago

  • Target version changed from 2017-03-29 sprint to Arvados Future Sprints
