Idea #5781
closed
[API] [DRAFT] Provide API methods for manipulating and combining collections
Description
Currently Workbench provides “create a new collection by combining selected collections” and “create a new collection by combining selected collection files” functionality. Workbench fetches the required collection manifest_text(s), generates the combined manifest_text, saves a new collection with the combined manifest_text, fetches the newly saved collection from the API server, and finally displays it to the user. However, this implementation does not scale. When very large collections are combined, or several files from a very large collection are combined, the Workbench combine operation fails with timeout errors while exchanging these large manifest texts with the API server. Bugs reported about this: #4943 and #5614.
A potential solution:
- Provide a “create by combining” API method that performs the steps currently performed by Workbench. This method will:
  - Take the selections (a list of files selected from a collection, or a list of selected collections)
  - Generate the manifest_text by combining these selections (using the Ruby SDK)
  - Save a new collection with the combined manifest text and the owner_uuid provided
  - Return the newly created collection to the client
- This solution requires exchanging only one manifest text between the API server and Workbench (that of the newly created collection), and should therefore perform much better.
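The steps of the proposed method could be sketched roughly as follows. This is an illustration only, written in Python rather than with the Ruby SDK the proposal mentions. The function names (select_files, combine_manifests, create_by_combining) and the save callback are hypothetical, and the sketch assumes the usual manifest layout of one stream per line: a stream name, then block locators, then position:size:filename tokens.

```python
import re
from typing import Callable, Dict, Iterable, Set

# Matches a file token such as "0:3:foo.txt" (position:size:filename).
FILE_TOKEN = re.compile(r"^(\d+):(\d+):(.+)$")


def select_files(manifest_text: str, wanted: Set[str]) -> str:
    """Keep only the file tokens whose "stream/filename" path is in `wanted`.

    Block locators are kept unchanged so file positions stay valid.
    Streams left with no files are dropped.
    """
    out = []
    for line in manifest_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3:
            continue  # not a complete stream line
        stream = tokens[0]
        kept = []
        has_file = False
        for tok in tokens[1:]:
            m = FILE_TOKEN.match(tok)
            if m is None:
                kept.append(tok)  # assumed to be a block locator
            elif f"{stream}/{m.group(3)}" in wanted:
                kept.append(tok)
                has_file = True
        if has_file:
            out.append(" ".join([stream] + kept))
    return "".join(l + "\n" for l in out)


def combine_manifests(manifest_texts: Iterable[str]) -> str:
    """Concatenate manifests, one stream per line, skipping blank lines."""
    lines = []
    for text in manifest_texts:
        lines.extend(l for l in text.splitlines() if l.strip())
    return "".join(l + "\n" for l in lines)


def create_by_combining(
    manifest_texts: Iterable[str],
    owner_uuid: str,
    save: Callable[[Dict[str, str]], Dict[str, str]],
) -> Dict[str, str]:
    """Combine the selections, save one new collection, and return it.

    `save` is a hypothetical stand-in for whatever persists a collection.
    """
    combined = combine_manifests(manifest_texts)
    return save({"manifest_text": combined, "owner_uuid": owner_uuid})
```

The sketch does not resolve file name collisions between the combined selections, and select_files keeps every block locator in a stream even when the remaining files no longer need all of them.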
Updated by Tom Clegg over 9 years ago
- Subject changed from [API] Provide an api method to combine collections to [API] [DRAFT] Provide API methods for manipulating and combining collections
- Story points set to 2.0
Updated by Tom Clegg over 9 years ago
- Target version changed from Arvados Future Sprints to 2015-05-20 sprint
Updated by Radhika Chippada over 9 years ago
- Assigned To set to Radhika Chippada
Updated by Radhika Chippada over 9 years ago
- Status changed from New to In Progress
Updated by Radhika Chippada over 9 years ago
On IRC:
tom 4:42 definitely look up the flamegraph gem and try putting &pp=flamegraph at the end of a (dev) workbench url to see what you get
tom 4:43 commit 288d22d8a7ff1f9a441d2b8058382e807873d7d5 message has some notes too
tom 4:44 I think it would be very useful to inspect a single too-slow example request and make a table of how many seconds are spent between various checkpoints.
tom 4:45 if you're running with RAILS_ENV=development, you should be able to use the flamegraph feature
tom 4:47 There are so many things we could optimize... but we should be able to figure out what the maximum possible benefit is in each area
And the commit log mentioned above:
commit 288d22d8a7ff1f9a441d2b8058382e807873d7d5
Author: Tom Clegg <tom@curoverse.com>
Date: Tue Jan 13 10:18:16 2015 -0500
    3021: Add web-inspectable profiling mode.
    * Run Workbench with environment variable ENABLE_PROFILING=yes. Timing figures should appear at the top left of each page. Click to get more detail.
    * Visit {workbench-uri}?pp=flamegraph to see a profiling graph instead of the requested page itself.
    * More: https://github.com/MiniProfiler/rack-mini-profiler
Updated by Radhika Chippada over 9 years ago
- File metrics-show-qr1hi-4zz18-tcnxylwkxg0nfhi.png added
Pointed my dev Workbench at production and accessed:
collections/qr1hi-4zz18-tcnxylwkxg0nfhi?pp=full-backtrace
The attached metrics-show png shows the profiling metrics for displaying this collection (with ?pp=full-backtrace appended to the URL).
Updated by Radhika Chippada over 9 years ago
- File deleted (metrics-show-qr1hi-4zz18-tcnxylwkxg0nfhi.png)
Updated by Radhika Chippada over 9 years ago
Updated by Radhika Chippada over 9 years ago
Updated by Brett Smith over 9 years ago
- Target version changed from 2015-05-20 sprint to Arvados Future Sprints
Updated by Peter Amstutz over 9 years ago
Proposed Collection Update API:
https://arvados.org/projects/arvados/wiki/Collection_update_API
Updated by Radhika Chippada over 8 years ago
- Assigned To deleted (Radhika Chippada)
Updated by Tom Morris almost 8 years ago
- Target version changed from Arvados Future Sprints to 2017-03-29 sprint
Updated by Tom Morris almost 8 years ago
- Target version changed from 2017-03-29 sprint to Arvados Future Sprints
Updated by Ward Vandewege over 3 years ago
- Target version deleted (Arvados Future Sprints)
Updated by Peter Amstutz over 1 year ago
- Release deleted (60)
- Status changed from In Progress to Resolved
Now implemented with the replace_files API.
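As a rough illustration of how a client might describe a "combine selected files" request for replace_files, the sketch below builds a mapping from target paths in the new collection to source files. The mapping shape (target path starting with "/" mapped to "portable_data_hash/path") is an assumption about the API and should be checked against the Arvados documentation. The helper name build_replace_files and the selection tuples are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Each selection is (source portable_data_hash, path in source, target path).
Selection = Tuple[str, str, str]


def build_replace_files(selections: Iterable[Selection]) -> Dict[str, str]:
    """Build a {target_path: "pdh/source_path"} mapping from selections.

    Assumed shape only; raises if two selections map to the same target.
    """
    mapping: Dict[str, str] = {}
    for pdh, src, dst in selections:
        target = dst if dst.startswith("/") else "/" + dst
        if target in mapping:
            raise ValueError(f"duplicate target path: {target}")
        mapping[target] = f"{pdh}/{src.lstrip('/')}"
    return mapping
```

With a mapping like this, only the small mapping and the resulting collection record would need to travel between client and server, rather than the full source manifest texts.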