Revision 7 (Radhika Chippada, 05/11/2015 08:00 PM) → Revision 8/14 (Radhika Chippada, 05/12/2015 02:24 AM)

h1. Collection API - Performance enhancements 

 h2. Problem description 

 Currently, we are experiencing severe performance issues when working with large collections in Arvados. Below are a few scenario descriptions. 

 h3. 1. Fetching a large collection 

 Fetching a collection with a large manifest text from the API server results in timeout errors. This is suspected to be either the root cause of, or a major contributor to, the other issues listed below. Several reported issues are side effects of this one: #4953, #4943, #5614, #5901, #5902

 h3. 2. Collection#show in workbench 

 We often see timeout errors in Workbench when showing a collection page with a large manifest text. This is probably due mostly to the problem described above with fetching large collections. See #5902, #5908.

 h3. 3. Create a collection by combining 

 Creating a new collection by combining other collections, or several files from a collection, almost always fails when one or more of the collections involved contains a large manifest text. A few issues about this: #4943, #5614

 h2. Proposed solutions  

 The various operations that handle these large manifest texts are very likely the cause of these performance issues. Sending the manifest text back and forth between the API server and clients, and JSON encoding and decoding these large manifest texts, could both be contributing. Reducing the amount of data exchanged, and the number of times it is exchanged, can help greatly.

 h3. 1. Fetching a large collection 

 * Compress the data transferred (we recently enabled gzip compression between the API server and Workbench)

 * Use efficient JSON encoding / decoding
 ** We are using Oj between the API server and Workbench. Is there room for further improvement?
 ** Are we consistently using Oj in the Ruby SDK? (Radhika: I need to do further research to answer this question)

 * Send the data in smaller chunks (?) 
 ** Is it possible for us to implement some form of “paging” strategy in sending the manifest text to the clients from the API server? 
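
One way to picture the "paging" question above: a manifest is line-oriented (one stream per line), so a server could in principle return a bounded slice of stream lines plus a cursor for the next slice. The following is only a minimal sketch under that assumption, not existing API server behavior. The names @manifest_page@, @offset@ and @limit@ are hypothetical, and the sample manifest is made up for illustration.

```python
def manifest_page(manifest_text, offset=0, limit=1000):
    """Return (page_text, next_offset) for a slice of manifest stream lines.

    next_offset is None once the last line has been returned.
    """
    lines = manifest_text.splitlines(keepends=True)
    page = lines[offset:offset + limit]
    next_offset = offset + len(page)
    if next_offset >= len(lines):
        next_offset = None
    return "".join(page), next_offset


# Made-up three-stream manifest, for illustration only.
SAMPLE_MANIFEST = (
    ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:bar.txt\n"
    "./dir1 acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n"
    "./dir2 73feffa4b7f6bb68e44cf984c85f6e88+3 0:3:baz.txt\n"
)

# A client asks for two stream lines at a time and follows the cursor.
first_page, cursor = manifest_page(SAMPLE_MANIFEST, offset=0, limit=2)
second_page, end = manifest_page(SAMPLE_MANIFEST, offset=cursor, limit=2)
```

A real implementation would want to avoid re-splitting the whole manifest on every request; the sketch only illustrates the shape of the exchange.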

 h3. 2. Collection#show in workbench 

 h4. Observations 

 Collection#show responses were profiled using rack-mini-profiler. With the development environment pointed at the qr1hi API server, the following observations were made (based on 20+ reloads of the page):

 * On average, it took about 70 seconds to show the collection qr1hi-4zz18-tcnxylwkxg0nfhi

 * The most expensive operations (on average) were:
 ** collections/_show_source_summary -- 30 sec
 ** collections/show (API request to get the collection) -- 15 sec
 *** Parsing the JSON response took 0.2 sec on average
 ** collections/_show_files -- 15 sec
 ** application/_projects_tree_menu -- 3 to 4 sec
 *** For this collection, 6 requests were made to /groups, each taking 0.2 to 0.5 sec

 * The same requests took an average of 120 seconds on May 08, probably because the server was much busier at the time, which suggests that cluster tuning is also called for.

 * {{collapse(Performance profile snapshot ...)
 }}

 * {{collapse(Workbench console log ...) 

 Started GET "/collections/qr1hi-4zz18-tcnxylwkxg0nfhi" for at 2015-05-11 14:47:03 -0400 
 Processing by CollectionsController#show as HTML 
   Parameters: {"id"=>"qr1hi-4zz18-tcnxylwkxg0nfhi"} 
 API client: 0.0007654 Prepare request     
 API client: 0.313339245 API transaction 
 API client: 0.000289537 Parse response 
 API client: 9.8434e-05 Prepare request     
 API client: 0.250356943 API transaction 
 API client: 0.005898489 Parse response 
 API client: 0.000356541 Prepare request     
 API client: 21.405180053 API transaction 
 API client: 0.170310714 Parse response 
 API client: 0.000316374 Prepare request    {"output":"55152d2b989c6b174e298dba10ae3ff7+57708684"}   
 API client: 0.221916151 API transaction 
 API client: 0.000178293 Parse response 
 API client: 0.000356427 Prepare request    {"log":"55152d2b989c6b174e298dba10ae3ff7+57708684"}   
 API client: 0.078525257 API transaction 
 API client: 0.00017414 Parse response 
 API client: 0.000424393 Prepare request    {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","link_class":"name"}    modified_at DESC 
 API client: 0.059534807 API transaction 
 API client: 0.000152869 Parse response 
 API client: 0.000302943 Prepare request    {"uuid":[]}   
 API client: 0.06070466 API transaction 
 API client: 0.000145646 Parse response 
 API client: 0.00029954 Prepare request    {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","link_class":"permission","name":"can_read"}    modified_at DESC 
 API client: 0.062907368 API transaction 
 API client: 0.000149375 Parse response 
 API client: 0.00028907 Prepare request    {"object_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi"}    created_at DESC 
 API client: 0.079528074 API transaction 
 API client: 0.00016842 Parse response 
 API client: 0.000523936 Prepare request    {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","tail_uuid":"qr1hi-tpzed-ktpvhqu89qoib9f","link_class":"resources","name":"wants"}   
 API client: 0.062874638 API transaction 
 API client: 0.000175978 Parse response 
 API client: 0.000377932 Prepare request     [["scopes","=",["GET /arvados/v1/collections/qr1hi-4zz18-tcnxylwkxg0nfhi","GET /arvados/v1/collections/qr1hi-4zz18-tcnxylwkxg0nfhi/","GET /arvados/v1/keep_services/accessible"]]]  
 API client: 0.158867905 API transaction 
   Rendered application/_show_autoselect_text.html.erb (0.9ms) 
   Rendered application/_show_autoselect_text.html.erb (0.2ms) 
   Rendered collections/_show_source_summary.html.erb (26534.7ms) 
   Rendered collections/_sharing_button.html.erb (1.1ms) 
 API client: 0.000270093 Prepare request     
 API client: 0.227171335 API transaction 
 API client: 0.000226508 Parse response 
   Rendered application/_title_and_buttons.html.erb (239.8ms) 
   Rendered collections/_show_files.html.erb (14531.8ms) 
   Rendered application/_loading_modal.html.erb (1.5ms) 
   Rendered application/_content.html.erb (14545.1ms) 
   Rendered application/show.html.erb (14790.7ms) 
   Rendered collections/show.html.erb within layouts/application (41352.7ms) 
 API client: 0.000296285 Prepare request    {"authorized_user_uuid":"qr1hi-tpzed-ktpvhqu89qoib9f"}   
 API client: 0.151415184 API transaction 
 API client: 0.000207343 Parse response 
 API client: 0.000238512 Prepare request    {"created_by":"qr1hi-tpzed-ktpvhqu89qoib9f"}   
 API client: 0.330716574 API transaction 
 API client: 0.000160901 Parse response 
 API client: 0.000376137 Prepare request    {"created_by":"qr1hi-tpzed-ktpvhqu89qoib9f"}   
 API client: 0.263794223 API transaction 
 API client: 0.000841214 Parse response 
 API client: 0.000266079 Prepare request     [["group_class","=","project"]] name 
 API client: 0.968549864 API transaction 
 API client: 0.001104326 Parse response 
 API client: 0.001218433 Prepare request     [["group_class","=","project"]] name 
 API client: 1.214942211 API transaction 
 API client: 0.001293295 Parse response 
 API client: 0.001586803 Prepare request     [["group_class","=","project"]] name 
 API client: 0.737031104 API transaction 
 API client: 0.000921665 Parse response 
 API client: 0.000487645 Prepare request     [["group_class","=","project"]] name 
 API client: 0.848679769 API transaction 
 API client: 0.002339424 Parse response 
 API client: 0.000264704 Prepare request     [["group_class","=","project"]] name 
 API client: 0.681490624 API transaction 
 API client: 0.000958536 Parse response 
 API client: 0.000484713 Prepare request     [["group_class","=","project"]] name 
 API client: 0.412487956 API transaction 
 API client: 0.000900607 Parse response 
   Rendered application/_projects_tree_menu.html.erb (5474.8ms) 
 API client: 0.00045925 Prepare request     
 API client: 0.078490258 API transaction 
   Rendered application/_browser_unsupported.html (0.6ms) 
   Rendered getting_started/_getting_started_popup.html.erb (2.7ms) 
   Rendered layouts/body.html.erb (6343.2ms) 
 Completed 200 OK in 70876ms (Views: 47804.5ms | ActiveRecord: 0.0ms) 
 }}
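
To compare how much time goes to each phase, the "API client" lines in a log like the one above can be totaled per phase. This is a minimal sketch that assumes the line format shown in the log; @summarize_api_client_time@ is a hypothetical helper name, and the sample is a short excerpt of the log lines above.

```python
import re
from collections import defaultdict

# Matches lines such as "API client: 21.405180053 API transaction".
LINE_RE = re.compile(
    r"API client: (\S+) (Prepare request|API transaction|Parse response)"
)


def summarize_api_client_time(log_text):
    """Return a dict mapping each phase name to its total seconds."""
    totals = defaultdict(float)
    for match in LINE_RE.finditer(log_text):
        seconds, phase = match.groups()
        totals[phase] += float(seconds)
    return dict(totals)


# Excerpt copied from the Workbench console log above.
SAMPLE_LOG = """\
API client: 0.000356541 Prepare request
API client: 21.405180053 API transaction
API client: 0.170310714 Parse response
API client: 0.000316374 Prepare request    {"output":"55152d2b989c6b174e298dba10ae3ff7+57708684"}
API client: 0.221916151 API transaction
API client: 0.000178293 Parse response
"""

totals = summarize_api_client_time(SAMPLE_LOG)
for phase, seconds in sorted(totals.items()):
    print(f"{phase}: {seconds:.3f}s")
```

In this excerpt nearly all of the time falls under "API transaction", in line with the observation that fetching the collection dominates over parsing the response.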

 h4. Proposed enhancements 

 * API: Add files_count and files_size to the collection data model
 ** Rather than computing them on each page display, we should consider storing them in the data model and updating them whenever manifest_text changes

 * Implement paging / scrolling in the collection#show page (?). Get “pages” of the collection and display them as needed.
 ** This would address the next two big-ticket items (the time taken to get the collection JSON from the API server, and _show_files)
 ** This may be unavoidable anyway for collections even larger than the one used in this profiling exercise

 * Avoid making multiple calls to the API server for the same data by caching or preloading data (see #5908)
 ** Clicking on the Advanced tab resulted in one more call to the API server to get the collection (which, as seen above, takes 15 seconds or more on average)
 ** Cache the collection and other objects in Workbench and avoid unnecessary calls to the API server (while in the same page context)

 * Show less information in the collection page (such as not linking images that are going to 404)? (see #5908)

 * Add methods to the API server (?) to get the my_projects and shared_project trees in one call, eliminating the 3 seconds or so of lag on "each" page display
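
As an illustration of the files_count / files_size suggestion above, the sketch below derives both values from a manifest text in a single pass, so they could be stored once whenever manifest_text changes. It assumes the usual manifest layout (stream name, block locators, then @position:size:filename@ file tokens). @manifest_file_stats@ is a hypothetical name, not an existing API server method, and the sample manifest is made up.

```python
import re

# A file token looks like "position:size:filename"; block locators do not match.
FILE_TOKEN_RE = re.compile(r"^(\d+):(\d+):(.+)$")


def manifest_file_stats(manifest_text):
    """Return (files_count, files_size) for a manifest text.

    A file can appear as several segments, so segment sizes are summed and
    each (stream, filename) pair is counted once.
    """
    seen_files = set()
    total_size = 0
    for line in manifest_text.splitlines():
        tokens = line.split(" ")
        stream_name = tokens[0]
        for token in tokens[1:]:
            match = FILE_TOKEN_RE.match(token)
            if match is None:
                continue  # block locator, not a file token
            _position, size, filename = match.groups()
            seen_files.add((stream_name, filename))
            total_size += int(size)
    return len(seen_files), total_size


# Made-up manifest: two files in ".", one two-segment file in "./sub".
SAMPLE_MANIFEST = (
    ". 37b51d194a7513e45b56f6524f2d51f2+3 acbd18db4cc2f85cedef654fccc4a4d8+3"
    " 0:3:bar.txt 3:3:foo.txt\n"
    "./sub 73feffa4b7f6bb68e44cf984c85f6e88+3 0:1:baz.txt 2:1:baz.txt\n"
)

files_count, files_size = manifest_file_stats(SAMPLE_MANIFEST)
```

Storing the two numbers alongside the collection would let the show page read them directly instead of walking the manifest on every display.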

 h3. 3. Create a collection by combining 

 h4. Observations

 Creating a new collection by combining collections was profiled by combining qr1hi-4zz18-ms5x87xf1389ldv, qr1hi-4zz18-0q225z4ktr432mg, qr1hi-4zz18-i5o4ba4mmxub69b from the project qr1hi-j7d0g-3d06b1jtiwrizqm (#4943)

 * It took about 110 seconds to generate the combined manifest text, save the new collection with an API server request, and receive the API server's response to the save
 ** The server sent back the new collection, including manifest_text, after the save

 * It took an additional 70 seconds to "show" the new collection
 ** Workbench made yet another GET /collections/<uuid> request (about 20 seconds per the log), even though the server had just sent the collection after saving it
 ** All the other delays listed in the collection#show section above are part of this lag

 * {{collapse(Performance profile snapshot ...)
 }}

 * {{collapse(Workbench log ...) 
 Started POST "/combine_selected?action_data=%7B%22current_project_uuid%22%3A%22qr1hi-j7d0g-3d06b1jtiwrizqm%22%7D" for at 2015-05-11 22:14:55 -0400 
 Processing by ActionsController#combine_selected_files_into_collection as HTML 
   Parameters: {"authenticity_token"=>"U0PnyOLtNoz1Bet+ZUcC2+/1DTqN9Imzwhsk9VFGvro=", "selection"=>["qr1hi-4zz18-ms5x87xf1389ldv", "qr1hi-4zz18-0q225z4ktr432mg", "qr1hi-4zz18-i5o4ba4mmxub69b"], "action_data"=>"{\"current_project_uuid\":\"qr1hi-j7d0g-3d06b1jtiwrizqm\"}"} 
 API client: 0.000323633 Prepare request     
 API client: 0.219897489 API transaction 
 API client: 0.000180886 Parse response 
 API client: 0.000744984 Prepare request    {"uuid":["qr1hi-4zz18-ms5x87xf1389ldv","qr1hi-4zz18-0q225z4ktr432mg","qr1hi-4zz18-i5o4ba4mmxub69b"]}   
 API client: 16.46763541 API transaction 
 API client: 0.174250085 Parse response 
 API client: 0.000405262 Prepare request     
 API client: 0.242986425 API transaction 
 API client: 0.000161982 Parse response 
 API client: 1.334179882 Prepare request     
 API client: 62.347996094 API transaction 
 API client: 0.129231689 Parse response 
 API client: 0.000315959 Prepare request     
 API client: 0.247526392 API transaction 
 API client: 0.000215447 Parse response 
 API client: 0.000176321 Prepare request     
 API client: 0.09771474 API transaction 
 API client: 0.00015503 Parse response 
 API client: 0.000181164 Prepare request     
 API client: 0.088184468 API transaction 
 API client: 0.00018323 Parse response 
 Redirected to https://localhost:3031/collections/qr1hi-4zz18-cdwp2zwz1nj3uvh 
 Completed 302 Found in 109373ms (ActiveRecord: 0.0ms) 

 Started GET "/collections/qr1hi-4zz18-cdwp2zwz1nj3uvh" for at 2015-05-11 22:16:44 -0400 
 Processing by CollectionsController#show as HTML 
   Parameters: {"id"=>"qr1hi-4zz18-cdwp2zwz1nj3uvh"} 
 API client: 0.000233106 Prepare request     
 API client: 0.069534939 API transaction 
 API client: 0.000270891 Parse response 
 API client: 0.00174422 Prepare request     
 API client: 17.157845279 API transaction 
 API client: 0.13215165 Parse response 
 }}

 h4. Proposed enhancements

 * Offer an API server method that accepts the selections array (and optionally owner_uuid and name) and creates the new collection in the backend. This can help as follows:
 ** When combining entire collections:
 *** Workbench no longer needs to fetch the manifest text of the source collections.
 *** Workbench no longer needs to work through the combining logic and generate the manifest text for the new collection.
 *** There is no need to JSON decode and encode the manifest text.
 *** Workbench does not need to send the manifest text to the API server over the wire.
 *** Instead, the API server performs all of these steps, creates the new collection, and sends the generated collection uuid to Workbench (which reduces the performance problem to the collection#show issue; yay)
 ** When combining selected files from within a collection: Here too, we can expect significant performance improvements by eliminating the need to generate the combined manifest text in Workbench and send it over the wire.
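
The whole-collection case above can be sketched as follows. The point is that only UUIDs need to travel from the client, while the manifest texts stay on the server. This is a rough sketch under stated assumptions: @combine_manifests@ and @combine_on_server@ are hypothetical names, the dictionary stands in for the server's database lookup, and simple concatenation of stream lines ignores any handling of files that collide by name within the same stream.

```python
def combine_manifests(manifest_texts):
    """Concatenate whole-collection manifests, one stream line per line."""
    combined = []
    for text in manifest_texts:
        for line in text.splitlines():
            if line.strip():
                combined.append(line + "\n")
    return "".join(combined)


def combine_on_server(selection, manifests_by_uuid):
    """Build the combined manifest from a list of collection UUIDs.

    manifests_by_uuid is a stand-in for the server-side lookup; the client
    would send only the selection (and optionally owner_uuid and name).
    """
    return combine_manifests(manifests_by_uuid[uuid] for uuid in selection)


# Made-up data for illustration; the UUID keys are placeholders.
MANIFESTS = {
    "uuid-a": ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:bar.txt\n",
    "uuid-b": "./dir1 acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo.txt\n",
}

combined_manifest = combine_on_server(["uuid-a", "uuid-b"], MANIFESTS)
```

In this shape, the response to the client only needs to carry the new collection's uuid, not the combined manifest text.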


 h3. 4. Implement caching using a framework such as Memcached

 * One of the issues listed above (#5901) concerns accessing a collection from multiple threads in parallel. In addition, #5908 highlights several API requests being repeated within a single page display. In fact, this problem occurs in several areas of the Workbench implementation.

 * With caching, we can reduce the number of round-trip API requests needed to fetch these objects, and improve performance by fetching them from the shared cache instead.

 * Question: It is not clear how well caching would work if / when we cache these huge collections.
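
To make the caching idea concrete, here is a minimal in-process sketch of a read-through cache keyed by uuid. It is not a Memcached client and not existing Workbench code. @ObjectCache@, @ttl_seconds@ and @max_manifest_length@ are hypothetical names, and skipping very large manifests is just one possible response to the open question above, shown as an assumption.

```python
import time


class ObjectCache:
    """Read-through cache: return a cached object or fetch and remember it."""

    def __init__(self, fetch, ttl_seconds=60.0, max_manifest_length=1_000_000,
                 clock=time.monotonic):
        self._fetch = fetch          # callable taking a uuid, returning a dict
        self._ttl = ttl_seconds
        self._max_len = max_manifest_length
        self._clock = clock
        self._entries = {}           # uuid -> (expires_at, object)

    def get(self, uuid):
        now = self._clock()
        entry = self._entries.get(uuid)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: no API round trip
        obj = self._fetch(uuid)
        # Assumption: objects with very large manifests are not cached.
        if len(obj.get("manifest_text", "")) <= self._max_len:
            self._entries[uuid] = (now + self._ttl, obj)
        return obj


# Stand-in for an API request; records each call so repeats are visible.
calls = []


def fake_fetch(uuid):
    calls.append(uuid)
    return {"uuid": uuid,
            "manifest_text": ". d41d8cd98f00b204e9800998ecf8427e+0 0:0:empty.txt\n"}


cache = ObjectCache(fake_fetch)
first = cache.get("example-uuid")
second = cache.get("example-uuid")   # served from the cache
```

With a shared cache such as Memcached, the dictionary would be replaced by calls to the cache service, so that multiple Workbench processes could reuse the same entries.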