Collection API - Performance enhancements » History » Version 12

Peter Amstutz, 05/19/2015 01:32 PM

1 1 Radhika Chippada
h1. Collection API - Performance enhancements
2 1 Radhika Chippada
3 1 Radhika Chippada
h2. Problem description
4 1 Radhika Chippada
5 1 Radhika Chippada
Currently, we are experiencing severe performance issues when working with large collections in Arvados. Below are a few scenario descriptions.
6 1 Radhika Chippada
7 1 Radhika Chippada
h3. 1. Fetching a large collection
8 1 Radhika Chippada
9 1 Radhika Chippada
Fetching a collection with large manifest text from API server results in timeout errors. This is suspected to be either the root cause or contributing largely to the other issues listed below. Several issues are reported which are the side effects of this issue: #4953, #4943,  #5614, #5901, #5902 
10 1 Radhika Chippada
11 1 Radhika Chippada
h3. 2. Collection#show in workbench
12 1 Radhika Chippada
13 1 Radhika Chippada
Often times, we see timeout errors in workbench when showing a collection page with large manifest text. It may be mostly due to the above listed concern about fetching the large collections. #5902, #5908
14 1 Radhika Chippada
15 1 Radhika Chippada
h3. 3. Create a collection by combining
16 1 Radhika Chippada
17 1 Radhika Chippada
Creating new collections by combining other collections or several files from a collection almost always fail when one of more of the involved collections contain large manifest texts. A few issues about this: #4943,  #5614
18 1 Radhika Chippada
19 1 Radhika Chippada
h2. Proposed solutions 
20 1 Radhika Chippada
21 1 Radhika Chippada
Various operations dealing with these large manifest texts are certainly the cause of these performance issues. Sending and receiving the manifest text to and from the api server to clients, json encoding and decoding of these large manifest texts could be contributing to this performance issues. Reducing the amount of data and the number of times this data is exchanged can greatly help.
22 1 Radhika Chippada
23 1 Radhika Chippada
h3. 1. Fetching a large collection
24 1 Radhika Chippada
25 1 Radhika Chippada
* Compress the data transferred (We recently enabled gzip compression between API and workbench)
26 1 Radhika Chippada
27 1 Radhika Chippada
* Use efficient json encoding / decoding
28 1 Radhika Chippada
** We are using Oj between API server and Workbench. Is there room for further improvement? (http://devblog.agworld.com.au/post/42586025923/the-performance-of-to-json-in-rails-sucks-and)
29 1 Radhika Chippada
** Are we consistently using Oj in Ruby SDK? (Radhika: I need to do further research to answer this question)
30 1 Radhika Chippada
31 1 Radhika Chippada
* Send the data in smaller chunks (?)
32 1 Radhika Chippada
** Is it possible for us to implement some form of “paging” strategy in sending the manifest text to the clients from the API server?
33 1 Radhika Chippada
34 1 Radhika Chippada
h3. 2. Collection#show in workbench
35 1 Radhika Chippada
36 6 Radhika Chippada
h4. Observations
37 6 Radhika Chippada
38 2 Radhika Chippada
Collection#show responses are profiled using rack-mini-profiler. When pointed the development environment to qr1hi api server, the following observations are made (based on about 20+ reloads of the page):
39 2 Radhika Chippada
40 2 Radhika Chippada
* On average it took about 70s for to show the collection qr1hi-4zz18-tcnxylwkxg0nfhi
41 2 Radhika Chippada
42 2 Radhika Chippada
* The most expensive operations (on average) are:
43 2 Radhika Chippada
** collections/_show_source_summary  -- 30 seconds
44 2 Radhika Chippada
** collections/show (api request to get collection) -- 15 sec
45 2 Radhika Chippada
*** It took on average .2 sec to parse response (json)
46 11 Radhika Chippada
*** It sometimes take as long as a 60+ seconds for the same collection GET operation
47 2 Radhika Chippada
** collections/_show_files  -- 15 sec
48 2 Radhika Chippada
** applications/_projects_tree_menu -- 3 to 4 sec
49 2 Radhika Chippada
*** For this collection, 6 requests were made to /groups each taking .2 to .5sec
50 2 Radhika Chippada
51 1 Radhika Chippada
* It is also observed that the requests took an average of 120 seconds on May 08, probably when the server was much busier and hence cluster tuning is also called for.
52 1 Radhika Chippada
53 6 Radhika Chippada
* {{collapse(Performance profile snapshot ...)
54 1 Radhika Chippada
55 7 Radhika Chippada
!{width:80%}https://arvados.org/attachments/download/595/perf-profile-qr1hi-4zz18-tcnxylwkxg0nfhi.png!
56 5 Radhika Chippada
}}
57 1 Radhika Chippada
58 6 Radhika Chippada
* {{collapse(Workbench console log ...)
59 5 Radhika Chippada
60 2 Radhika Chippada
<pre>
61 2 Radhika Chippada
Started GET "/collections/qr1hi-4zz18-tcnxylwkxg0nfhi" for 127.0.0.1 at 2015-05-11 14:47:03 -0400
62 2 Radhika Chippada
Processing by CollectionsController#show as HTML
63 2 Radhika Chippada
  Parameters: {"id"=>"qr1hi-4zz18-tcnxylwkxg0nfhi"}
64 2 Radhika Chippada
API client: 0.0007654 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/users/current    
65 2 Radhika Chippada
API client: 0.313339245 API transaction
66 2 Radhika Chippada
API client: 0.000289537 Parse response
67 2 Radhika Chippada
API client: 9.8434e-05 Prepare request https://qr1hi.arvadosapi.com/discovery/v1/apis/arvados/v1/rest    
68 2 Radhika Chippada
API client: 0.250356943 API transaction
69 2 Radhika Chippada
API client: 0.005898489 Parse response
70 2 Radhika Chippada
API client: 0.000356541 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections/qr1hi-4zz18-tcnxylwkxg0nfhi    
71 2 Radhika Chippada
API client: 21.405180053 API transaction
72 2 Radhika Chippada
API client: 0.170310714 Parse response
73 2 Radhika Chippada
API client: 0.000316374 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/jobs  {"output":"55152d2b989c6b174e298dba10ae3ff7+57708684"}  
74 2 Radhika Chippada
API client: 0.221916151 API transaction
75 2 Radhika Chippada
API client: 0.000178293 Parse response
76 2 Radhika Chippada
API client: 0.000356427 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/jobs  {"log":"55152d2b989c6b174e298dba10ae3ff7+57708684"}  
77 2 Radhika Chippada
API client: 0.078525257 API transaction
78 2 Radhika Chippada
API client: 0.00017414 Parse response
79 2 Radhika Chippada
API client: 0.000424393 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links  {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","link_class":"name"}  modified_at DESC
80 2 Radhika Chippada
API client: 0.059534807 API transaction
81 2 Radhika Chippada
API client: 0.000152869 Parse response
82 2 Radhika Chippada
API client: 0.000302943 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups  {"uuid":[]}  
83 2 Radhika Chippada
API client: 0.06070466 API transaction
84 2 Radhika Chippada
API client: 0.000145646 Parse response
85 2 Radhika Chippada
API client: 0.00029954 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links  {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","link_class":"permission","name":"can_read"}  modified_at DESC
86 2 Radhika Chippada
API client: 0.062907368 API transaction
87 2 Radhika Chippada
API client: 0.000149375 Parse response
88 2 Radhika Chippada
API client: 0.00028907 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/logs  {"object_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi"}  created_at DESC
89 2 Radhika Chippada
API client: 0.079528074 API transaction
90 2 Radhika Chippada
API client: 0.00016842 Parse response
91 2 Radhika Chippada
API client: 0.000523936 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links  {"head_uuid":"qr1hi-4zz18-tcnxylwkxg0nfhi","tail_uuid":"qr1hi-tpzed-ktpvhqu89qoib9f","link_class":"resources","name":"wants"}  
92 2 Radhika Chippada
API client: 0.062874638 API transaction
93 2 Radhika Chippada
API client: 0.000175978 Parse response
94 2 Radhika Chippada
API client: 0.000377932 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/api_client_authorizations   [["scopes","=",["GET /arvados/v1/collections/qr1hi-4zz18-tcnxylwkxg0nfhi","GET /arvados/v1/collections/qr1hi-4zz18-tcnxylwkxg0nfhi/","GET /arvados/v1/keep_services/accessible"]]] 
95 2 Radhika Chippada
API client: 0.158867905 API transaction
96 2 Radhika Chippada
  Rendered application/_show_autoselect_text.html.erb (0.9ms)
97 2 Radhika Chippada
  Rendered application/_show_autoselect_text.html.erb (0.2ms)
98 2 Radhika Chippada
  Rendered collections/_show_source_summary.html.erb (26534.7ms)
99 2 Radhika Chippada
  Rendered collections/_sharing_button.html.erb (1.1ms)
100 2 Radhika Chippada
API client: 0.000270093 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups/qr1hi-j7d0g-ixtfnxccvzm7ui4    
101 2 Radhika Chippada
API client: 0.227171335 API transaction
102 2 Radhika Chippada
API client: 0.000226508 Parse response
103 2 Radhika Chippada
  Rendered application/_title_and_buttons.html.erb (239.8ms)
104 2 Radhika Chippada
  Rendered collections/_show_files.html.erb (14531.8ms)
105 2 Radhika Chippada
  Rendered application/_loading_modal.html.erb (1.5ms)
106 2 Radhika Chippada
  Rendered application/_content.html.erb (14545.1ms)
107 2 Radhika Chippada
  Rendered application/show.html.erb (14790.7ms)
108 2 Radhika Chippada
  Rendered collections/show.html.erb within layouts/application (41352.7ms)
109 2 Radhika Chippada
API client: 0.000296285 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/authorized_keys  {"authorized_user_uuid":"qr1hi-tpzed-ktpvhqu89qoib9f"}  
110 2 Radhika Chippada
API client: 0.151415184 API transaction
111 2 Radhika Chippada
API client: 0.000207343 Parse response
112 2 Radhika Chippada
API client: 0.000238512 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections  {"created_by":"qr1hi-tpzed-ktpvhqu89qoib9f"}  
113 2 Radhika Chippada
API client: 0.330716574 API transaction
114 2 Radhika Chippada
API client: 0.000160901 Parse response
115 2 Radhika Chippada
API client: 0.000376137 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/pipeline_instances  {"created_by":"qr1hi-tpzed-ktpvhqu89qoib9f"}  
116 2 Radhika Chippada
API client: 0.263794223 API transaction
117 2 Radhika Chippada
API client: 0.000841214 Parse response
118 2 Radhika Chippada
API client: 0.000266079 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
119 2 Radhika Chippada
API client: 0.968549864 API transaction
120 2 Radhika Chippada
API client: 0.001104326 Parse response
121 2 Radhika Chippada
API client: 0.001218433 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
122 2 Radhika Chippada
API client: 1.214942211 API transaction
123 2 Radhika Chippada
API client: 0.001293295 Parse response
124 2 Radhika Chippada
API client: 0.001586803 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
125 2 Radhika Chippada
API client: 0.737031104 API transaction
126 2 Radhika Chippada
API client: 0.000921665 Parse response
127 2 Radhika Chippada
API client: 0.000487645 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
128 2 Radhika Chippada
API client: 0.848679769 API transaction
129 2 Radhika Chippada
API client: 0.002339424 Parse response
130 2 Radhika Chippada
API client: 0.000264704 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
131 2 Radhika Chippada
API client: 0.681490624 API transaction
132 2 Radhika Chippada
API client: 0.000958536 Parse response
133 2 Radhika Chippada
API client: 0.000484713 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups   [["group_class","=","project"]] name
134 2 Radhika Chippada
API client: 0.412487956 API transaction
135 2 Radhika Chippada
API client: 0.000900607 Parse response
136 2 Radhika Chippada
  Rendered application/_projects_tree_menu.html.erb (5474.8ms)
137 2 Radhika Chippada
API client: 0.00045925 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups/qr1hi-tpzed-rv6f0l8lbvdzkog    
138 2 Radhika Chippada
API client: 0.078490258 API transaction
139 2 Radhika Chippada
  Rendered application/_browser_unsupported.html (0.6ms)
140 2 Radhika Chippada
  Rendered getting_started/_getting_started_popup.html.erb (2.7ms)
141 1 Radhika Chippada
  Rendered layouts/body.html.erb (6343.2ms)
142 1 Radhika Chippada
Completed 200 OK in 70876ms (Views: 47804.5ms | ActiveRecord: 0.0ms)
143 2 Radhika Chippada
</pre>
144 1 Radhika Chippada
}}
145 1 Radhika Chippada
146 6 Radhika Chippada
h4. Proposed enhancements
147 1 Radhika Chippada
148 6 Radhika Chippada
* API: Add files_count and files_size to collection data model
149 6 Radhika Chippada
** Rather than computing it for each page display, we should consider adding these into the data model and update them when manifest_text changes
150 6 Radhika Chippada
151 6 Radhika Chippada
* Implement paging / scrolling in the collection#show page(?). Get “pages” of collection and display them as needed.
152 6 Radhika Chippada
** This will address the next two big ticket items (the time taken in getting the collection json from API and _show_files
153 6 Radhika Chippada
** This might also be inevitable for even larger collection than the one used in this profiling exercise
154 6 Radhika Chippada
155 1 Radhika Chippada
* Avoid making multiple calls to the API server for the same data by caching or preloading data (See #5908)
156 6 Radhika Chippada
** Clicking on the Advanced tab resulted in making one more call to the API server to get the collection (which as seen above takes an average of 15 seconds or more)
157 6 Radhika Chippada
** Cache the collection and other objects in workbench and avoid making unnecessary calls to the API server (while in the same page context)
158 1 Radhika Chippada
159 1 Radhika Chippada
* Show less information in the collection page (such as not linking images that are going to 404)? (See #5908)
160 6 Radhika Chippada
161 6 Radhika Chippada
* Add methods in the API server (?) to get my_projects and shared_project trees in one call and hence eliminating the average 3 seconds or so lag for "each" page display
162 1 Radhika Chippada
163 1 Radhika Chippada
h3. 3. Create a collection by combining
164 1 Radhika Chippada
165 8 Radhika Chippada
h4. Observations
166 1 Radhika Chippada
167 9 Radhika Chippada
Creating a new collection by combining is profiled for qr1hi-4zz18-ms5x87xf1389ldv, qr1hi-4zz18-0q225z4ktr432mg, qr1hi-4zz18-i5o4ba4mmxub69b from the project qr1hi-j7d0g-3d06b1jtiwrizqm (#4943)
168 9 Radhika Chippada
169 8 Radhika Chippada
* It took about 110 seconds to generate the combined manifest text, save new collection making an API server, get API server response for save
170 8 Radhika Chippada
** The server sent the new collection, including manifest_text, after save
171 1 Radhika Chippada
172 8 Radhika Chippada
* It took an additional 70 seconds to "show" the new collection
173 9 Radhika Chippada
** The workbench made yet another GET /collections/<uuid> request (about 18 seconds), even though the server just sent it after saving
174 1 Radhika Chippada
** All the other delays as listed in collection#show section above are part of this lag
175 1 Radhika Chippada
176 1 Radhika Chippada
177 9 Radhika Chippada
* {{collapse(Performance profile snapshots ...)
178 1 Radhika Chippada
179 9 Radhika Chippada
!{width:80%}https://arvados.org/attachments/download/597/perf-profile-combine-collections.png!
180 9 Radhika Chippada
181 10 Radhika Chippada
 
182 10 Radhika Chippada
183 9 Radhika Chippada
!{width:80%}https://arvados.org/attachments/download/596/perf-profile-combine-collections-show-collection.png!
184 9 Radhika Chippada
185 8 Radhika Chippada
}}
186 8 Radhika Chippada
187 8 Radhika Chippada
* {{collapse(Workbench log ...)
188 9 Radhika Chippada
189 8 Radhika Chippada
<pre>
190 8 Radhika Chippada
Started POST "/combine_selected?action_data=%7B%22current_project_uuid%22%3A%22qr1hi-j7d0g-3d06b1jtiwrizqm%22%7D" for 127.0.0.1 at 2015-05-11 22:14:55 -0400
191 8 Radhika Chippada
Processing by ActionsController#combine_selected_files_into_collection as HTML
192 8 Radhika Chippada
  Parameters: {"authenticity_token"=>"U0PnyOLtNoz1Bet+ZUcC2+/1DTqN9Imzwhsk9VFGvro=", "selection"=>["qr1hi-4zz18-ms5x87xf1389ldv", "qr1hi-4zz18-0q225z4ktr432mg", "qr1hi-4zz18-i5o4ba4mmxub69b"], "action_data"=>"{\"current_project_uuid\":\"qr1hi-j7d0g-3d06b1jtiwrizqm\"}"}
193 8 Radhika Chippada
API client: 0.000323633 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/users/current    
194 8 Radhika Chippada
API client: 0.219897489 API transaction
195 8 Radhika Chippada
API client: 0.000180886 Parse response
196 8 Radhika Chippada
API client: 0.000744984 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections  {"uuid":["qr1hi-4zz18-ms5x87xf1389ldv","qr1hi-4zz18-0q225z4ktr432mg","qr1hi-4zz18-i5o4ba4mmxub69b"]}  
197 8 Radhika Chippada
API client: 16.46763541 API transaction
198 8 Radhika Chippada
API client: 0.174250085 Parse response
199 8 Radhika Chippada
API client: 0.000405262 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/groups/qr1hi-j7d0g-3d06b1jtiwrizqm    
200 8 Radhika Chippada
API client: 0.242986425 API transaction
201 8 Radhika Chippada
API client: 0.000161982 Parse response
202 8 Radhika Chippada
API client: 1.334179882 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections    
203 8 Radhika Chippada
API client: 62.347996094 API transaction
204 8 Radhika Chippada
API client: 0.129231689 Parse response
205 8 Radhika Chippada
API client: 0.000315959 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links    
206 8 Radhika Chippada
API client: 0.247526392 API transaction
207 8 Radhika Chippada
API client: 0.000215447 Parse response
208 8 Radhika Chippada
API client: 0.000176321 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links    
209 8 Radhika Chippada
API client: 0.09771474 API transaction
210 8 Radhika Chippada
API client: 0.00015503 Parse response
211 8 Radhika Chippada
API client: 0.000181164 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/links    
212 8 Radhika Chippada
API client: 0.088184468 API transaction
213 8 Radhika Chippada
API client: 0.00018323 Parse response
214 8 Radhika Chippada
Redirected to https://localhost:3031/collections/qr1hi-4zz18-cdwp2zwz1nj3uvh
215 8 Radhika Chippada
Completed 302 Found in 109373ms (ActiveRecord: 0.0ms)
216 8 Radhika Chippada
217 8 Radhika Chippada
218 8 Radhika Chippada
Started GET "/collections/qr1hi-4zz18-cdwp2zwz1nj3uvh" for 127.0.0.1 at 2015-05-11 22:16:44 -0400
219 8 Radhika Chippada
Processing by CollectionsController#show as HTML
220 8 Radhika Chippada
  Parameters: {"id"=>"qr1hi-4zz18-cdwp2zwz1nj3uvh"}
221 8 Radhika Chippada
API client: 0.000233106 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/users/current    
222 8 Radhika Chippada
API client: 0.069534939 API transaction
223 8 Radhika Chippada
API client: 0.000270891 Parse response
224 1 Radhika Chippada
API client: 0.00174422 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections/qr1hi-4zz18-cdwp2zwz1nj3uvh    
225 8 Radhika Chippada
API client: 17.157845279 API transaction
226 1 Radhika Chippada
API client: 0.13215165 Parse response
227 8 Radhika Chippada
...
228 9 Radhika Chippada
229 8 Radhika Chippada
</pre>
230 1 Radhika Chippada
231 8 Radhika Chippada
}}
232 1 Radhika Chippada
233 9 Radhika Chippada
h4. Proposed enhancements
234 8 Radhika Chippada
235 8 Radhika Chippada
* Offer an API server method that accepts the selections array (and optionally owner_uuid and name) and performs the creation of the new collection in the backend. Doing so can help as follows:
236 9 Radhika Chippada
** Using this API method, when combining entire collections: We can completely eliminate the need to fetch the manifest text for the collections in workbench. Also, workbench would no longer need to work through the combining logic and generate the manifest text for the new collection to be created. No need to do JSON decode and encode the manifest text. Lastly, it would not need to send this combined manifest text to the API server on the wire. Instead, the API server can do all these steps on the server and create the new collection and send the generated collection uuid to workbench (which will then reduce the performance issue down to collection#show issue; yay)
237 1 Radhika Chippada
** When combining selected files from within a collection: Here also, we can see significant performance improvements by eliminating need to generate the combined manifest text and sending it on wire.
238 9 Radhika Chippada
239 9 Radhika Chippada
* Avoid either one of (1) sending the new collection manifest_text after save from API server to workbench or (2) workbench retrieving it from API server during show. Doing both is wasteful and depending on the size of the new collection adds 10's of seconds of delay.
240 1 Radhika Chippada
241 1 Radhika Chippada
h3. 4. Implement caching using a framework such as Memcache
242 1 Radhika Chippada
243 1 Radhika Chippada
* One of the issues listed above (#5901) is around being able to access collection in multiple threads in parallel. Also, #5908 highlights several API requests being repeated within one page display. In fact, we have this issue in several areas of workbench implementation.
244 1 Radhika Chippada
245 1 Radhika Chippada
* By implementing caching, we will be able to reduce the need to make round trip API requests to fetch these objects. Instead, we can improve performance by fetching these objects from the shared cache. 
246 1 Radhika Chippada
247 1 Radhika Chippada
* Question: Not sure how caching would work if / when we cache these huge collections.
248 12 Peter Amstutz
249 12 Peter Amstutz
h3. 5. Implement collection update API
250 12 Peter Amstutz
251 12 Peter Amstutz
Use a better architecture for modifying collections on the API server that avoids transferring whole manifest text.  Will not affect performance of viewing collections.
252 12 Peter Amstutz
253 12 Peter Amstutz
[[Collection update API]]