Story #6092

[TBD] Improve the performance of the worst-performing collections component

Added by Brett Smith over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assigned To: Radhika Chippada
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: 2.0

Related issues

Related to Arvados - Story #6203: [API] [Performance] Optimize time spent on the API server side during a large collection creation. Resolved 05/25/2015 05/25/2015

Associated revisions

Revision 8b658d96
Added by Tom Clegg over 4 years ago

Merge branch '6087-collection-timing' (early part) refs #6087 refs #6092

History

#1 Updated by Brett Smith over 4 years ago

  • Tracker changed from Bug to Story

This can't be done until we have results from #6061.

#2 Updated by Radhika Chippada over 4 years ago

  • Assigned To set to Radhika Chippada

#3 Updated by Radhika Chippada over 4 years ago

Based on extensive profiling of collection performance, these appear to be some of the biggest areas for potential performance improvement:

  • Performance improvements in collections/_show_files page: #6050
  • Performance improvements in collection#show page source_summary section: #6042
  • Fetching the same collection repeatedly
    • #6041
    • In a combine-collections operation, the collections being combined are fetched again in the combine_selected_files_into_collection action. After combining, the server returns the new collection; however, the redirect fetches the newly created collection yet again during the show operation
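
To illustrate the repeated-fetch item above, here is a minimal sketch of a per-request memo that would avoid fetching the same collection twice and would let the collection returned by a combine be reused by the following show. This is not Workbench code: `CollectionCache`, `fetch`, and `fetch_count` are hypothetical names, and a collection is assumed to be a plain dict with a `uuid` key.

```python
from typing import Callable, Dict


class CollectionCache:
    """Per-request memo of collections keyed by UUID (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch            # assumed function that makes the real API call
        self._cache: Dict[str, dict] = {}
        self.fetch_count = 0           # counts real fetches, for illustration only

    def get(self, uuid: str) -> dict:
        """Return the collection, calling the API only on the first request."""
        if uuid not in self._cache:
            self.fetch_count += 1
            self._cache[uuid] = self._fetch(uuid)
        return self._cache[uuid]

    def put(self, collection: dict) -> None:
        """Store a collection the server just returned, e.g. after a combine."""
        self._cache[collection["uuid"]] = collection
```

With something like this, the combine action could `put` the new collection it gets back, and a later `get` for the same UUID would not trigger another fetch. Whether such a cache can survive the redirect depends on how the request is handled, which this sketch does not address.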

#4 Updated by Brett Smith over 4 years ago

Radhika Chippada wrote:

Based on extensive profiling of collection performance, these appear to be some of the biggest areas for potential performance improvement:

I would point out that we haven't done any performance testing on the Python SDK or FUSE yet, both of which are common ways for jobs to access and manipulate collections. I am reluctant to prioritize any changes to Workbench until we have some numbers for those. In general, I would rather prioritize changes that improve job performance and stability over those that don't.

I agree that improving the performance of access token parsing and manipulation seems like a great place to start. It affects literally every operation on collections, and will improve performance for all components.

I will move #6203 to this sprint and assign it to you, as the instantiation of this ticket. I will leave this ticket open for now. If you finish #6203 and your other tickets on this sprint promptly, let's revisit to see if there's another good change to make in the time remaining.

Thanks.

#5 Updated by Radhika Chippada over 4 years ago

  • Status changed from New to In Progress

#6 Updated by Radhika Chippada over 4 years ago

  • Status changed from In Progress to Resolved

Implemented API performance improvements in #6203.
