Project

General

Profile

Actions

Bug #3720

closed

[API] Fetching collection record is very slow when manifest is not small.

Added by Tom Clegg over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Story points:
0.5

Description

This makes nginx give up on Workbench and say "502 Bad Gateway".


Subtasks 1 (0 open1 closed)

Task #3806: Review 3720-big-manifest-performance-wipResolvedBrett Smith09/03/2014Actions

Related issues

Related to Arvados - Bug #3842: [Workbench] Manifests with more than one entry for a file display the file multiple times instead of concatinating themResolvedBrett SmithActions
Actions #1

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)
  • Category set to API
Actions #4

Updated by Tom Clegg over 9 years ago

  • Target version set to 2014-09-17 sprint
Actions #5

Updated by Brett Smith over 9 years ago

  • Assigned To set to Brett Smith
Actions #6

Updated by Brett Smith over 9 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Brett Smith over 9 years ago

This bug has both an API server component and a Workbench component. Here are the slow bits when I visit the Workbench page:

API client: 0.000437652 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/collections/d071779966740d0f4884c17c3712a6a6+8369520
API client: 35.640100583 API transaction
API client: 11.322907415 request_time
API client: 0.157921615 Parse response

API client: 0.000361355 Prepare request https://qr1hi.arvadosapi.com/arvados/v1/logs  {"object_uuid":"d071779966740d0f4884c17c3712a6a6+8369520"}
API client: 6.25677107 API transaction
API client: 2.137570119 request_time
API client: 0.052077863 Parse response

Rendered collections/_show_files.html.erb (113004.2ms)
Actions #8

Updated by Brett Smith over 9 years ago

The vast majority of the API response time is in as_api_object. Here's some rough benchmarking added to my API server for Collections#show:

Entered Collections#show: 2014-08-28 17:30:25 -0400
Signed manifests: 0.126431064
Rendered Collection: 5.789607971
Completed 200 OK in 7442.9ms (Views: 5663.0ms | ActiveRecord: 77.5ms)

This Collection has almost 150K files, and part of the problem here is that we're rendering them twice: once in manifest_text, and once in the files attribute. Some brief grepping suggests that Workbench is the only likely user of the files attribute (FIXME: do better grepping). After discussion with Tom, here's our plan:

  • Remove the files attribute from API server.
  • Move the files generation code from API server to Workbench. (Really, it should go in SDKs, but we don't have a Ruby SDK, so…)
  • Because rendering the file list in Workbench itself also takes too long, cap the view at 10K files.
Actions #9

Updated by Brett Smith over 9 years ago

3720-big-manifest-performance-wip is up for review to implement a fix for this. There have been some changes from the previous plan, so to help understanding:

Brett Smith wrote:

  • Remove the files attribute from API server.

data_size went away along with this, because it also requires parsing the manifest, which can cause performance trouble.

  • Move the files generation code from API server to Workbench. (Really, it should go in SDKs, but we don't have a Ruby SDK, so…)

It turns out that the API server still needs to be able to parse file information from manifests in some cases—e.g., to find Docker images. As a consequence, I (1) moved the Locator class from the API server to the Ruby SDK, (2) added a Manifest class to the Ruby SDK to handle regular parsing tasks, and (3) used that code from both the API server and Workbench as needed.

There are many places where Workbench looks at the files attribute. For now, I have added a method for backwards compatibility to keep that code functioning. As needed, we can update these callers to use Manifest methods directly for better performance. (The current Manifest methods all yield results as they go, so if you only want to look at, e.g., the first three files, you can do that without parsing the entire text.) For now, my changes are sufficient to fix this story: the API server response is much quicker (~1 second compared to ~6 before), and the Workbench response is smaller to avoid problems after it receives the response.

I have also prepared changes for Jenkins to let us update the Ruby SDK and Rails apps at the same time, so that we'll no longer have a two-step update process like we currently do with arvados-cli.

  • Because rendering the file list in Workbench itself also takes too long, cap the view at 10K files.

This is done with a simple take call where needed.

Actions #10

Updated by Peter Amstutz over 9 years ago

LGTM

Actions #11

Updated by Brett Smith over 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:41c7f2010ebdbb76fada25a21f184e2d1f4049b3.

Actions

Also available in: Atom PDF