Feature #17217
Closed: [controller] move blob signature calculation from api to controller
Description
Starting a lot of nodes that all read from the same large collection overloads the API server, with tons of Ruby processes fighting over CPU. This happened on su92l today. It is possible that the blob signature calculation is the culprit. It would not be hard to move that code to controller.
Should also add a test case that puts different kinds of large manifests (lots of blocks, lots of files, lots of lines) through both controller+railsapi.
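A rough sketch of how the three manifest shapes mentioned above could be generated for such a test. The helper names (`fakeLocator`, `manyBlocks`, `manyFiles`, `manyLines`) are hypothetical and not taken from the Arvados source; the only thing assumed from Arvados is the general manifest layout (stream name, block locators of the form `md5+size`, then `position:size:filename` tokens, one stream per line).

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strings"
)

// fakeLocator returns an unsigned locator "<md5 hex>+<size>" for a
// made-up block. The hash is derived from seed, not from real data.
func fakeLocator(seed, size int) string {
	return fmt.Sprintf("%x+%d", md5.Sum([]byte(fmt.Sprint(seed))), size)
}

// manyBlocks builds one stream containing a single file that spans n blocks.
func manyBlocks(n, blockSize int) string {
	var b strings.Builder
	b.WriteString(".")
	for i := 0; i < n; i++ {
		b.WriteString(" " + fakeLocator(i, blockSize))
	}
	fmt.Fprintf(&b, " 0:%d:big.dat\n", n*blockSize)
	return b.String()
}

// manyFiles builds one stream with a single n-byte block and n one-byte files.
func manyFiles(n int) string {
	var b strings.Builder
	b.WriteString(". " + fakeLocator(0, n))
	for i := 0; i < n; i++ {
		fmt.Fprintf(&b, " %d:1:file%d.txt", i, i)
	}
	b.WriteString("\n")
	return b.String()
}

// manyLines builds n streams (manifest lines), each with one block and one file.
func manyLines(n int) string {
	var b strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprintf(&b, "./dir%d %s 0:1:file.txt\n", i, fakeLocator(i, 1))
	}
	return b.String()
}

func main() {
	// Print only the sizes; the manifests themselves are large.
	fmt.Printf("many blocks: %d bytes\n", len(manyBlocks(10000, 1<<26)))
	fmt.Printf("many files:  %d bytes\n", len(manyFiles(10000)))
	fmt.Printf("many lines:  %d bytes\n", len(manyLines(10000)))
}
```

Each generated manifest could then be sent through controller and RailsAPI and the round-trip result compared against the input.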
Updated by Tom Clegg about 4 years ago
Suspected example: failed container request su92l-xvhdp-ro44bqknjjbkpdv, log su92l-4zz18-s0iat32namtr51h
2020-12-17T16:02:29.440839471Z /mnt/su92l-4zz18-9ghkf2xpro0ff9q/NA19732.haplotypeCalls.er.raw.vcf.gz phase 1: request failed: https://su92l.arvadosapi.com:443/arvados/v1/collections/su92l-4zz18-9ghkf2xpro0ff9q?select=%5B%22uuid%22%2C%22manifest_text%22%5D: 503 Service Unavailable: request failed: http://localhost:8000/arvados/v1/collections/su92l-4zz18-9ghkf2xpro0ff9q?select=%5B%22uuid%22%2C%22manifest_text%22%5D: 503 Service Unavailable
Updated by Tom Clegg almost 4 years ago
- Target version set to To Be Groomed
- Subject changed from [controller] move blog signature calculation from api to controller to [controller] move blob signature calculation from api to controller
Updated by Tom Clegg over 3 years ago
- Blocked by Feature #17531: [controller] Remove ForceLegacyAPI14 config flag added
Updated by Tom Clegg over 3 years ago
- Assigned To set to Tom Clegg
- Status changed from New to In Progress
Updated by Tom Clegg over 3 years ago
some work in progress on 17217-collection-signatures
Updated by Peter Amstutz over 3 years ago
- Target version deleted (To Be Groomed)
Updated by Tom Clegg over 3 years ago
- RailsAPI returns unsigned manifests (signing code still exists only for testing purposes)
- Controller signs manifests returned by create/update/get/list calls
- As before, manifests returned by groups#contents are not signed
- Removed a bunch of unused code from controller's "old code path" (not strictly necessary, but it made it easier to confirm that the old code path wasn't actually used and didn't need to be updated)
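For readers unfamiliar with manifest signing, here is a minimal sketch of what "signing a manifest" amounts to: walking the manifest text and appending a permission hint to each block locator. This is an illustration only, not the code from the branch. The function names, the exact HMAC input layout, and the `+A<signature>@<expiry>` hint format used here are assumptions for the sake of the example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"fmt"
	"regexp"
	"strings"
	"time"
)

// locatorRe matches a plain unsigned locator: 32 hex digits, "+", size.
var locatorRe = regexp.MustCompile(`^[0-9a-f]{32}\+\d+$`)

// signLocator appends "+A<hex hmac>@<hex expiry>" to an unsigned locator.
// The HMAC input (hash@token@expiry) is an assumed layout for illustration.
func signLocator(locator, token string, expiry int64, secret []byte) string {
	hash := locator[:32]
	exp := fmt.Sprintf("%x", expiry)
	mac := hmac.New(sha1.New, secret)
	mac.Write([]byte(hash + "@" + token + "@" + exp))
	return fmt.Sprintf("%s+A%x@%s", locator, mac.Sum(nil), exp)
}

// signManifest signs every unsigned locator in a manifest and leaves
// stream names and file tokens untouched.
func signManifest(manifest, token string, expiry int64, secret []byte) string {
	lines := strings.Split(manifest, "\n")
	for i, line := range lines {
		tokens := strings.Split(line, " ")
		for j, t := range tokens {
			if locatorRe.MatchString(t) {
				tokens[j] = signLocator(t, token, expiry, secret)
			}
		}
		lines[i] = strings.Join(tokens, " ")
	}
	return strings.Join(lines, "\n")
}

func main() {
	// Hypothetical token and secret; the expiry is a Unix timestamp.
	expiry := time.Now().Add(2 * time.Hour).Unix()
	manifest := ". d41d8cd98f00b204e9800998ecf8427e+0 0:0:empty.txt\n"
	fmt.Print(signManifest(manifest, "example-token", expiry, []byte("example-secret")))
}
```

The work is one HMAC per block locator, so the cost grows with the number of blocks in the manifest, which is why doing it in controller rather than in the Ruby processes is attractive for large collections.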
Updated by Lucas Di Pentima over 3 years ago
It would be interesting to have a performance comparison, but I guess it isn't trivial to do if the issue only shows up under production-level workloads.
LGTM, thanks!
Updated by Tom Clegg over 3 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|d8e3a67d508e9a5f5c01884259c0e75a140f64e9.