Story #7824

Updated by Brett Smith almost 6 years ago

h2. Original report

In 1.5 hours, only 8 MiB of a 55 MiB file was downloaded using the command: @arv keep get 215dd32873bfa002aa0387c6794e4b2c+54081534/tile.csv .@

Running top on the computer that was running the "arv keep get" command showed:
<pre>
top - 19:47:07 up 2 days, 9:09, 8 users, load average: 1.12, 1.26, 1.32
Tasks: 223 total, 3 running, 217 sleeping, 0 stopped, 3 zombie
%Cpu(s): 43.5 us, 8.7 sy, 0.0 ni, 47.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 15535256 total, 12281116 used, 3254140 free, 1069760 buffers
KiB Swap: 15929340 total, 221892 used, 15707448 free. 5467732 cached Mem

14366 sguthrie 20 0 2498672 2.173g 7204 R 100.0 14.7 98:02.16 arv-get
</pre>

Downloading from this collection through Workbench times out before the user is asked where to save the file.

Story #7729 requires multiple downloads from this qr1hi collection (qr1hi-4zz18-wuld8y0z7qluw00) and from others with similarly large manifests. To unblock #7729, I need one of the following:
* A recipe that allows a user to alter the manifest to be well behaved
* Faster downloads from collections with very large manifests

Update by Ward:

I investigated a bit while this was ongoing. There was no discernible extra load on keepproxy, the API server, or Postgres while Sally's download was running. But when I ran the command locally, after a while I saw arv-get suck up 100% CPU (one core) and peak at 3 GiB of RAM (resident!) until I killed it.

h2. Fix

Update arv-get to fetch files from collections using the Python file API, which is better optimized in the SDK than the old CollectionReader API. See the code in note 3 (#7824-3) for the basic gist.
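
For illustration only, here is a minimal sketch of what a file-API download could look like. It is not the code from note 3, which is not reproduced here. It assumes the Python SDK's @arvados.collection.CollectionReader@ exposes an @open(path, mode)@ method that returns a file-like reader; @copy_in_chunks@, @download_file@ and the 1 MiB chunk size are hypothetical names and values chosen for this example.

```python
import io

CHUNK_SIZE = 1 << 20  # 1 MiB per read; an arbitrary value chosen for this sketch


def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Copy a binary file-like object to another in fixed-size chunks.

    Only one chunk is held in memory at a time. Returns the number of
    bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download_file(collection_id, path, dest_path):
    """Hypothetical helper: fetch one file from a collection via the file API.

    Assumes the Arvados Python SDK is installed and API credentials are
    configured in the environment.
    """
    import arvados.collection  # imported lazily so the copy helper has no SDK dependency

    reader = arvados.collection.CollectionReader(collection_id)
    # Assumption: open() returns a readable object usable as a context manager.
    with reader.open(path, "rb") as src, open(dest_path, "wb") as dst:
        return copy_in_chunks(src, dst)
```

With credentials configured, something like @download_file("qr1hi-4zz18-wuld8y0z7qluw00", "tile.csv", "./tile.csv")@ would be the rough equivalent of the command in the original report. Whether this avoids the CPU and memory growth seen above depends on how the SDK handles the manifest internally, which this sketch does not address.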