Bug #5426
[Workbench] Large downloads through workbench fail (closed)
Description
Right around 1 GiB, this download fails (notice that it fails at two different positions, but the same position the last 2 times...)
--2015-03-10 09:03:52-- https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta
Reusing existing connection to workbench.qr1hi.arvadosapi.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: ‘lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta’
lobstr_v3.0.2_hg19_ [ <=> ] 1.01G 2.98MB/s in 6m 17s
Last-modified header missing -- time-stamps turned off.
2015-03-10 09:10:10 (2.74 MB/s) - Read error at byte 1083196276 (The request is invalid.).Retrying.
--2015-03-10 09:10:11-- (try: 2) https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta
Connecting to workbench.qr1hi.arvadosapi.com (workbench.qr1hi.arvadosapi.com)|54.88.31.97|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: ‘lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta’
lobstr_v3.0.2_hg19_ [ <=> ] 1.01G 3.05MB/s in 6m 34s
2015-03-10 09:16:47 (2.63 MB/s) - Read error at byte 1084238716 (The request is invalid.).Retrying.
--2015-03-10 09:16:49-- (try: 3) https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta
Connecting to workbench.qr1hi.arvadosapi.com (workbench.qr1hi.arvadosapi.com)|54.88.31.97|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: ‘lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta’
lobstr_v3.0.2_hg19_ [ <=> ] 1.01G 1.98MB/s in 7m 59s
2015-03-10 09:24:50 (2.16 MB/s) - Read error at byte 1084238716 (The request is invalid.).Retrying.
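To make "right around 1 GiB" concrete, the byte offsets from the three read errors can be converted directly. This is only a small arithmetic sketch; the constant and helper names are made up for illustration. The offsets come out at about 1.01 GiB (1.08 GB in decimal units), a few megabytes past the 1 GiB mark, and the second and third attempts stop at the identical byte.

```ruby
GIB = 2**30

# Byte offsets copied from the three "Read error at byte ..." lines in the log.
FAILURE_OFFSETS = [1_083_196_276, 1_084_238_716, 1_084_238_716].freeze

# Bytes transferred beyond the 1 GiB mark (negative if the transfer fell short of it).
def bytes_past_one_gib(offset)
  offset - GIB
end

FAILURE_OFFSETS.each_with_index do |offset, i|
  puts format("try %d: %d bytes = %.2f GB = %.4f GiB (%d bytes past 1 GiB)",
              i + 1, offset, offset / 1e9, offset.to_f / GIB,
              bytes_past_one_gib(offset))
end
```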
Updated by Tom Clegg over 9 years ago
- Subject changed from [Keep] Large downloads through workbench fail to [Workbench] Large downloads through workbench fail
- Category set to Workbench
- Could this be a proxy issue? (Try bypassing nginx and downloading from Workbench directly, from inside the firewall?)
- Anything in Workbench logs?
- Anything in nginx logs?
- Confirmed there's no problem retrieving the entire file with other tools?
Updated by Ward Vandewege over 9 years ago
- Category deleted (Workbench)
The bug appears to be in our code. Workbench does a fork (IO.popen) to call arv-get and streams the files. Nginx says in the logs:
2015/03/10 13:21:07 [error] 5544#0: *395704 upstream prematurely closed connection while reading upstream, client: 74.118.24.162, server: workbench.qr1hi.arvadosapi.com, request: "GET /collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta HTTP/1.1", upstream: "http://127.0.0.1:9000/collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/lobstr_v3.0.2_hg19_ref/lobSTR_ref.fasta", host: "workbench.qr1hi.arvadosapi.com", referrer: "https://workbench.qr1hi.arvadosapi.com/collections/download/qr1hi-4zz18-b1uuzkf11kg3huv/3yfrrbhnsh4t1qyr8catlfa5q8uy2m7wscuvdrm4d485hqgy9u/"
There is nothing in the nginx error log for the process running on port 9000.
So, it looks like the IO.popen dies or arv-get dies, without logging anything in the webserver logs. This happens reliably at sizes just over 1 GiB.
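One way to narrow down whether the child process is the part that dies would be to record its exit status and the number of bytes streamed. The following is a minimal sketch of that idea, not Workbench's actual code: `stream_command` is a hypothetical helper, and the real arv-get invocation and its arguments are not shown in this ticket.

```ruby
# Stream a child's stdout in fixed-size chunks and report how it exited.
# `cmd` is an argv array (no shell involved); the block receives each chunk.
def stream_command(cmd, chunk_size: 64 * 1024)
  bytes = 0
  IO.popen(cmd, "rb") do |io|
    # IO#read with a positive length returns nil at end of file.
    while (chunk = io.read(chunk_size))
      bytes += chunk.bytesize
      yield chunk
    end
  end
  # The block form of IO.popen closes the pipe, waits for the child, and sets $?.
  status = $?
  unless status.success?
    warn "child #{cmd.inspect} exited abnormally after #{bytes} bytes: #{status.inspect}"
  end
  [bytes, status]
end

# Example with a stand-in command:
# stream_command(["sh", "-c", "printf data; exit 1"]) { |chunk| $stdout.write(chunk) }
```

If the logged byte count lines up with the offset where the client sees the read error, the child process (or its pipe) becomes the main suspect rather than the proxy.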
Updated by Peter Amstutz over 9 years ago
- Target version changed from Bug Triage to 2015-04-01 sprint
Updated by Peter Amstutz over 9 years ago
- Status changed from New to In Progress
Updated by Ward Vandewege over 9 years ago
- Target version changed from 2015-04-01 sprint to 2015-04-29 sprint
Updated by Ward Vandewege over 9 years ago
- Status changed from In Progress to Resolved
Updated by Ward Vandewege over 9 years ago
- Status changed from Resolved to In Progress
- Target version changed from 2015-04-29 sprint to Bug Triage
I'm re-opening this bug.
The collection mentioned above downloads fine, now that we have proxy_buffering disabled. That's roughly 1.8 GiB.
However, this collection (qr1hi-4zz18-w0t3gbd4u8n5o9h) has a 16.4 GiB fasta file in it, and the download terminates after roughly 1 GiB when downloaded through the browser. With arv keep get on a shell node, we get all 16.4 GiB without issues.
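For reference, disabling proxy_buffering is normally a one-line nginx directive in the block that proxies to Workbench. A minimal sketch, assuming the upstream address seen in the nginx error log earlier in this ticket (127.0.0.1:9000); the actual qr1hi configuration is not shown here and will contain more than this fragment:

```nginx
server {
    server_name workbench.qr1hi.arvadosapi.com;

    location / {
        proxy_pass http://127.0.0.1:9000;
        # Pass the upstream response to the client as it arrives
        # instead of buffering it in nginx first.
        proxy_buffering off;
    }
}
```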
Updated by Radhika Chippada over 9 years ago
Actually, it appears that the original download listed in the ticket also does not download completely. It terminates after 1.08GB and not all of the 1.8GB is downloaded.
Updated by Brett Smith over 9 years ago
- Assigned To deleted (Peter Amstutz)
- Target version changed from Bug Triage to Deferred
I think we're very likely to deal with this via #5824.
Updated by Peter Amstutz over 7 years ago
- Status changed from In Progress to Resolved
Now using keep-web and planning to remove workbench download entirely.