Bug #22124
closedCannot upload data through keep-web (workbench)
Description
Since the #21773 merge to main
, our test-provision-*
Jenkins jobs started to fail (for example: test-provision-debian12: #109 ). I've tried deploying a test cluster with the latest dev packages and got the issue of not being able to upload files through keep-web
or keepproxy
, although was able to download data just fine either through keep-web
(using Workbench) or keepproxy
(using arv-get
from the outside)
Uploading data directly through keep (i.e.: using arv-put
from the shell node) works correctly, so it seems the issue is related to server components that act as keep clients.
Also, tried uploading a file to tordo
using Workbench, and got an error too.
While doing an upload through keep-web
, I captured both keep-web
and keepstore
logs:
Keep-web¶
Sep 12 18:47:13 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-z94iriqcntir1ivaiwry","level":"info","msg":"request","remoteAddr":"10.1.1.15:48972","reqBytes":0,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"OPTIONS","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","time":"2024-09-12T18:47:13.988937205Z"} Sep 12 18:47:13 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-z94iriqcntir1ivaiwry","level":"info","msg":"response","priority":1,"queue":"api","remoteAddr":"10.1.1.15:48972","reqBytes":0,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"OPTIONS","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","respBytes":0,"respStatus":"OK","respStatusCode":200,"time":"2024-09-12T18:47:13.989037619Z","timeToStatus":0.000095,"timeTotal":0.000095,"timeWriteBody":0.000000} Sep 12 18:47:14 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-178lb5z57die9p8gz5ti","level":"info","msg":"request","remoteAddr":"10.1.1.15:48982","reqBytes":10718439,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"PUT","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","time":"2024-09-12T18:47:14.161948403Z"} Sep 12 18:47:14 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-178lb5z57die9p8gz5ti","collection_file_path":"report.pdf","collection_uuid":"xarv4-4zz18-1npep8rzy4f784i","level":"info","msg":"File upload","remoteAddr":"10.1.1.15:48982","reqBytes":10718439,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"PUT","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","time":"2024-09-12T18:47:14.212947790Z","user_full_name":"Lucas Di Pentima","user_uuid":"xarv4-tpzed-3ma928fbhnnejs2"} Sep 12 18:47:43 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-178lb5z57die9p8gz5ti","error":"sync failed: Could not write sufficient replicas: [405] method not allowed","level":"error","msg":"update() returned *errors.errorString error, changing response to HTTP 500","remoteAddr":"10.1.1.15:48982","reqBytes":10718439,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"PUT","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","time":"2024-09-12T18:47:43.191583205Z"} Sep 12 18:47:43 workbench keep-web[15367]: {"ClusterID":"xarv4","PID":15367,"RequestID":"req-178lb5z57die9p8gz5ti","level":"info","msg":"response","priority":1,"queue":"api","remoteAddr":"10.1.1.15:48982","reqBytes":10718439,"reqForwardedFor":"123.123.123.123","reqHost":"download.xarv4.example.com","reqMethod":"PUT","reqPath":"c=xarv4-4zz18-1npep8rzy4f784i/report.pdf","reqQuery":"","respBody":"sync failed: Could not write sufficient replicas: [405] method not allowed\n","respBytes":75,"respStatus":"Internal Server Error","respStatusCode":500,"time":"2024-09-12T18:47:43.191700379Z","timeToStatus":29.029711,"timeTotal":29.029750,"timeWriteBody":0.000038}
Keepstore¶
Sep 12 18:47:43 keep0 keepstore[7895]: {"ClusterID":"xarv4","PID":7895,"RequestID":"req-19bpla8nn24vr1b8rj5g","level":"info","msg":"request","remoteAddr":"10.1.1.15:54916","reqBytes":10718439,"reqForwardedFor":"","reqHost":"10.1.2.13:25107","reqMethod":"PUT","reqPath":"/23e12771d170009f23c89b40481718f3","reqQuery":"","time":"2024-09-12T18:47:43.190142794Z"} Sep 12 18:47:43 keep0 keepstore[7895]: {"ClusterID":"xarv4","PID":7895,"RequestID":"req-19bpla8nn24vr1b8rj5g","level":"info","msg":"response","priority":0,"queue":"api","remoteAddr":"10.1.1.15:54916","reqBytes":10718439,"reqForwardedFor":"","reqHost":"10.1.2.13:25107","reqMethod":"PUT","reqPath":"/23e12771d170009f23c89b40481718f3","reqQuery":"","respBytes":0,"respStatus":"Moved Permanently","respStatusCode":301,"time":"2024-09-12T18:47:43.190257435Z","timeToStatus":0.000103,"timeTotal":0.000107,"timeWriteBody":0.000004} Sep 12 18:47:43 keep0 keepstore[7895]: {"ClusterID":"xarv4","PID":7895,"RequestID":"req-19bpla8nn24vr1b8rj5g","level":"info","msg":"request","remoteAddr":"10.1.1.15:54930","reqBytes":0,"reqForwardedFor":"","reqHost":"10.1.2.13:25107","reqMethod":"GET","reqPath":"23e12771d170009f23c89b40481718f3","reqQuery":"","time":"2024-09-12T18:47:43.191265893Z"} Sep 12 18:47:43 keep0 keepstore[7895]: {"ClusterID":"xarv4","PID":7895,"RequestID":"req-19bpla8nn24vr1b8rj5g","level":"info","msg":"response","priority":0,"queue":"api","remoteAddr":"10.1.1.15:54930","reqBytes":0,"reqForwardedFor":"","reqHost":"10.1.2.13:25107","reqMethod":"GET","reqPath":"23e12771d170009f23c89b40481718f3","reqQuery":"","respBody":"method not allowed\n","respBytes":19,"respStatus":"Method Not Allowed","respStatusCode":405,"time":"2024-09-12T18:47:43.191373814Z","timeToStatus":0.000093,"timeTotal":0.000099,"timeWriteBody":0.000006}
Notes¶
- As a first observation, I didn't found any
req-id
matches between keep-web and keepstore, that might be a sign of a bug, since req ids should be preserved. - Keepstore's response to the PUT request was a 301, not sure that's correct.
- Keep-web's subsequent GET request gets a 405 (method not allowed?) response from Keepstore.
Not sure why the changes made in #21822 would cause these behaviors.
Updated by Lucas Di Pentima 3 months ago
- Related to Feature #21773: keep-web and keepproxy should use Services.Keepstore.InternalURLs configs directly instead of requesting /arvados/v1/keep_services/accessible added
Updated by Tom Clegg 3 months ago
22124-keepstore-put-moved @ 482412d27c0b25bc8fbe57ade0327dbff7f2c840 -- developer-run-tests: #4446
Pretty sure this will fix it.
I'd like to add some more testing, and turn off the redirect-if-double-slash behavior for keepstore, but in the meantime we could merge this and confirm it fixes test-provision.
Updated by Lucas Di Pentima 3 months ago
LGTM, and agree it would be convenient to add more testing to catch this kind of issues earlier. Thanks!
Updated by Tom Clegg 3 months ago
- Trim trailing slashes in ARVADOS_KEEP_SERVICES env var.
- Reject requests with double slash, except GET //locator.
Longer explanation, from commit message:
22124: Reject requests with double slash, except GET //locator. Previously we responded to "PUT //locator" with a 301 to a cleaned path, which did not work because 301 tells clients to repeat the PUT request as a GET. Better to just respond 400, since this means the client is broken or misconfigured. There may be some clients that have been working successfully with the 301 redirect behavior because they only do GET and HEAD requests. To avoid breaking these, we continue to accept "GET //locator" -- but now we just return the data instead of a redirect to "GET /locator".
Updated by Tom Clegg 3 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|9ad1d7a437b536b920c1ea86d6ca3efd7158f250.