Bug #7971
Python SDK Keep timeouts on su92l are too aggressive
Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -
Description
https://workbench.su92l.arvadosapi.com/pipeline_instances/su92l-d1hrv-gk11bugx763h0tl#Log
I tried downloading the files the jobs are expected to read - they downloaded fine.
I cancelled the job after the first 10 or so failures to avoid wasting compute time.
Updated by Nico César about 9 years ago
the actual issue is 'Operation timed out after 300000 milliseconds with 0 bytes received' on keep15... investigating
Updated by Nico César about 9 years ago
so the data is there, and the keepstore log says the transfer took 1257.834175s
keep15.su92l:/home/nico# grep 64be48c8afb7800c007856fb2ea1a6fb /etc/sv/keepstore/log/main/current
2015-12-08_18:59:19.92647 2015/12/08 18:59:19 [10.28.64.30:58798] GET 64be48c8afb7800c007856fb2ea1a6fb+8645645+A392c53cccd46d057477d854575ebb42ae6423815@5679989c 1257.834175s 200 8645645 "OK"
keep15.su92l:/home/nico# ls /data/su92l-keep-*/keep/64b/64be48c8afb7800c007856fb2ea1a6fb
/data/su92l-keep-4/keep/64b/64be48c8afb7800c007856fb2ea1a6fb
keep15.su92l:/home/nico# md5sum /data/su92l-keep-4/keep/64b/64be48c8afb7800c007856fb2ea1a6fb
64be48c8afb7800c007856fb2ea1a6fb /data/su92l-keep-4/keep/64b/64be48c8afb7800c007856fb2ea1a6fb
Updated by Nico César about 9 years ago
everything seems ok now
keep15.su92l:/home/nico# time md5sum /data/su92l-keep-4/keep/db7/db7850a4a0c42aaa354f41bcab05f7a8 /data/su92l-keep-4/keep/0e4/0e4c80cb8017e52812aed9dbad71a6d1 /data/su92l-keep-4/keep/66e/66ead77af94bacbd5b96365412d601dd
db7850a4a0c42aaa354f41bcab05f7a8 /data/su92l-keep-4/keep/db7/db7850a4a0c42aaa354f41bcab05f7a8
0e4c80cb8017e52812aed9dbad71a6d1 /data/su92l-keep-4/keep/0e4/0e4c80cb8017e52812aed9dbad71a6d1
66ead77af94bacbd5b96365412d601dd /data/su92l-keep-4/keep/66e/66ead77af94bacbd5b96365412d601dd

real 0m0.220s
user 0m0.176s
sys 0m0.040s
should we run some kind of analysis on the keepstore logs for requests that took longer than 300s?
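A rough sketch of what such a log analysis might look like, in Python. The line format is inferred only from the single keepstore log line quoted earlier in this thread, so the regular expression is an assumption and may not match other keepstore versions or log configurations. The names `LOG_LINE`, `SlowRequest` and `slow_requests` are hypothetical helpers, not part of the Arvados SDK.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Assumed layout, taken from the one sample line in this thread:
#   <logger stamp> <date> <time> [<client>] <METHOD> <locator> <secs>s <status> <bytes> "<msg>"
LOG_LINE = re.compile(
    r'^\S+ (?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) '
    r'\[(?P<client>[^\]]+)\] (?P<method>[A-Z]+) (?P<locator>\S+) '
    r'(?P<seconds>\d+(?:\.\d+)?)s (?P<status>\d{3}) (?P<nbytes>\d+) '
    r'"(?P<message>[^"]*)"'
)


class SlowRequest(NamedTuple):
    timestamp: str
    client: str
    method: str
    locator: str
    seconds: float
    status: int
    nbytes: int


def slow_requests(lines: Iterable[str],
                  threshold_seconds: float = 300.0) -> Iterator[SlowRequest]:
    """Yield requests whose logged duration exceeds threshold_seconds.

    Lines that do not match the assumed format are skipped silently.
    """
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue
        seconds = float(m.group("seconds"))
        if seconds > threshold_seconds:
            yield SlowRequest(
                timestamp=f'{m.group("date")} {m.group("time")}',
                client=m.group("client"),
                method=m.group("method"),
                locator=m.group("locator"),
                seconds=seconds,
                status=int(m.group("status")),
                nbytes=int(m.group("nbytes")),
            )


# The log line quoted above in this thread, used as sample input.
sample = (
    '2015-12-08_18:59:19.92647 2015/12/08 18:59:19 [10.28.64.30:58798] GET '
    '64be48c8afb7800c007856fb2ea1a6fb+8645645+'
    'A392c53cccd46d057477d854575ebb42ae6423815@5679989c '
    '1257.834175s 200 8645645 "OK"'
)

for req in slow_requests([sample]):
    # Bytes per second gives a quick feel for how slow the transfer was.
    rate = req.nbytes / req.seconds
    print(f"{req.timestamp} {req.client} {req.method} {req.locator[:32]} "
          f"took {req.seconds:.1f}s ({rate:.0f} bytes/s)")
```

To scan a real log, the same function could be fed an open file object, for example `slow_requests(open(path))` with `path` pointing at the keepstore log. The 300s default mirrors the 300000 millisecond client timeout seen in the error message.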
Updated by Ward Vandewege about 9 years ago
- Subject changed from "Python SDK on su92l raises KeepReadError: failed to read [...] service [...] responded with 404 HTTP/1.1 404 Not Found" to "Python SDK Keep timeouts on su92l are too aggressive"