Bug #16392

Updated by Ward Vandewege over 1 year ago

Our dev cluster, `ce8i5`, had this in its `config.yml` file:

<pre>
Services:
  Keepstore:
    InternalURLs:
      "http://10.47.0.6:25107/": {}
</pre>

Since feature #16328, keepproxy uses those entries instead of trying to discover keepstores via the API server.

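It turns out that the trailing slash in the `InternalURLs` entry causes problems. This is what happens on the keepstore in question: the PUT is answered with a 301, and a GET with the same request ID (and no leading slash in `reqPath`) then gets a 403: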

<pre>
May 01 01:02:54 keep0.ce8i5.arvadosapi.com keepstore[42411]: {"PID":42411,"RequestID":"req-ck4ozsdn8womgzwr5fva","level":"info","msg":"request","remoteAddr":"10.47.0.5:39818","reqBytes":24,"reqForwardedFor":"","reqHost":"10.47.0.6:25107","reqMethod":"PUT","reqPath":"/2ad1390f7b534cf6b25765606796626a","reqQuery":"","time":"2020-05-01T01:02:54.680169064Z"}
May 01 01:02:54 keep0.ce8i5.arvadosapi.com keepstore[42411]: {"PID":42411,"RequestID":"req-ck4ozsdn8womgzwr5fva","level":"info","msg":"response","remoteAddr":"10.47.0.5:39818","reqBytes":24,"reqForwardedFor":"","reqHost":"10.47.0.6:25107","reqMethod":"PUT","reqPath":"/2ad1390f7b534cf6b25765606796626a","reqQuery":"","respBytes":0,"respStatus":"Moved Permanently","respStatusCode":301,"time":"2020-05-01T01:02:54.680293465Z","timeToStatus":0.000129,"timeTotal":0.000132,"timeWriteBody":0.000002}
May 01 01:02:54 keep0.ce8i5.arvadosapi.com keepstore[42411]: {"PID":42411,"RequestID":"req-ck4ozsdn8womgzwr5fva","level":"info","msg":"request","remoteAddr":"10.47.0.5:39818","reqBytes":0,"reqForwardedFor":"","reqHost":"10.47.0.6:25107","reqMethod":"GET","reqPath":"2ad1390f7b534cf6b25765606796626a","reqQuery":"","time":"2020-05-01T01:02:54.687455786Z"}
May 01 01:02:54 keep0.ce8i5.arvadosapi.com keepstore[42411]: {"PID":42411,"RequestID":"req-ck4ozsdn8womgzwr5fva","level":"info","msg":"response","remoteAddr":"10.47.0.5:39818","reqBytes":0,"reqForwardedFor":"","reqHost":"10.47.0.6:25107","reqMethod":"GET","reqPath":"2ad1390f7b534cf6b25765606796626a","reqQuery":"","respBody":"Forbidden\n","respBytes":10,"respStatus":"Forbidden","respStatusCode":403,"time":"2020-05-01T01:02:54.687560487Z","timeToStatus":0.000094,"timeTotal":0.000100,"timeWriteBody":0.000006}
</pre>

And of course keepproxy then relays the familiar but not very informative error to the client:

<pre>
Traceback (most recent call last):
  File "/usr/bin/arv-put", line 7, in <module>
    main()
  File "/usr/share/python2.7/dist/python-arvados-python-client/lib/python2.7/site-packages/arvados/commands/put.py", line 1304, in main
    writer.start(save_collection=not(args.stream or args.raw))
  File "/usr/share/python2.7/dist/python-arvados-python-client/lib/python2.7/site-packages/arvados/commands/put.py", line 628, in start
    self._local_collection.manifest_text()
  File "/usr/share/python2.7/dist/python-arvados-python-client/lib/python2.7/site-packages/arvados/arvfile.py", line 270, in synchronized_wrapper
    return orig_func(self, *args, **kwargs)
  File "/usr/share/python2.7/dist/python-arvados-python-client/lib/python2.7/site-packages/arvados/collection.py", line 1014, in manifest_text
    self._my_block_manager().commit_all()
  File "/usr/share/python2.7/dist/python-arvados-python-client/lib/python2.7/site-packages/arvados/arvfile.py", line 816, in commit_all
    raise KeepWriteError("Error writing some blocks", err, label="block")
arvados.errors.KeepWriteError: Error writing some blocks: block 2ad1390f7b534cf6b25765606796626a+24 raised KeepWriteError (failed to write 2ad1390f7b534cf6b25765606796626a after 2 attempts (wanted 2 copies but wrote 0): service https://keep.ce8i5.arvadosapi.com:443/ responded with 413 HTTP/1.1 100 Continue
HTTP/1.1 413 Request Entity Too Large)
</pre>

Removing the trailing slash from the InternalURLs entry resolved the problem.
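
Until the services tolerate both forms, a quick check of an existing config can catch this. The following is only a minimal sketch: it assumes the `Services` section of `config.yml` has already been parsed into a Python dict (for example with a YAML parser), and `find_trailing_slash_urls` is a hypothetical helper, not something Arvados ships.

```python
def find_trailing_slash_urls(services):
    """Return (service, url) pairs whose InternalURLs key ends with "/".

    `services` is assumed to be the already-parsed `Services` mapping,
    e.g. {"Keepstore": {"InternalURLs": {url: {}}}}.
    """
    offenders = []
    for name, svc in sorted(services.items()):
        # Tolerate services with no InternalURLs section at all.
        for url in (svc or {}).get("InternalURLs") or {}:
            if url.endswith("/"):
                offenders.append((name, url))
    return offenders


# The entry from the dev cluster described in this report.
services = {
    "Keepstore": {
        "InternalURLs": {
            "http://10.47.0.6:25107/": {},
        },
    },
}

for name, url in find_trailing_slash_urls(services):
    print("%s: %s has a trailing slash" % (name, url))
```

The helper only reports entries; whether a trailing slash is harmful for services other than keepstore is not something this report establishes.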

It would probably be most user-friendly if keepproxy (or our config parsing library?) could handle entries both with and without a trailing slash. We've named the field `InternalURLs`, so we had this one coming.
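
One way to accept both forms is to normalize when the request URL is built. The sketch below illustrates the idea in Python; it assumes the request URL is formed by joining the configured base URL and the block locator. `join_service_url` is a hypothetical name, and this is not how keepproxy itself is implemented.

```python
def join_service_url(base_url, locator):
    """Join a service base URL and a block locator with exactly one "/"."""
    # Strip any slashes at the seam, then add a single separator back.
    return base_url.rstrip("/") + "/" + locator.lstrip("/")


locator = "2ad1390f7b534cf6b25765606796626a"

# Both spellings of the config entry produce the same request URL.
with_slash = join_service_url("http://10.47.0.6:25107/", locator)
without_slash = join_service_url("http://10.47.0.6:25107", locator)

assert with_slash == without_slash
print(with_slash)  # http://10.47.0.6:25107/2ad1390f7b534cf6b25765606796626a
```

Normalizing at the point of use leaves the config file as the operator wrote it. Doing the same thing once in the config loader would cover every service that reads `InternalURLs`.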
