Bug #22005
controller integration test fails with s3cmd 2.3.0+
Description
This looks like an incompatibility with a new version of s3cmd:
integration_test.go:371:
    c.Check(string(buf), check.Matches, `(?ms).*`+flen+` (bytes in|of `+flen+`).*`)
... value string = "" +
... "\n" +
... "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n" +
... " An unexpected error has occurred.\n" +
... " Please try reproducing the error using\n" +
... " the latest s3cmd code from the git master\n" +
... " branch found at:\n" +
... " https://github.com/s3tools/s3cmd\n" +
... " and have a look at the known issues list:\n" +
... " https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions-(FAQ)\n" +
... " If the error persists, please report the\n" +
... " following lines (removing any private\n" +
... " info as necessary) to:\n" +
... " s3tools-bugs@lists.sourceforge.net\n" +
... "\n" +
... "\n" +
... "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n" +
... "\n" +
... "Invoked as: /usr/bin/s3cmd --ssl --no-check-certificate --host=127.0.0.33:45895 --host-bucket=127.0.0.33:45895 --access_key=v2_z1111-gj3su-muap79xge40ewgb_2q871k3h9o1ra5v2ptka7i3whummfkyl7ljng491z9w3prej07 --secret_key=v2_z1111-gj3su-muap79xge40ewgb_2q871k3h9o1ra5v2ptka7i3whummfkyl7ljng491z9w3prej07 get s3://z3333-4zz18-xnc1q98h6upyzg5/test.txt /tmp/check-1037300014/5/tmpfile\n" +
... "Problem: <class 'KeyError: 'etag'\n" +
... "S3cmd: 2.3.0\n" +
... "python: 3.11.2 (main, May 2 2024, 11:59:08) [GCC 12.2.0]\n" +
... "environment LANG=en_US.UTF-8\n" +
... "\n" +
... "Traceback (most recent call last):\n" +
... " File \"/usr/bin/s3cmd\", line 3286, in <module>\n" +
... " rc = main()\n" +
... " ^^^^^^\n" +
... " File \"/usr/bin/s3cmd\", line 3183, in main\n" +
... " rc = cmd_func(args)\n" +
... " ^^^^^^^^^^^^^^\n" +
... " File \"/usr/bin/s3cmd\", line 538, in cmd_object_get\n" +
... " remote_list, exclude_list, remote_total_size = fetch_remote_list(\n" +
... " ^^^^^^^^^^^^^^^^^^\n" +
... " File \"/usr/lib/python3/dist-packages/S3/FileLists.py\", line 508, in fetch_remote_list\n" +
... " _get_remote_attribs(uri, remote_item)\n" +
... " File \"/usr/lib/python3/dist-packages/S3/FileLists.py\", line 379, in _get_remote_attribs\n" +
... " 'md5': response['headers']['etag'].strip('\"\\''),\n" +
... " ~~~~~~~~~~~~~~~~~~~^^^^^^^^\n" +
... "KeyError: 'etag'\n" +
... "\n" +
... "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n" +
... " An unexpected error has occurred.\n" +
... " Please try reproducing the error using\n" +
... " the latest s3cmd code from the git master\n" +
... " branch found at:\n" +
... " https://github.com/s3tools/s3cmd\n" +
... " and have a look at the known issues list:\n" +
... " https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions-(FAQ)\n" +
... " If the error persists, please report the\n" +
... " above lines (removing any private\n" +
... " info as necessary) to:\n" +
... " s3tools-bugs@lists.sourceforge.net\n" +
... "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n" +
... "\n"
... regex string = "(?ms).*41 (bytes in|of 41).*"

FAIL: integration_test.go:302: IntegrationSuite.TestS3WithFederatedToken
Updated by Tom Clegg 4 months ago
Checked the diff between s3cmd v2.1.0 and v2.3.0 and found this commit requiring the server to provide an Etag header.
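For illustration, the traceback in the description shows s3cmd indexing response['headers']['etag'] directly, so any response without that header ends in a KeyError. The following is a minimal Python sketch of that failure mode, assuming the headers arrive as a plain dict with lowercased names. The function name remote_attribs and the sample ETag value are hypothetical stand-ins, not s3cmd's actual code; only the lookup expression is taken from the traceback.

```python
def remote_attribs(response: dict) -> dict:
    """Hypothetical stand-in for the header lookup shown in the traceback."""
    headers = response["headers"]
    # Same expression as in the traceback: index 'etag' unconditionally,
    # then strip surrounding double or single quotes.
    return {"md5": headers["etag"].strip('"\'')}


# A response that omits the ETag header, as in the failing test run.
without_etag = {"headers": {"content-length": "41"}}

# A response that includes a quoted ETag (placeholder value, an assumption).
with_etag = {
    "headers": {
        "content-length": "41",
        "etag": '"0123456789abcdef0123456789abcdef"',
    }
}

try:
    remote_attribs(without_etag)
except KeyError as exc:
    print("lookup failed, missing header:", exc)

print(remote_attribs(with_etag))
```

A client written this way cannot recover from a missing header on its own, which is why the fix has to come from either the server or the s3cmd version in use.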
Updated by Brett Smith 4 months ago
- Subject changed from controller integration test fails on debian 12 to controller integration test fails with s3cmd 2.3.0+
If this is really just about the version of s3cmd, then it makes sense, and it's fine that we found out from a new Debian release, but the Debian version isn't really the important part. Developers today can work around this by installing s3cmd 2.1.0 in a virtualenv. Developers on Debian 11 could trip over this by installing s3cmd 2.3.0+ in a virtualenv.
Updated by Tom Clegg 4 months ago
I see this as an s3 compatibility issue as well as a testing issue: it's reasonable for s3 clients to assume the server sets an Etag header, and it's reasonable for users to expect those clients (including the version of s3cmd that ships with their distro) to work with Arvados. So, if it's not too big a deal, I think we should update Arvados rather than pin tests to an old version of s3cmd.
Confirmed the new version of s3cmd works if we return the collection PDH in the Etag response header for s3 file downloads.
22005-s3-etag @ 3aae495dd2053818e2ea916b520484d0246ed747 -- developer-run-tests: #4353
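As a rough illustration of the approach described above (returning the collection PDH in the ETag header of a download response), here is a small Python sketch using a WSGI app called in-process, with no network involved. This is not the keep-web change itself, which lives in the Arvados Go code. The names make_download_app, call_app, and PLACEHOLDER_PDH are hypothetical, the PDH value is a placeholder, and wrapping the value in double quotes is an assumption based on the usual HTTP ETag convention.

```python
from wsgiref.util import setup_testing_defaults

# Placeholder portable data hash (an assumption, for illustration only).
PLACEHOLDER_PDH = "d41d8cd98f00b204e9800998ecf8427e+0"


def make_download_app(body: bytes, pdh: str):
    """Build a WSGI app that serves `body` and sets ETag to the quoted PDH."""
    def app(environ, start_response):
        headers = [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(body))),
            ("ETag", f'"{pdh}"'),
        ]
        start_response("200 OK", headers)
        return [body]
    return app


def call_app(app, path="/test.txt"):
    """Invoke the app in-process and return (status, lowercased headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = {name.lower(): value for name, value in headers}

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(make_download_app(b"hello", PLACEHOLDER_PDH))
# The same lookup that failed in the traceback now finds a value.
print(status, headers["etag"].strip('"\''), len(body))
```

With the header present, the unconditional etag lookup from the traceback succeeds, and stripping the quotes yields the PDH string.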
Updated by Brett Smith 4 months ago
Tom Clegg wrote in #note-4:
22005-s3-etag @ 3aae495dd2053818e2ea916b520484d0246ed747 -- developer-run-tests: #4353
Makes sense and LGTM, thanks. Tested by specifically installing latest s3cmd in the test virtualenv:
% ~/.cache/arvados-test/VENV3DIR/bin/pip install "s3cmd~=2.3"
[…]
Successfully installed python-dateutil-2.9.0.post0 python-magic-0.4.27 s3cmd-2.4.0
% arvtest services/keep-web
[…]
PATH is /home/brett/.cache/arvados-test/VENV3DIR/bin:/home/brett/.cache/arvados-test/GEMHOME/.local/share/gem/ruby/3.2.0/bin:/home/brett/.local/lib/rubygems/bin:/home/brett/.cache/arvados-test/GOPATH/bin:/var/lib/arvados/bin:/usr/bin:/usr/sbin:/home/brett/.local/bin:/usr/local/bin:/usr/local/sbin
======= install env
[…]
======= test services/keep-web
ok git.arvados.org/arvados.git/services/keep-web 74.150s coverage: 88.2% of statements
======= test services/keep-web -- 75s
[…]
Pass: services/keep-web tests (75s)
All test suites passed.
Leaving behind temp dirs in /home/brett/.cache/arvados-test
Updated by Tom Clegg 4 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|f0011acaa2a70d919b36a2cc1a37cd2c605c23e2.