Idea #14870
closed
[API] Access logs from previous attempts after auto-retrying a container request
Description
Preserve all relevant logs in the container request's log collection, even if they span multiple containers.
Instead of just replacing the CR's entire log collection when the container's log is updated:
- Copy the container's log files into a "container ${uuid}" subdir in the container request's log collection.
- Leave any existing "container ${uuid}" subdirs alone.
- Also put a copy of the latest container's logs in the root dir of the container request's log collection. This way, existing scripts continue to work on new logs.
(Aside: This also helps in the case where the container record itself is really what's wanted, since that is included in the container's log collection. There are currently some exceptions -- e.g., a log collection isn't created at all when a container doesn't fit any instance type -- but those could be fixed.)
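The proposed update rule can be sketched roughly as follows. This is only an illustration, not the API server's code: a log collection is modeled as a flat dict of path to content, and `merge_container_logs` is a hypothetical helper name. It also assumes the subdir for the container currently being updated may be refreshed, while subdirs for other containers are left alone.

```python
from typing import Dict

# Assumption: a log collection is modeled as a flat mapping of path -> content.
LogCollection = Dict[str, str]


def is_attempt_path(path: str) -> bool:
    """True if the path lives inside a "container <uuid>/" subdir."""
    return path.startswith("container ") and "/" in path


def merge_container_logs(cr_logs: LogCollection,
                         container_uuid: str,
                         container_logs: LogCollection) -> LogCollection:
    """Return an updated copy of the container request's log collection."""
    subdir = f"container {container_uuid}/"
    merged: LogCollection = {}

    # Keep the subdirs that belong to other containers untouched.
    # Old root-level files are dropped; the root is rebuilt below.
    for path, content in cr_logs.items():
        if is_attempt_path(path) and not path.startswith(subdir):
            merged[path] = content

    for path, content in container_logs.items():
        # Copy of this container's logs in its own subdir.
        merged[subdir + path] = content
        # Copy of the latest logs in the root, so existing scripts keep working.
        merged[path] = content

    return merged
```

For example, after a first attempt `aaa` and a retry `bbb` (placeholder UUIDs), the root holds the logs from `bbb` while both "container aaa/" and "container bbb/" subdirs are present.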
Related issues
Updated by Tom Clegg almost 6 years ago
- Related to Feature #14706: [Crunch2] Retain references + permissions to earlier containers when retrying a container request added
Updated by Tom Morris almost 6 years ago
- Target version changed from To Be Groomed to 2019-02-27 Sprint
- Story points set to 2.0
Updated by Tom Morris over 5 years ago
- Target version changed from 2019-02-27 Sprint to Arvados Future Sprints
Updated by Tom Morris over 5 years ago
- Target version changed from Arvados Future Sprints to 2019-03-13 Sprint
Updated by Peter Amstutz over 5 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz over 5 years ago
14870-ruby-sdk-cp-r @ 338ab239adbc259d5cd070158b4e571925b9f81b
The gist is that the Ruby SDK seems to have a long-standing bug where you can't copy into "." of an empty collection. That case follows a different code path from copying into a collection that already has something in it, so the existing test case "test_copy_root_contents_across_collections" didn't catch it.
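To picture the coverage gap, here is a toy model in Python. It is not the Ruby SDK's code and does not reproduce the bug itself: collections are modeled as nested dicts, and `copy_root_contents` is a hypothetical stand-in for copying "." of one collection into "." of another. The point is only that a test needs to exercise both the empty and the non-empty destination.

```python
import copy
from typing import Dict, Union

# Assumption: directories are dicts, files are strings.
Tree = Dict[str, Union[str, "Tree"]]


def copy_root_contents(src: Tree, dst: Tree) -> None:
    """Merge everything under "." of src into "." of dst, in place."""
    for name, node in src.items():
        if isinstance(node, dict) and isinstance(dst.get(name), dict):
            # Both sides have this directory: merge recursively.
            copy_root_contents(node, dst[name])
        else:
            # New entry, or a file replacing whatever was there.
            dst[name] = copy.deepcopy(node)


src = {"stderr.txt": "err", "logs": {"crunch-run.txt": "run"}}

# Case covered by a test against a populated destination.
populated = {"existing.txt": "keep"}
copy_root_contents(src, populated)

# Case that also needs its own test: the destination starts out empty.
empty: Tree = {}
copy_root_contents(src, empty)
```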
Updated by Lucas Di Pentima over 5 years ago
As previously said on chat, 14870-ruby-sdk-cp-r LGTM. Thanks!
Updated by Peter Amstutz over 5 years ago
14870-retry-logs @ 6a240180171525077bc9e64e903b0122d5d5f1b4
https://ci.curoverse.com/view/Developer/job/developer-run-tests/1097/
- Logs for each container copied into subdirectory "container log for [uuid]". The most recent logs are also copied into the root of the collection when the container is finalized to minimize breaking existing code.
- Update tests
- just the uuid (no extra text)
- "log for attempt [uuid]"
- "failed attempt [uuid]"
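A script reading the resulting collection could separate the latest logs from earlier attempts along these lines. This is a sketch under assumptions: `split_logs` is a hypothetical helper, it works on a plain list of file paths rather than any SDK object, and the subdirectory prefix is a parameter because the exact directory name may differ from the one quoted above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_logs(paths: Iterable[str],
               prefix: str = "container log for ",
               ) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split paths into (root-level latest logs, logs grouped by container uuid)."""
    latest: List[str] = []
    attempts: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        head, sep, rest = path.partition("/")
        if sep and head.startswith(prefix):
            # Path is inside a per-container subdirectory.
            attempts[head[len(prefix):]].append(rest)
        else:
            # Anything else is treated as the root copy of the latest logs.
            latest.append(path)
    return latest, dict(attempts)
```

With this, old scripts that only look at root-level files see the most recent attempt, while newer ones can iterate over `attempts` to inspect each container's logs.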
Updated by Peter Amstutz over 5 years ago
Updating the Arvados Ruby SDK dependency creates an incidental problem: #14482 tightened up manifest handling but did not update the API server dependency. That is now causing problems:
ERROR: runTest (tests.test_mount.FuseMountTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ci-jenkins/.jenkins-slave/workspace/developer-run-tests-services-fuse/services/fuse/tests/test_mount.py", line 91, in setUp
    self.api.collections().create(body={"manifest_text":cw.manifest_text()}).execute()
  File "/tmp/tmp.HDfRwOWIb9/VENVDIR/local/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/tmp/tmp.HDfRwOWIb9/VENVDIR/local/lib/python2.7/site-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://0.0.0.0:43523/arvados/v1/collections?alt=json returned "Manifest text Manifest invalid for stream 5: invalid file token "4:1:\u0001\\"">
Updated by Lucas Di Pentima over 5 years ago
The changes LGTM. However, there's the pending FUSE issue. Just in case, I did a complete test run: https://ci.curoverse.com/job/developer-run-tests/1101/
Updated by Peter Amstutz over 5 years ago
- Status changed from In Progress to Resolved