Idea #14870

closed

[API] Access logs from previous attempts after auto-retrying a container request

Added by Tom Clegg about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: API
Target version:
Start date: 03/01/2019
Due date:
Story points: 2.0
Release relationship: Auto

Description

Preserve all relevant logs in the container request's log collection, even if they span multiple containers.

Instead of just replacing the CR's entire log collection when the container's log is updated:
  • Copy the container's log files into a "container ${uuid}" subdir in the container request's log collection.
  • Leave any existing "container ${uuid}" subdirs alone.
  • Also put a copy of the latest container's logs in the root dir of the container request's log collection. This way, existing scripts continue to work on new logs.
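
The three rules above can be sketched as follows. This is only an illustration, not the API server's implementation: it models a log collection as a plain dict of path to file content, and `merge_container_logs` is a hypothetical helper name. It also assumes that root-level files left over from an earlier container are replaced rather than kept, which the description does not spell out.

```python
from typing import Dict


def merge_container_logs(
    cr_log: Dict[str, bytes],
    container_uuid: str,
    container_log: Dict[str, bytes],
) -> Dict[str, bytes]:
    """Return the new contents of a container request's log collection.

    Hypothetical model: a collection is a dict mapping file path to content.
    """
    subdir = f"container {container_uuid}/"
    merged: Dict[str, bytes] = {}

    # Leave existing "container ${uuid}" subdirs for other containers alone.
    for path, data in cr_log.items():
        in_container_subdir = path.startswith("container ") and "/" in path
        if in_container_subdir and not path.startswith(subdir):
            merged[path] = data

    for path, data in container_log.items():
        # Copy this container's log files into its own subdir...
        merged[subdir + path] = data
        # ...and also into the root dir, so existing scripts keep working.
        # Assumption: older root-level files are dropped, not preserved.
        merged[path] = data

    return merged
```

Applying it once per attempt accumulates one subdir per container, while the root always mirrors the latest attempt.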

(Aside: This also helps in the case where the container record itself is really what's wanted, since that is included in the container's log collection. There are currently some exceptions -- e.g., a log collection isn't created at all when a container doesn't fit any instance type -- but those could be fixed.)


Subtasks: 2 (0 open, 2 closed)

  • Task #14894: Review 14870-retry-logs (Resolved, Peter Amstutz, 03/04/2019)
  • Task #14908: Review 14870-ruby-sdk-cp-r (Resolved, Lucas Di Pentima, 03/01/2019)

Related issues

Related to Arvados - Feature #14706: [Crunch2] Retain references + permissions to earlier containers when retrying a container request (Resolved)
Actions #1

Updated by Tom Clegg about 5 years ago

  • Related to Feature #14706: [Crunch2] Retain references + permissions to earlier containers when retrying a container request added
Actions #2

Updated by Tom Morris about 5 years ago

  • Target version changed from To Be Groomed to 2019-02-27 Sprint
  • Story points set to 2.0
Actions #3

Updated by Tom Morris about 5 years ago

  • Target version changed from 2019-02-27 Sprint to Arvados Future Sprints
Actions #4

Updated by Tom Morris about 5 years ago

  • Target version changed from Arvados Future Sprints to 2019-03-13 Sprint
Actions #5

Updated by Peter Amstutz about 5 years ago

  • Assigned To set to Peter Amstutz
Actions #6

Updated by Peter Amstutz about 5 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Tom Morris about 5 years ago

  • Release set to 15
Actions #8

Updated by Peter Amstutz about 5 years ago

14870-ruby-sdk-cp-r @ 338ab239adbc259d5cd070158b4e571925b9f81b

The gist is that the Ruby SDK seems to have a long-standing bug where you can't copy into "." of an empty collection. That case follows a different code path from copying into a collection that already has something in it, so the existing test case "test_copy_root_contents_across_collections" didn't catch it.

Actions #9

Updated by Lucas Di Pentima about 5 years ago

As previously said on chat, 14870-ruby-sdk-cp-r LGTM. Thanks!

Actions #10

Updated by Peter Amstutz about 5 years ago

14870-retry-logs @ 6a240180171525077bc9e64e903b0122d5d5f1b4

https://ci.curoverse.com/view/Developer/job/developer-run-tests/1097/

  • Logs for each container copied into subdirectory "container log for [uuid]". The most recent logs are also copied into the root of the collection when the container is finalized to minimize breaking existing code.
  • Update tests
I'm open to changing the exact name of the subdirectory. Some other possibilities are:
  • just the uuid (no extra text)
  • "log for attempt [uuid]"
  • "failed attempt [uuid]"
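
For scripts that read the resulting collection, a minimal sketch of separating earlier attempts from the latest logs might look like this. It assumes the "container log for [uuid]" subdirectory name proposed above (which is still open to change) and works on a plain list of file paths; `group_logs_by_attempt` is a hypothetical helper, not part of any Arvados SDK.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

# Assumed subdirectory prefix, taken from the naming proposed in this comment.
PREFIX = "container log for "


def group_logs_by_attempt(paths: Iterable[str]) -> Dict[Optional[str], List[str]]:
    """Group log file paths by container uuid.

    Files in the root of the collection (the latest logs) are grouped
    under the key None.
    """
    attempts: Dict[Optional[str], List[str]] = defaultdict(list)
    for path in paths:
        head, sep, rest = path.partition("/")
        if sep and head.startswith(PREFIX):
            # Path is inside a per-container subdirectory.
            attempts[head[len(PREFIX):]].append(rest)
        else:
            # Anything else is treated as a root-level (latest) log file.
            attempts[None].append(path)
    return dict(attempts)
```

Existing code that only looks at root-level files would keep seeing the most recent logs, which matches the compatibility goal stated above.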
Actions #11

Updated by Peter Amstutz about 5 years ago

Updating the Arvados Ruby SDK dependency creates an incidental problem: #14482 tightens up manifest handling but did not update the API server dependency. That's now causing problems:

ERROR: runTest (tests.test_mount.FuseMountTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ci-jenkins/.jenkins-slave/workspace/developer-run-tests-services-fuse/services/fuse/tests/test_mount.py", line 91, in setUp
    self.api.collections().create(body={"manifest_text":cw.manifest_text()}).execute()
  File "/tmp/tmp.HDfRwOWIb9/VENVDIR/local/lib/python2.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/tmp/tmp.HDfRwOWIb9/VENVDIR/local/lib/python2.7/site-packages/googleapiclient/http.py", line 840, in execute
    raise HttpError(resp, content, uri=self.uri)
ApiError: <HttpError 422 when requesting https://0.0.0.0:43523/arvados/v1/collections?alt=json returned "Manifest text Manifest invalid for stream 5: invalid file token "4:1:\u0001\\"">
Actions #14

Updated by Lucas Di Pentima about 5 years ago

The changes LGTM. However, there's the pending FUSE issue. Just in case, I did a complete test run: https://ci.curoverse.com/job/developer-run-tests/1101/

Actions #17

Updated by Peter Amstutz about 5 years ago

  • Status changed from In Progress to Resolved