Bug #21611
closed
preemption notices do not appear in crunch-run.txt
Description
I've now looked at a number of containers that were preempted, and in none of them was crunch-run.txt updated to say that crunch-run received a preemption notice, even though it is supposed to be.
Updated by Peter Amstutz 9 months ago
- Target version changed from Future to Development 2024-04-24 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-05-08 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-05-08 sprint to Development 2024-04-24 sprint
Updated by Peter Amstutz 9 months ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-05-08 sprint
Updated by Peter Amstutz 8 months ago
- Description updated (diff)
- Subject changed from crunch-run updates copy of container.json in log collection when a container ends and/or runtime_status is updated to preemption notices do not appear in crunch-run.txt
- Tracker changed from Feature to Bug
Updated by Tom Clegg 8 months ago
- Status changed from New to In Progress
- "write log entry" just writes the message to the throttled-logging buffer, not yet to the log collection
- "save log collection" saves the log collection and resets the auto-flush timer, minimizing the chance auto-flush will happen before the preempted instance shuts down
It's possible something else is going on too, but either way, we should rearrange the logging pipeline so the log collection gets updated immediately instead of after the "group logs into chunks" step. If nothing else, that will reduce latency for showing logs in workbench.
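To illustrate the distinction drawn above between "write log entry" and "save log collection", here is a minimal Go sketch. It is not the crunch-run code: `logCollection`, `bufferedLogger`, and `WriteUrgent` are hypothetical names invented for this example, and the in-memory buffers only stand in for the real throttled-logging buffer and log collection.

```go
package main

import (
	"bytes"
	"sync"
	"time"
)

// logCollection is a hypothetical stand-in for the saved log collection,
// i.e. what a reader (such as workbench) would be able to see.
type logCollection struct {
	mu    sync.Mutex
	saved bytes.Buffer
}

// save appends a copy of p to the saved contents.
func (c *logCollection) save(p []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.saved.Write(p)
}

// contents returns everything saved so far.
func (c *logCollection) contents() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.saved.String()
}

// bufferedLogger models the two steps from the note above:
// Write only appends to a pending buffer, and Flush is what
// actually makes the text visible in the collection.
type bufferedLogger struct {
	mu        sync.Mutex
	pending   bytes.Buffer
	dest      *logCollection
	lastFlush time.Time // an auto-flush timer would be measured from here
}

// Write is the "write log entry" step: nothing reaches dest yet.
func (l *bufferedLogger) Write(p []byte) (int, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.pending.Write(p)
}

// Flush is the "save log collection" step: it commits pending text
// and resets the reference point for the next auto-flush.
func (l *bufferedLogger) Flush() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.dest.save(l.pending.Bytes())
	l.pending.Reset()
	l.lastFlush = time.Now()
}

// WriteUrgent writes a message and flushes right away, so that a
// message like a preemption notice does not wait for auto-flush.
func (l *bufferedLogger) WriteUrgent(msg string) {
	l.Write([]byte(msg))
	l.Flush()
}
```

In this sketch, a routine `Write` leaves the collection unchanged until the next `Flush`, while `WriteUrgent` makes the message visible immediately, together with anything that was already pending. If the instance shuts down between a plain `Write` and the next flush, the message is lost, which is the failure mode described in this issue.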
Updated by Tom Clegg 8 months ago
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4207 (flaky fuse test, see #21660)
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4208 (Something is already running on port 38402.)
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4209
Removes all the "buffer logs into chunks and send them to POST /arvados/v1/logs" code that was preventing the existing "flush logs immediately" code from working as intended (see #note-8 above).
- All agreed upon points are implemented / addressed.
- ✅
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- N/A
- Code is tested and passing, both automated and manual, what manual testing was done is described
- ✅ updated preemption-warning test case to check that the container record is promptly updated with a log PDH that mentions the preemption warning message
- Documentation has been updated.
- N/A
- Behaves appropriately at the intended scale (describe intended scale).
- N/A
- Considered backwards and forwards compatibility issues between client and server.
- N/A
- Follows our coding standards and GUI style guidelines.
- ✅
This will also have the side effect of reducing logging latency in workbench. Previously LogBytesPerEvent/LogSecondsBetweenEvents (default 4K/5s) were introducing a store/wait/forward delay even when LimitLogBytesPerJob was zero.
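The store/wait/forward delay mentioned above can be pictured with a small Go sketch. This is an assumption-laden model, not the removed code: `chunkPolicy`, `shouldForward`, and `worstCaseDelay` are hypothetical names, and "4K" is assumed to mean 4096 bytes.

```go
package main

import "time"

// chunkPolicy is a hypothetical model of the two thresholds named above.
type chunkPolicy struct {
	bytesPerEvent        int           // models LogBytesPerEvent
	secondsBetweenEvents time.Duration // models LogSecondsBetweenEvents
}

// defaultPolicy uses the defaults quoted above (4K/5s),
// assuming 4K means 4096 bytes.
func defaultPolicy() chunkPolicy {
	return chunkPolicy{bytesPerEvent: 4096, secondsBetweenEvents: 5 * time.Second}
}

// shouldForward reports whether a buffered chunk would be sent on:
// either enough bytes have accumulated, or enough time has passed
// since the previous chunk was forwarded.
func (p chunkPolicy) shouldForward(buffered int, sinceLast time.Duration) bool {
	return buffered >= p.bytesPerEvent || sinceLast >= p.secondsBetweenEvents
}

// worstCaseDelay is the longest a chunk of the given size can sit in
// the buffer if nothing else is logged after it.
func (p chunkPolicy) worstCaseDelay(buffered int) time.Duration {
	if buffered >= p.bytesPerEvent {
		return 0
	}
	return p.secondsBetweenEvents
}
```

Under this model, a short message such as a preemption warning is well below the byte threshold, so it can wait for the full interval before being forwarded, regardless of whether any byte limit per job is in effect.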
Updated by Peter Amstutz 8 months ago
- Target version changed from Development 2024-05-08 sprint to Development 2024-05-22 sprint
Updated by Brett Smith 7 months ago
Tom Clegg wrote in #note-9:
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4209
Removes all the "buffer logs into chunks and send them to POST /arvados/v1/logs" code that was preventing the existing "flush logs immediately" code from working as intended (see #note-8 above).
LGTM. My one nit is I think the configuration keys in the upgrade notes would look better in monospace. (I wish we had a documentation style guide to help us keep consistent on stuff like this.)
Updated by Tom Clegg 7 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|84650a094921bb149ffb31a95bee9875dfd1c1df.
Updated by Peter Amstutz 7 months ago
- Related to Bug #21833: Regression: stdout and stderr in log collection missing line-by-line timestamps added
Updated by Tom Clegg 7 months ago
- Related to Bug #21834: Restore timestamps in container stdout.txt and stderr.txt added