Bug #21611
preemption notices do not appear in crunch-run.txt
Description
I've now looked at a number of containers that were preempted, and in none of them was crunch-run.txt updated to say that a preemption notice was received, even though it is supposed to be.
Updated by Peter Amstutz about 1 month ago
- Target version changed from Future to Development 2024-04-24 sprint
Updated by Peter Amstutz about 1 month ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-05-08 sprint
Updated by Peter Amstutz about 1 month ago
- Target version changed from Development 2024-05-08 sprint to Development 2024-04-24 sprint
Updated by Peter Amstutz about 1 month ago
- Target version changed from Development 2024-04-24 sprint to Development 2024-05-08 sprint
Updated by Peter Amstutz 18 days ago
- Description updated (diff)
- Subject changed from crunch-run updates copy of container.json in log collection when a container ends and/or runtime_status is updated to preemption notices do not appear in crunch-run.txt
- Tracker changed from Feature to Bug
Updated by Tom Clegg 18 days ago
- Status changed from New to In Progress
- "write log entry" just writes the message to the throttled-logging buffer, not yet to the log collection
- "save log collection" saves the log collection and resets the auto-flush timer, minimizing the chance auto-flush will happen before the preempted instance shuts down
It's possible something else is going on too, but either way, we should rearrange the logging pipeline so the log collection gets updated immediately instead of after the "group logs into chunks" step. If nothing else, that will reduce latency for showing logs in workbench.
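As a rough illustration of the two steps described above, here is a minimal Go sketch, not the actual crunch-run code. The names `logCollection`, `bufferedLogger`, `Write`, `Flush` and `WriteUrgent` are all hypothetical. It only shows why a message that is written to an in-memory buffer is not visible in the saved collection until a flush happens, and how an urgent message (such as a preemption notice) could force that flush right away.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// logCollection is a hypothetical stand-in for the saved log collection.
type logCollection struct {
	mu    sync.Mutex
	saved []string
}

// save appends lines to the saved state.
func (c *logCollection) save(lines []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.saved = append(c.saved, lines...)
}

// contains reports whether any saved line includes substr.
func (c *logCollection) contains(substr string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, line := range c.saved {
		if strings.Contains(line, substr) {
			return true
		}
	}
	return false
}

// bufferedLogger (hypothetical) holds lines in memory until Flush is called.
type bufferedLogger struct {
	mu      sync.Mutex
	pending []string
	dest    *logCollection
}

// Write only buffers the line; nothing reaches the collection yet.
func (l *bufferedLogger) Write(line string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.pending = append(l.pending, line)
}

// Flush moves all pending lines into the saved collection.
func (l *bufferedLogger) Flush() {
	l.mu.Lock()
	lines := l.pending
	l.pending = nil
	l.mu.Unlock()
	if len(lines) > 0 {
		l.dest.save(lines)
	}
}

// WriteUrgent buffers the line and saves immediately, so the message
// does not depend on a later timer-driven flush.
func (l *bufferedLogger) WriteUrgent(line string) {
	l.Write(line)
	l.Flush()
}

func main() {
	dest := &logCollection{}
	lg := &bufferedLogger{dest: dest}
	lg.Write("container started")
	fmt.Println("saved before flush:", dest.contains("container started"))
	lg.WriteUrgent("preemption notice received")
	fmt.Println("saved after urgent write:", dest.contains("preemption notice"))
}
```

In this sketch the urgent path flushes everything that was already pending, so earlier lines are saved along with the notice.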
Updated by Tom Clegg 9 days ago
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4207 (flaky fuse test, see #21660)
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4208 (Something is already running on port 38402.)
21611-log-chunk-delay @ 8a5db7b48c1fb11423110490267fea17161f7674 -- developer-run-tests: #4209
Removes all the "buffer logs into chunks and send them to POST /arvados/v1/logs" code that was preventing the existing "flush logs immediately" code from working as intended (see #note-8 above).
- All agreed upon points are implemented / addressed.
- ✅
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- N/A
- Code is tested and passing, both automated and manual; what manual testing was done is described.
- ✅ updated preemption-warning test case to check that the container record is promptly updated with a log PDH that mentions the preemption warning message
- Documentation has been updated.
- N/A
- Behaves appropriately at the intended scale (describe intended scale).
- N/A
- Considered backwards and forwards compatibility issues between client and server.
- N/A
- Follows our coding standards and GUI style guidelines.
- ✅
This will also have the side effect of reducing logging latency in workbench. Previously LogBytesPerEvent/LogSecondsBetweenEvents (default 4K/5s) were introducing a store/wait/forward delay even when LimitLogBytesPerJob was zero.
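To make the store/wait/forward delay concrete, here is a small Go sketch of a generic size-or-time chunker. It is an assumption-laden illustration, not the removed Arvados code: the `chunker` type, its fields, and the use of plain seconds for time are hypothetical, and 4096 bytes is assumed as the meaning of "4K". It shows that a short message is held back until either the byte threshold or the wait interval is reached.

```go
package main

import "fmt"

// chunker (hypothetical) holds data until either maxBytes have
// accumulated or maxWaitSec seconds have passed since the first byte.
type chunker struct {
	maxBytes   int
	maxWaitSec float64
	buf        []byte
	firstSeen  float64 // time (in seconds) the oldest buffered byte arrived
}

// add appends data at time nowSec and returns a chunk if one is ready.
func (c *chunker) add(data string, nowSec float64) []byte {
	if len(c.buf) == 0 {
		c.firstSeen = nowSec
	}
	c.buf = append(c.buf, data...)
	return c.poll(nowSec)
}

// poll returns the buffered chunk if a threshold has been reached,
// otherwise nil (the data keeps waiting).
func (c *chunker) poll(nowSec float64) []byte {
	if len(c.buf) == 0 {
		return nil
	}
	if len(c.buf) >= c.maxBytes || nowSec-c.firstSeen >= c.maxWaitSec {
		out := c.buf
		c.buf = nil
		return out
	}
	return nil
}

func main() {
	// Assumed values mirroring the 4K/5s defaults mentioned above.
	c := &chunker{maxBytes: 4096, maxWaitSec: 5}
	fmt.Println("forwarded at t=0s:", c.add("preemption notice\n", 0) != nil)
	fmt.Println("forwarded at t=4s:", c.poll(4) != nil)
	fmt.Println("forwarded at t=5s:", c.poll(5) != nil)
}
```

With thresholds like these, a single short line can wait for the whole interval before being forwarded, which matters when the instance may shut down sooner than that.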
Updated by Peter Amstutz 4 days ago
- Target version changed from Development 2024-05-08 sprint to Development 2024-05-22 sprint