Troubleshooting aids
Troubleshoot usage problems:
- Improve error messages (e.g., clients should not crash and dump a stack trace when a server is slow or unresponsive)
- Idea #21581: Crunch saves compute node journals to collections readable only by administrators
- Idea #21424: Way to run a diagnostic container that captures all system logs, not just Crunch's
- Save snapshot of internals (goroutines / memory profile) of specified system service(s) to a collection, and provide instructions for viewing
- Save the last N minutes of logs from all Arvados services running on this host
- Turn on debug mode temporarily, without restarting services
- Option to send railsapi logs to journal or stderr
- Option to send Go service logs to journal directly (structured, including loglevel)
- Scan metrics for recent "near/at capacity" signals
- Probe for proper nginx/proxy config (e.g., max request body size)
Updated by Tom Clegg 8 months ago · 5 revisions