Revision 1 (Tom Clegg, 04/03/2024 07:40 PM) → Revision 2/5 (Tom Clegg, 04/03/2024 08:08 PM)
h1. Troubleshooting aids
Troubleshoot usage problems:
* Improve error messages (e.g., clients should not crash and dump a stack trace when a server is slow or unresponsive)
Troubleshoot compute nodes/images:
* {{issue(21581)}}
* {{issue(21424)}}
Troubleshoot Arvados system services:
* Save a snapshot of internals (goroutine dump, memory profile) of the specified system service(s) to a collection, and provide instructions for viewing it
* Save the last N minutes of logs from all Arvados services running on this host
* Turn on debug mode temporarily, without restarting services
Expose config/scaling issues:
* Scan metrics for recent "near/at capacity" signals
* Probe for proper nginx/proxy config (e.g., max request body size)