Troubleshooting aids » History » Version 2
Tom Clegg, 04/03/2024 08:08 PM
h1. Troubleshooting aids

Troubleshoot usage problems:
* Improve error messages (e.g., clients should not crash and dump stack when a server is slow/unresponsive)

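As an illustration of the error-message item above, here is a minimal Go sketch of a client that reports a slow or unreachable server as a single readable line instead of crashing. The names @describeError@ and @checkServer@ are hypothetical and not part of any Arvados SDK, and the URL and timeout are placeholders.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"os"
	"time"
)

// describeError is a hypothetical helper (not an Arvados SDK function) that
// maps a low-level request error to a single readable line.
func describeError(url string, err error) string {
	var netErr net.Error
	if errors.Is(err, context.DeadlineExceeded) || (errors.As(err, &netErr) && netErr.Timeout()) {
		return fmt.Sprintf("%s did not respond in time; the server may be slow or overloaded", url)
	}
	return fmt.Sprintf("could not reach %s: %v", url, err)
}

// checkServer sends one GET request with a deadline. Failures come back as
// ordinary errors so the caller can report them and exit cleanly.
func checkServer(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return errors.New(describeError(url, err))
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 500 {
		return fmt.Errorf("%s answered with %s", url, resp.Status)
	}
	return nil
}

func main() {
	// Port 1 on localhost is normally closed, so this request fails quickly.
	if err := checkServer("http://127.0.0.1:1/", 2*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

The point of the sketch is the shape rather than the wording: every network failure is converted to an error value with a short explanation, and only the top-level caller decides how to print it.
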
Troubleshoot compute nodes/images:
* {{issue(21581)}}
* {{issue(21424)}}

Troubleshoot Arvados system services:
* Save snapshot of internals (goroutines / memory profile) of specified system service(s) to a collection, and provide instructions for viewing
* Save last N minutes of logs from all Arvados services running on this host
* Turn on debug mode temporarily, without restarting services

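The snapshot item above could be approached along these lines for a service written in Go. This is a sketch only: it captures the goroutine and heap profiles of the current process with the standard @runtime/pprof@ package and writes them to a temporary directory. The helper name @captureProfile@ and the file names are assumptions, and copying the files into a collection is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"runtime"
	"runtime/pprof"
)

// captureProfile returns one of the Go runtime's built-in profiles
// ("goroutine", "heap", ...) for the current process in pprof format.
func captureProfile(name string) ([]byte, error) {
	p := pprof.Lookup(name)
	if p == nil {
		return nil, fmt.Errorf("unknown profile %q", name)
	}
	if name == "heap" {
		runtime.GC() // refresh heap statistics before sampling
	}
	var buf bytes.Buffer
	// debug=0 selects the compressed binary format that "go tool pprof" reads.
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	dir, err := os.MkdirTemp("", "service-snapshot-")
	if err != nil {
		log.Fatal(err)
	}
	for _, name := range []string{"goroutine", "heap"} {
		data, err := captureProfile(name)
		if err != nil {
			log.Fatal(err)
		}
		path := filepath.Join(dir, name+".pprof")
		if err := os.WriteFile(path, data, 0o644); err != nil {
			log.Fatal(err)
		}
		fmt.Println("wrote", path)
	}
}
```

The resulting files can be inspected with @go tool pprof <file>@, which is the kind of viewing instruction the item asks for.
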
Expose config/scaling issues:
* Scan metrics for recent "near/at capacity" signals
* Probe for proper nginx/proxy config (e.g., max request body size)
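
The proxy probe above might look like the following Go sketch. It sends a request body of a chosen size and checks whether the response is 413 Request Entity Too Large, the status nginx returns when a body exceeds @client_max_body_size@. The names @bodyAccepted@ and @newLimitedServer@ are hypothetical; the latter is only a local stand-in for a proxy so the example can run without a cluster.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// bodyAccepted POSTs size zero bytes to url and reports whether the request
// got past the proxy, i.e. was not rejected with 413 Request Entity Too Large.
func bodyAccepted(url string, size int) (bool, error) {
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Post(url, "application/octet-stream", bytes.NewReader(make([]byte, size)))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode != http.StatusRequestEntityTooLarge, nil
}

// newLimitedServer is a local stand-in for a proxy that enforces a body size
// limit (assumption: a real probe would target the cluster's own endpoint).
func newLimitedServer(limit int64) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ContentLength > limit {
			http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
}

func main() {
	srv := newLimitedServer(1024)
	defer srv.Close()
	for _, size := range []int{512, 4096} {
		ok, err := bodyAccepted(srv.URL, size)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%d bytes accepted: %v\n", size, ok)
	}
}
```

Against a real deployment, the probe would need an endpoint that safely discards the body, and the sizes would be chosen around the largest request the services are expected to accept.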