Troubleshooting aids » History » Version 5

Tom Clegg, 04/03/2024 08:23 PM

h1. Troubleshooting aids

Troubleshoot usage problems:
* Improve error messages (e.g., clients should not crash and dump a stack trace when a server is slow or unresponsive)
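
As an illustration of the kind of error message meant here, the following is a minimal sketch using only the Python standard library. It is not taken from any Arvados client; the function names @describe_failure@ and @fetch@ and the message wording are assumptions made for this example. The idea is to catch low-level network exceptions and print one actionable line instead of a stack trace.

```python
import socket
import sys
import urllib.error
import urllib.request


def describe_failure(exc, url, timeout):
    """Turn a low-level network exception into a one-line, readable message.

    Hypothetical helper for illustration; the wording is an assumption.
    """
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"{url} answered with HTTP {exc.code} ({exc.reason})."
    # URLError wraps the underlying cause in its .reason attribute.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return (f"{url} did not respond within {timeout:g}s; "
                "the server may be slow or overloaded.")
    if isinstance(reason, ConnectionRefusedError):
        return f"{url} refused the connection; the service may be down."
    return f"Could not reach {url}: {reason}"


def fetch(url, timeout=5.0):
    """Fetch a URL, reporting failures as a short message instead of crashing."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError as exc:  # covers URLError, HTTPError, and socket timeouts
        print("error: " + describe_failure(exc, url, timeout), file=sys.stderr)
        return None
```

A caller would check for @None@ and exit with a non-zero status, so the user sees a single line explaining what went wrong rather than a traceback.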

Troubleshoot compute nodes/images:
* {{issue(21581)}}
* {{issue(21424)}}

Troubleshoot Arvados system services:
* Save a snapshot of internals (goroutines / memory profile) of specified system service(s) to a collection, and provide instructions for viewing it
* Save the last N minutes of logs from all Arvados services running on this host
* Turn on debug mode temporarily, without restarting services
* Option to send RailsAPI logs to the journal or stderr
* Option to send Go service logs to the journal directly (structured, including log level)
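
The "last N minutes of logs" item could be approached on systemd hosts roughly as follows. This is a sketch under stated assumptions, not an existing Arvados tool: the helper names @journal_command@ and @save_recent_logs@ are hypothetical, and the unit patterns are placeholders that would need to match the unit names actually installed on the host. It relies only on standard @journalctl@ options (@--since@, @--unit@, @--output@, @--no-pager@).

```python
import subprocess


def journal_command(minutes, unit_patterns):
    """Build a journalctl command line for the last `minutes` minutes of logs.

    `unit_patterns` are systemd unit names or glob patterns; the values used
    by callers here are assumptions, not a list of real Arvados units.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    cmd = [
        "journalctl",
        "--no-pager",
        "--output=json",            # one JSON object per entry, including PRIORITY
        f"--since=-{minutes}min",   # relative timestamp, e.g. -15min
    ]
    cmd += [f"--unit={pattern}" for pattern in unit_patterns]
    return cmd


def save_recent_logs(minutes, unit_patterns, dest):
    """Run journalctl and write its output to the file `dest`."""
    with open(dest, "wb") as out:
        subprocess.run(journal_command(minutes, unit_patterns),
                       stdout=out, check=True)
```

For example, @save_recent_logs(15, ["arvados-*"], "/tmp/recent-logs.json")@ would capture 15 minutes of entries from units matching that (assumed) pattern. Using JSON output keeps the log level of each entry available for later filtering.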

Expose config/scaling issues:
* Scan metrics for recent "near/at capacity" signals
* Probe for proper nginx/proxy config (e.g., max request body size)
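
One possible shape for the metrics scan is sketched below. It assumes the services expose metrics in the Prometheus text format and that each capacity signal can be expressed as a "used" gauge paired with a "limit" gauge. The metric names in the usage example are hypothetical placeholders, the 90% threshold is an arbitrary choice, and the sketch inspects a single snapshot rather than recent history.

```python
def parse_metrics(text):
    """Parse unlabeled samples from Prometheus text-format output.

    Labeled series (names containing '{') are skipped to keep the sketch short.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or HELP/TYPE comment
        parts = line.split()
        if len(parts) < 2 or "{" in parts[0]:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # ignore lines whose value is not a number
    return samples


def near_capacity(samples, pairs, threshold=0.9):
    """Return (used_name, limit_name, ratio) for pairs at or above threshold."""
    findings = []
    for used_name, limit_name in pairs:
        used = samples.get(used_name)
        limit = samples.get(limit_name)
        if used is None or limit is None or limit <= 0:
            continue  # metric missing or limit unset: nothing to compare
        ratio = used / limit
        if ratio >= threshold:
            findings.append((used_name, limit_name, ratio))
    return findings
```

Usage, with hypothetical metric names:

```python
snapshot = """
# TYPE hypothetical_requests_in_flight gauge
hypothetical_requests_in_flight 19
hypothetical_max_requests 20
"""
pairs = [("hypothetical_requests_in_flight", "hypothetical_max_requests")]
for used, limit, ratio in near_capacity(parse_metrics(snapshot), pairs):
    print(f"{used} is at {ratio:.0%} of {limit}")
```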