Feature #15319

[API] make rails stacktraces more useful, and correlate them with a requestID, ideally in a json struct. See if we can expose that in the rails logs that way, or via journal, or something else?

Added by Ward Vandewege 3 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
07/12/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Subtasks

Task #15390: Review 15319-api-useful-stacktracesResolvedPeter Amstutz

Associated revisions

Revision 99240b3e
Added by Lucas Di Pentima about 1 month ago

Merge branch '15319-api-useful-stacktraces'
Closes #15319

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>

History

#1 Updated by Lucas Di Pentima 2 months ago

From what I'm reading on lograge's@ github page (https://github.com/roidrage/lograge - FAQ section), we can include exceptions on the custom_options configuration payload. We just have to make sure to convert the stack trace to a single line.

#2 Updated by Lucas Di Pentima 2 months ago

Another mention on this topic: https://github.com/roidrage/lograge/issues/190

#3 Updated by Tom Morris 2 months ago

  • Story points set to 1.0
  • Target version changed from To Be Groomed to Arvados Future Sprints

#4 Updated by Tom Morris 2 months ago

  • Target version changed from Arvados Future Sprints to 2019-07-03 Sprint

#5 Updated by Lucas Di Pentima 2 months ago

  • Assigned To set to Lucas Di Pentima

#6 Updated by Lucas Di Pentima about 2 months ago

  • Target version changed from 2019-07-03 Sprint to 2019-07-17 Sprint

#7 Updated by Lucas Di Pentima about 2 months ago

  • Status changed from New to In Progress

#8 Updated by Lucas Di Pentima about 1 month ago

Updates at 25d42fc - branch 15319-api-useful-stacktraces
Test run: https://ci.curoverse.com/job/developer-run-tests/1394/ (testing the commit previous to merging a broken master)

At first and from the grooming investigation, this story seemed pretty trivial as the lograge documentation states that including the exeptions on the log is a supported feature, but in our particular case it doesn't work because we catch all exceptions with a rescue_from(Exception, ..., :with => :render_error) on application_controller.rb, and the exception handlers cannot propagate the exceptions being handled by re-raising them, so lograge doesn't receive them.

The solution found was assigning the exception information to Thread.current and consuming it at lograge's custom formatter.

The other tricky part was testing by using stubs and mocha expectations.

#9 Updated by Peter Amstutz about 1 month ago

Lucas Di Pentima wrote:

Updates at 25d42fc - branch 15319-api-useful-stacktraces
Test run: https://ci.curoverse.com/job/developer-run-tests/1394/ (testing the commit previous to merging a broken master)

At first and from the grooming investigation, this story seemed pretty trivial as the lograge documentation states that including the exeptions on the log is a supported feature, but in our particular case it doesn't work because we catch all exceptions with a rescue_from(Exception, ..., :with => :render_error) on application_controller.rb, and the exception handlers cannot propagate the exceptions being handled by re-raising them, so lograge doesn't receive them.

The solution found was assigning the exception information to Thread.current and consuming it at lograge's custom formatter.

The other tricky part was testing by using stubs and mocha expectations.

The test build had workbench errors so I submitted another build: https://ci.curoverse.com/view/Developer/job/developer-run-tests/1396/

The solution looks good. Could we add a documentation page in the admin section about 'how to diagnose errors' that mentions these commands.

To get a specific request:

grep req-frdyrcgdh4rau1ajiq5q production.log | jq .

To get the backtrace from a failed request:

grep req-frdyrcgdh4rau1ajiq5q production.log | jq -r .exception_backtrace

#10 Updated by Lucas Di Pentima about 1 month ago

Please take into account that the test run 1396 is using a commit that includes current master, that's why the api server tests failed.

#11 Updated by Lucas Di Pentima about 1 month ago

Updates at 783b9cc83

Adds documentation describing #15318 and #15319 features with an example.

#12 Updated by Peter Amstutz about 1 month ago

Lucas Di Pentima wrote:

Updates at 783b9cc83

Adds documentation describing #15318 and #15319 features with an example.

This is great!

My only other suggestion is to add links to the troubleshooting page from the install documentation, because it might be hard to find where it is now. Otherwise LGTM, please merge.

#13 Updated by Lucas Di Pentima about 1 month ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved

Also available in: Atom PDF