Bug #15521

closed

[keepstore] error reporting improvements

Added by Peter Amstutz over 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 10/31/2019
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 2.0
Release relationship: Auto

Description

From ops bug #15520

Keepstore logging improvements:

  1. Keepstore PutBlock() calls log.Printf. This line of code has been untouched since 2014 (!). The message is logged in JSON format but lacks useful context such as the request ID.
  2. The error that is sent to the client is not logged at all.
  3. The log says nothing about where the block is being fetched from: which volume, bucket, or remote cluster.
  4. The error that reaches the user needs to make it clear that the problem was in fetching a remote block. This requires some combination of improving server and client error messages.

Subtasks 1 (0 open, 1 closed)

Task #15750: Review 15521-keepstore-logging | Resolved | Tom Clegg | 10/31/2019


Related issues

Related to Arvados - Bug #15606: [keep-web] logging doesn't include error messages | Resolved | Tom Clegg | 10/29/2019
Related to Arvados - Bug #15713: [Controller] Internal error not logged | Resolved | Tom Clegg | 10/24/2019

#1

Updated by Peter Amstutz over 3 years ago

  • Description updated (diff)
#2

Updated by Tom Morris over 3 years ago

  • Target version changed from 2019-08-14 Sprint to Arvados Future Sprints
#3

Updated by Peter Amstutz over 3 years ago

  • Subject changed from federation error reporting improvements to [keepstore] error reporting improvements
#4

Updated by Tom Morris over 3 years ago

  • Story points set to 2.0
#5

Updated by Peter Amstutz over 3 years ago

  • Related to Bug #15606: [keep-web] logging doesn't include error messages added
#6

Updated by Tom Clegg over 3 years ago

  • Assigned To set to Tom Clegg
  • Target version changed from Arvados Future Sprints to 2019-11-06 Sprint
#7

Updated by Tom Clegg about 3 years ago

  • Status changed from New to In Progress
#8

Updated by Tom Clegg about 3 years ago

  • Related to Bug #15713: [Controller] Internal error not logged added
#9

Updated by Tom Clegg about 3 years ago

  1. Keepstore PutBlock() calls log.Printf. This line of code has been untouched since 2014 (!). The message is logged in JSON format but lacks useful context such as the request ID.

Updated "MD5 checksum %s did not match request" logs (and many other logs in keepstore) to use ctxlog so they include request id, loglevel, etc.
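For illustration only, here is a minimal sketch of the general pattern: a logger that already carries the request ID travels in the request context, so every log call made with that context includes it. This is not the ctxlog code; it uses the standard library's log/slog in its place, and withLogger, fromContext, logChecksumMismatch, the RequestID field name and the sample values are all hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
)

// loggerKey is an unexported context key type, so other packages cannot collide with it.
type loggerKey struct{}

// withLogger returns a copy of ctx that carries the given logger (hypothetical helper).
func withLogger(ctx context.Context, l *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, l)
}

// fromContext returns the logger stored in ctx, or the default logger if none was stored.
func fromContext(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// dropTime discards the timestamp attribute so the demo output is reproducible.
func dropTime(groups []string, a slog.Attr) slog.Attr {
	if a.Key == slog.TimeKey && len(groups) == 0 {
		return slog.Attr{}
	}
	return a
}

// logChecksumMismatch logs through the context's logger instead of a global one,
// so the request-scoped fields are attached automatically.
func logChecksumMismatch(ctx context.Context, volume string) {
	fromContext(ctx).Error("MD5 checksum did not match request", "volume", volume)
}

// demo builds a request-scoped logger, logs one message, and returns the JSON line.
func demo() string {
	var buf bytes.Buffer
	base := slog.New(slog.NewJSONHandler(&buf, &slog.HandlerOptions{ReplaceAttr: dropTime}))
	// The request ID is attached once, when the request context is set up.
	ctx := withLogger(context.Background(), base.With("RequestID", "req-hypothetical-123"))
	logChecksumMismatch(ctx, "hypothetical-volume")
	return buf.String()
}

func main() {
	fmt.Print(demo())
}
```

The point of the pattern is that the code emitting the message never has to know the request ID; it only needs the context it was called with.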

Fixed an unreported error: PutBlock tries volmgr.NextWritable(), and if that fails, it tries all writable volumes in sequence. The error from that first failure wasn't being logged.
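A rough sketch of that fallback shape, under stated assumptions: the volume interface, fakeVolume, putBlock and its logf parameter are hypothetical stand-ins, not keepstore's actual types or signatures. It only shows the idea that the first failure gets logged before the remaining writable volumes are tried.

```go
package main

import (
	"errors"
	"fmt"
)

// volume is a hypothetical stand-in for a writable storage volume.
type volume interface {
	Name() string
	Put(hash string, data []byte) error
}

// fakeVolume returns a fixed error (or nil) from Put, for demonstration.
type fakeVolume struct {
	name string
	err  error
}

func (v fakeVolume) Name() string                       { return v.name }
func (v fakeVolume) Put(hash string, data []byte) error { return v.err }

var errAllFailed = errors.New("no writable volume accepted the block")

// putBlock tries the preferred volume first, then every other writable volume
// in order. Each failure is passed to logf, including the first one.
func putBlock(preferred volume, writable []volume, hash string, data []byte, logf func(string, ...any)) (string, error) {
	if preferred != nil {
		err := preferred.Put(hash, data)
		if err == nil {
			return preferred.Name(), nil
		}
		// Report the first failure instead of silently moving on.
		logf("%s: Put(%s) failed: %v", preferred.Name(), hash, err)
	}
	for _, v := range writable {
		if preferred != nil && v.Name() == preferred.Name() {
			continue // already tried above
		}
		err := v.Put(hash, data)
		if err == nil {
			return v.Name(), nil
		}
		logf("%s: Put(%s) failed: %v", v.Name(), hash, err)
	}
	return "", errAllFailed
}

// demo runs putBlock with a failing preferred volume and a working second volume.
func demo() (string, []string, error) {
	var logs []string
	logf := func(format string, args ...any) {
		logs = append(logs, fmt.Sprintf(format, args...))
	}
	first := fakeVolume{name: "vol-a", err: errors.New("disk full")}
	second := fakeVolume{name: "vol-b"}
	stored, err := putBlock(first, []volume{first, second}, "hypothetical-hash", []byte("data"), logf)
	return stored, logs, err
}

func main() {
	stored, logs, err := demo()
	fmt.Println(stored, logs, err)
}
```

In this sketch the write still succeeds on the second volume, but the log keeps a record of why the first one was skipped.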

  2. The error that is sent to the client is not logged at all.

Already fixed in #15713.

  3. The log says nothing about where the block is being fetched from: which volume, bucket, or remote cluster.

If we get bad data from a remote service, we get two log entries:

msg="%s: MD5 checksum %s did not match request"

respStatusCode=502 respBody="checksum mismatch in remote response" (respBody added in #15713)

  4. The error that reaches the user needs to make it clear that the problem was in fetching a remote block. This requires some combination of improving server and client error messages.

The server is sending "checksum mismatch in remote response".
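As a hedged sketch of how such a response could arise: data fetched from a remote service is hashed, compared with the requested MD5, and a mismatch is mapped to a 502 with the message above. The names verifyRemoteBlock, statusFor and errRemoteChecksum are hypothetical and are not taken from the keepstore source; only the message text and the 502 status come from this ticket.

```go
package main

import (
	"crypto/md5"
	"errors"
	"fmt"
	"net/http"
)

// errRemoteChecksum carries the message the client sees when remote data is bad.
var errRemoteChecksum = errors.New("checksum mismatch in remote response")

// verifyRemoteBlock compares the MD5 of data received from a remote service
// with the hex digest the client asked for.
func verifyRemoteBlock(wantMD5 string, data []byte) error {
	got := fmt.Sprintf("%x", md5.Sum(data))
	if got != wantMD5 {
		return errRemoteChecksum
	}
	return nil
}

// statusFor maps the verification result to an HTTP status. A bad upstream
// response is reported as 502 Bad Gateway rather than a generic 500.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errRemoteChecksum):
		return http.StatusBadGateway
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	// The digest below is the MD5 of "foo"; the payload "bar" does not match it.
	err := verifyRemoteBlock("acbd18db4cc2f85cedef654fccc4a4d8", []byte("bar"))
	fmt.Println(statusFor(err), err)
}
```

Using a dedicated error value keeps the "remote" wording in one place, so the log entry and the client-facing body stay consistent.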

15521-keepstore-logging @ 62d28600cbfc31f8e72c61e4519ff198cb66a02a -- https://ci.curoverse.com/view/Developer/job/developer-run-tests/1616/

#11

Updated by Lucas Di Pentima about 3 years ago

This LGTM, thanks.

#12

Updated by Tom Clegg about 3 years ago

  • Status changed from In Progress to Resolved
#13

Updated by Peter Amstutz about 3 years ago

  • Release set to 22