Bug #7232

[Keep] keepstore should notify sysadmin about error conditions (trying harder than just log.Print())

Added by Tom Clegg almost 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: Keep
Target version:
Start date: 09/07/2015
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Certain error conditions in Keepstore deserve sysadmin attention: either they aren't expected to happen and should be investigated, or they indicate a system failure that requires human intervention.
  • Block collision (client tried to write data A with hash X, but data B with hash X is already stored here)
  • A storage volume changed state from non-full to full
  • The only remaining non-full storage volume changed state to full (this is noticeably worse than the above)
  • Block corruption that could be a partial-write error (e.g., GET but block on disk is shorter than expected, or PUT but block on disk is a prefix of the client-provided data)
  • Block corruption that is definitely not a partial-write error (e.g., GET but block on disk is not shorter than expected, or PUT but block on disk is not a prefix of the client-provided data). This suggests that an underlying storage device is failing.
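
To make the distinction between the collision and corruption cases above concrete, here is a minimal Go sketch of how they might be told apart. It is not taken from the Keepstore source: the `Condition` type and the `ClassifyPut`, `ClassifyGet`, and `md5hex` helpers are hypothetical names invented for illustration. It assumes blocks are addressed by MD5 hash, that the client-provided data has already been verified against the requested hash, and that the expected block size is known on GET.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
)

// Condition is a hypothetical classification of what was found on disk.
type Condition int

const (
	OK Condition = iota
	Collision
	PossiblePartialWrite
	DefiniteCorruption
)

func (c Condition) String() string {
	return [...]string{"ok", "collision", "possible partial write", "definite corruption"}[c]
}

// md5hex returns the lowercase hex MD5 digest of data.
func md5hex(data []byte) string {
	return fmt.Sprintf("%x", md5.Sum(data))
}

// ClassifyPut compares the block already stored under hash with the data
// the client is writing. Assumption: fromClient has already been checked
// to hash to hash.
func ClassifyPut(hash string, onDisk, fromClient []byte) Condition {
	switch {
	case bytes.Equal(onDisk, fromClient):
		return OK
	case md5hex(onDisk) == hash:
		// Different data, same hash: a genuine collision.
		return Collision
	case len(onDisk) < len(fromClient) && bytes.HasPrefix(fromClient, onDisk):
		// Disk holds a strict prefix of the client data.
		return PossiblePartialWrite
	default:
		return DefiniteCorruption
	}
}

// ClassifyGet checks a stored block against its hash. Assumption: the
// expected size is available to the caller.
func ClassifyGet(hash string, expectedSize int, onDisk []byte) Condition {
	switch {
	case md5hex(onDisk) == hash:
		return OK
	case len(onDisk) < expectedSize:
		return PossiblePartialWrite
	default:
		return DefiniteCorruption
	}
}

func main() {
	data := []byte("hello world")
	hash := md5hex(data)
	fmt.Println(ClassifyPut(hash, data[:5], data))        // possible partial write
	fmt.Println(ClassifyPut(hash, []byte("jello"), data)) // definite corruption
	fmt.Println(ClassifyGet(hash, len(data), data[:5]))   // possible partial write
}
```

A notification mechanism could then escalate by condition, for example treating `DefiniteCorruption` as more urgent than `PossiblePartialWrite`. How the sysadmin is actually notified is left open here, as it is in the description above.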

History

#1 Updated by Brett Smith almost 4 years ago

  • Target version set to Arvados Future Sprints
