Bug #7123

closed

[Crunch] Should not save any log record when log writing fails

Added by Brett Smith over 9 years ago. Updated about 9 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: 0.5

Description

We've seen a few instances recently where crunch-job can't save the job's log because writing to Keep fails. At least in some cases, crunch-job ends up updating the job record with log=[the empty collection]. For an example, see su92l-8i9sb-bdaoy3zppoxju1i.

When this happens, it would be better if crunch-job did not update the log field, or explicitly set it to null. Other tools, like the API server's log cleaner rake task and Workbench, recognize a null log as a sign that there's no log in Keep, whereas they don't recognize the empty collection's content address as meaning the same thing.
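The requested behavior can be sketched as follows. This is a minimal Python model, not crunch-job's actual Perl code: `log_field_update` and `has_keep_log` are hypothetical helpers, and the empty collection's address is derived on the assumption that a collection is addressed by the MD5 of its manifest text plus "+" and its size.

```python
import hashlib

# Assumption: the content address is md5(manifest_text) + "+" + size, so the
# empty collection's address is computed from an empty manifest.
EMPTY_COLLECTION = hashlib.md5(b"").hexdigest() + "+0"


def log_field_update(arv_put_output):
    """Return the attributes to send when finalizing a job record.

    arv_put_output is the locator produced by the log upload, or None when
    writing to Keep failed. (Hypothetical helper for illustration.)
    """
    if arv_put_output is None:
        # The upload failed: leave "log" out of the update so it stays null.
        return {}
    return {"log": arv_put_output}


def has_keep_log(job):
    """Model of a consumer that treats a null log as 'no log in Keep'."""
    return job.get("log") is not None


# A failed upload leaves the record's log field untouched (null).
job = {"uuid": "example-job", "log": None}
job.update(log_field_update(None))
```

With this shape, a consumer only needs the null check. A record that wrongly carries `EMPTY_COLLECTION` would still pass `has_keep_log`, which is the situation this bug describes.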


Subtasks 1 (0 open, 1 closed)

Task #7741: Review branch 7123-crunch-no-record-log-failure-wip (Resolved, Brett Smith, 11/09/2015)

Related issues 1 (0 open, 1 closed)

Related to Arvados - Bug #4748: [API] Explain why there are 21 jobs with log = "" (Closed)
Actions #1

Updated by Brett Smith about 9 years ago

  • Target version changed from Arvados Future Sprints to 2015-11-11 sprint
  • Status changed from New to In Progress
  • Assigned To set to Brett Smith
Actions #2

Updated by Brett Smith about 9 years ago

  • Target version changed from 2015-11-11 sprint to 2015-12-02 sprint
Actions #3

Updated by Peter Amstutz about 9 years ago

Let me see if I understand:

  • $logger_failed = -1; means that nothing was read at all, and we timed out.
  • $logger_failed = -2; means that log_writer_read_output read some characters, and then timed out.
  • If $? is non-zero, we set $logger_failed = $?. It is okay if $logger_failed is already set, because ultimately we only care whether $logger_failed is zero or not.
  • We only set $arv_put_output if there were no errors; otherwise we leave it null, which means Workbench and other tools will read the logs from the API server's log table.
  • Don't update the "log" field unless log_writer_finish returned a valid string.

Looks good to me.
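The points in this review note can be sketched in Python. This is only an illustrative model of the described behavior, not the Perl code in the branch: the function signatures and the `finalize_job` helper are assumptions made for the example.

```python
def log_writer_finish(bytes_read, timed_out, exit_status, output):
    """Model the outcome handling summarized in the review note.

    Returns the uploaded log's locator, or None if anything went wrong.
    (Hypothetical signature; the real function is Perl and reads from a
    child process.)
    """
    logger_failed = 0
    if timed_out:
        # -1: nothing was read before the timeout.
        # -2: some characters were read, then the read timed out.
        logger_failed = -1 if bytes_read == 0 else -2
    if exit_status != 0:
        # Overwrites any earlier value; only zero vs. non-zero matters here.
        logger_failed = exit_status
    if logger_failed:
        return None  # leave the result null so no locator is recorded
    return output


def finalize_job(job, locator):
    """Update the "log" field only when a valid locator came back."""
    if locator:
        job["log"] = locator
    return job
```

For example, `finalize_job({"log": None}, log_writer_finish(0, True, 0, ""))` leaves the log field null, so tools fall back to the log table.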

Actions #4

Updated by Brett Smith about 9 years ago

You understand it all correctly.

Peter Amstutz wrote:

  • If $? is non-zero, we set $logger_failed = $?. It is okay if $logger_failed is already set, because ultimately we only care whether $logger_failed is zero or not.

This part does seem a little overly confusing, doesn't it? Made it $logger_failed ||= $? to preserve the original error, which is probably more helpful. Thanks.
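The effect of switching from plain assignment to ||= can be shown with a small Python analogue. Python's `or` keeps the left operand when it is truthy, much like Perl's `||=`; `record_failure` is a hypothetical name used only for this sketch.

```python
def record_failure(logger_failed, exit_status):
    """Python analogue of Perl's `$logger_failed ||= $?`.

    Keeps an earlier non-zero error code and only falls back to the exit
    status when no failure has been recorded yet.
    """
    return logger_failed or exit_status


# An earlier timeout code (-2) is preserved even if the child also exits non-zero.
preserved = record_failure(-2, 256)
# With no earlier failure, the exit status becomes the recorded error.
adopted = record_failure(0, 256)
# No failure anywhere leaves the flag at zero.
clean = record_failure(0, 0)
```

Either way the zero/non-zero test downstream behaves the same. The difference is only which error code is left in the variable afterwards.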

Actions #5

Updated by Brett Smith about 9 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:6de60c7db7a98405ef7ae4ac5eb20498f095416c.
