Bug #7123

[Crunch] Should not save any log record when log writing fails

Added by Brett Smith about 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Brett Smith
Category:
Crunch
Target version:
Start date:
11/09/2015
Due date:
% Done:
100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

We've seen a few instances recently where crunch-job can't save the job's log because writing to Keep fails. At least in some cases, crunch-job ends up updating the job record with log=[the empty collection]. For an example, see su92l-8i9sb-bdaoy3zppoxju1i.

When this happens, it would be better if crunch-job either left the log field alone or explicitly set it to null. Other tools, like the API server's log cleaner rake task and Workbench, recognize a null log as a sign that there's no log in Keep, whereas they don't recognize the empty collection's content address.
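
A minimal sketch of the requested behavior, written in Python for illustration only (crunch-job itself is Perl). The helper names `job_log_update` and `looks_like_log_in_keep` are hypothetical, and the empty collection's content address is assumed to be the MD5 of an empty manifest followed by `+0`.

```python
import hashlib

# Assumption: the empty collection's content address is the MD5 of an
# empty manifest followed by "+0" (its size in bytes).
EMPTY_COLLECTION = hashlib.md5(b"").hexdigest() + "+0"


def job_log_update(locator, write_failed):
    """Hypothetical helper: fields to send when updating the job record."""
    if write_failed or not locator:
        # Leave "log" untouched (null) rather than recording a bogus value.
        return {}
    return {"log": locator}


def looks_like_log_in_keep(job):
    """Hypothetical consumer check: only a null log means 'no log in Keep'."""
    return job.get("log") is not None
```

Under these assumptions, a job record holding the empty collection address still passes the consumer check, which is why skipping the update on failure is the safer choice.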


Subtasks

Task #7741: Review branch 7123-crunch-no-record-log-failure-wip (Resolved, Brett Smith)


Related issues

Related to Arvados - Bug #4748: [API] Explain why there are 21 jobs with log = "" (Closed)

Associated revisions

Revision 6de60c7d
Added by Brett Smith almost 4 years ago

Merge branch '7123-crunch-no-record-log-failure-wip'

Closes #7123, #7741.

History

#1 Updated by Brett Smith almost 4 years ago

  • Status changed from New to In Progress
  • Assigned To set to Brett Smith
  • Target version changed from Arvados Future Sprints to 2015-11-11 sprint

#2 Updated by Brett Smith almost 4 years ago

  • Target version changed from 2015-11-11 sprint to 2015-12-02 sprint

#3 Updated by Peter Amstutz almost 4 years ago

Let me see if I understand:

  • $logger_failed = -1; means that nothing was read at all, and we timed out.
  • $logger_failed = -2; means that log_writer_read_output read some characters, and then timed out.
  • if $? is non zero we set $logger_failed = $?, it is okay if $logger_failed is already set because ultimately we only care if $logger_failed is zero or not.
  • We only set $arv_put_output if there were no errors; otherwise we leave it null, which means Workbench and other tools will access the logs in the API server log table.
  • Don't update the "log" field unless log_writer_finish returned a valid string.

Looks good to me.
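
The flow summarized in the list above can be sketched roughly as follows. This is a Python paraphrase of the description, not the branch's Perl code: the function signature, the `read_result` strings, and the `maybe_update_job` helper are hypothetical stand-ins for whatever crunch-job actually uses.

```python
def log_writer_finish(read_result, exit_status, output):
    """Return the saved log's locator, or None if anything went wrong.

    read_result is a hypothetical flag: "ok", "timeout_empty" (nothing was
    read before the timeout), or "timeout_partial" (some characters were
    read, then the timeout hit).
    """
    logger_failed = 0
    if read_result == "timeout_empty":
        logger_failed = -1
    elif read_result == "timeout_partial":
        logger_failed = -2
    if exit_status != 0:
        # As described in the review: overwrite with the exit status,
        # since only zero versus nonzero matters in the end.
        logger_failed = exit_status
    if logger_failed == 0 and output:
        return output
    return None


def maybe_update_job(job, locator):
    """Only touch the "log" field when a valid locator came back."""
    if locator is not None:
        job["log"] = locator
    return job
```

With this shape, any timeout or nonzero exit leaves the job's log field null, so other tools fall back to the API server's log table.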

#4 Updated by Brett Smith almost 4 years ago

You understand it all correctly.

Peter Amstutz wrote:

  • if $? is non zero we set $logger_failed = $?, it is okay if $logger_failed is already set because ultimately we only care if $logger_failed is zero or not.

This part does seem a little overly confusing, doesn't it? Made it $logger_failed ||= $? to preserve the original error, which is probably more helpful. Thanks.
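
For readers less familiar with Perl, the difference between plain assignment and `||=` can be illustrated in Python, where `a or b` plays the same role for integers. Both function names below are hypothetical and exist only for the comparison.

```python
def overwrite_status(logger_failed, exit_status):
    """Plain assignment: a nonzero exit status replaces any earlier error."""
    if exit_status != 0:
        logger_failed = exit_status
    return logger_failed


def keep_first_error(logger_failed, exit_status):
    """Rough equivalent of Perl's `$logger_failed ||= $?`.

    The exit status is used only when no earlier failure was recorded.
    """
    return logger_failed or exit_status
```

For example, with an earlier timeout code of -1 and an exit status of 256, the first version ends up with 256 while the second keeps -1. Either way the result is nonzero, so the failure is still detected.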

#5 Updated by Brett Smith almost 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:6de60c7db7a98405ef7ae4ac5eb20498f095416c.
