Bug #3825

[Crunch] crunch-job should save (very) large log files without filling up tmp space

Added by Ward Vandewege almost 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Assigned To: Tim Pierce
Category: Crunch
Target version:
Start date: 10/02/2014
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 1.0

Description

Instead of writing to a temporary file and then calling arv-put, pipe the log directly to an arv-put process.
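
A rough sketch of the idea, for illustration only. crunch-job itself is Perl, so this Python version is not the actual implementation. The LogWriter class and its method names are hypothetical (loosely following the log_writer_* naming suggested in the review below), and the uploader command is passed in as a parameter rather than hard-coding any arv-put arguments. The usage example substitutes a small line-counting child process for arv-put; like arv-put, it consumes stdin and prints a single result line at the end.

```python
import subprocess
import sys


class LogWriter:
    """Streams log lines to a child process's stdin instead of a temp file.

    Hypothetical helper: `command` is the argv list of the uploader
    (in crunch-job this would be an arv-put invocation reading stdin).
    """

    def __init__(self, command):
        # Start the uploader with pipes attached to its stdin and stdout.
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def is_active(self):
        # True while the uploader child process is still running.
        return self.proc is not None and self.proc.poll() is None

    def send(self, line):
        # Each log line goes straight down the pipe; nothing is
        # accumulated on local disk.
        self.proc.stdin.write(line)

    def finish(self):
        # Closing stdin signals end-of-log. The uploader is assumed to
        # print one short result (e.g. a locator) once it has seen EOF.
        self.proc.stdin.close()
        result = self.proc.stdout.read().strip()
        status = self.proc.wait()
        self.proc = None
        if status != 0:
            raise RuntimeError("log uploader exited with status %d" % status)
        return result


if __name__ == "__main__":
    # Stand-in for arv-put: counts the lines it receives on stdin.
    stand_in = [sys.executable, "-c",
                "import sys; print(sum(1 for _ in sys.stdin))"]
    writer = LogWriter(stand_in)
    for i in range(3):
        writer.send("log line %d\n" % i)
    print(writer.finish())
```

Because the child only emits output after it has read all of its input, the parent can keep writing without also draining the child's stdout. An uploader that echoed data back continuously would need its output read concurrently to avoid a pipe deadlock.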


Subtasks

Task #3913: Test crunch-job log output (Resolved, Tim Pierce)

Task #3912: Rewrite crunch_job::Log to pipe output to arv-put (Resolved, Tim Pierce)

Task #4078: Review 3825-crunch-pipe-to-arv-put-final (Resolved, Tim Pierce)

Associated revisions

Revision d6596397
Added by Tim Pierce almost 5 years ago

Merge branch '3825-crunch-pipe-to-arv-put-final'

Closes #3825.

History

#1 Updated by Ward Vandewege almost 5 years ago

  • Description updated (diff)

#2 Updated by Tom Clegg almost 5 years ago

  • Subject changed from [Crunch] fix crunch-dispatch so it does not fill up tmp space when a job has a (very) large log file to [Crunch] crunch-job should save (very) large log files without filling up tmp space
  • Description updated (diff)
  • Category set to Crunch

#3 Updated by Tom Clegg almost 5 years ago

  • Target version changed from Arvados Future Sprints to 2014-10-08 sprint

#4 Updated by Tim Pierce almost 5 years ago

  • Assigned To set to Tim Pierce

#5 Updated by Tim Pierce almost 5 years ago

  • Status changed from New to In Progress

#6 Updated by Tom Clegg almost 5 years ago

Reviewing 3825-crunch-pipe-to-arv-put at c6a6231

  • I think it would read better if output_log_* were renamed to log_writer_*
    • The word "output" usually means "output of a task or job" around here, and that's already a bit overloaded and confusing.
    • Perhaps write_output_log should rename to something slightly different than write_log_writer, though. Maybe write_to_log_writer or log_writer_send?
    • I do like the idea of using the whole substring log_writer (née output_log) in all of the related functions. It makes a really clear distinction between "this log pipe" and "some mysterious perl feature" and "some other logging-related thing we do here" etc.
  • I think a semicolon, although not strictly necessary, would be better form here:
        +sub output_log_is_active() {
        +  return $log_pipe_pid
        +}

Other than that, LGTM!

#7 Updated by Tim Pierce almost 5 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 67 to 100

Applied in changeset arvados|commit:d65963976174bd9c94fdb0b91eeb8a281f01e7b3.

#8 Updated by Ward Vandewege almost 5 years ago

  • Target version changed from 2014-10-08 sprint to 2014-10-29 sprint

#9 Updated by Ward Vandewege almost 5 years ago

  • Target version changed from 2014-10-29 sprint to 2014-10-08 sprint
