Idea #13048

open

Refactor crunch2 logging

Added by Tom Clegg almost 7 years ago. Updated 9 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date:
Due date:
Story points: 2.0
Release:
Release relationship: Auto

Description

Functionally, source:services/crunch-run is doing a reasonable job. However, the way it's implemented makes it difficult to make some of the changes we want.

Relevant issues
  • #10181 save logs to keep periodically while a container is running (not just after it exits & saves staged outputs)
  • #13005 timestamps are sometimes wrong/confusing because of throttle behavior
  • #13100 source:services/crunch-run and source:sdk/go/crunchrunner should drop their custom manifest-writing code, now that we have generalized write support in #12483
  • The implementation is more complicated / harder to follow than it should be, given the low complexity of the problem it's solving

Proposed improvements
  • Refactor the various functional aspects (add timestamps, throttle, write to apiserver) into modular parts that communicate through simple interfaces like io.Writer.
  • Use io.MultiWriter from stdlib, instead of custom routing built into the processing modules.
  • Use (*arvados.Collection).FileSystem() to open/write log files (and staged outputs? → delete upload*.go)
  • Drop the pretense of splitting long lines (apparently this isn't needed; MaxLogLine seems to have been disconnected 2 years ago in b719ef57055ba2fd06c7a1377cc0d47ee5df935e)

Related issues

Related to Arvados - Feature #10181: Crunch job output logging improvement stories (Resolved, Tom Clegg, 02/16/2017)
Related to Arvados - Bug #13005: [Crunch2] All stdout gets the same timestamp and other logging problems (New)
Related to Arvados - Bug #13100: [crunch-run] Replace custom manifest-writing code with collectionFS (Resolved, Tom Clegg, 03/15/2018)