Story #13048

Updated by Tom Clegg almost 3 years ago

Functionally, source:services/crunch-run is doing a reasonable job. However, the way it's implemented makes it difficult to make some of the changes we want.

Relevant issues
* #10181 save logs to keep periodically while a container is running (not just after it exits & saves staged outputs)
* #13005 timestamps are sometimes wrong/confusing because of throttle behavior
* source:services/crunch-run and source:sdk/go/crunchrunner should drop their custom manifest-writing code, now that we have generalized write support in #12483
* The implementation is more complicated / harder to follow than it should be, given the low complexity of the problem it's solving

Proposed improvements
* Refactor the various functional aspects (add timestamps, throttle, write to apiserver) into modular parts that communicate through simple interfaces like io.Writer.
* Use io.MultiWriter from stdlib, instead of custom routing built into the processing modules.
* Use @(*arvados.Collection)FileSystem()@ to open/write log files (and staged outputs? → delete @upload*.go@)
* Drop the pretense of splitting long lines (apparently this isn't needed; MaxLogLine seems to have been disconnected 2 years ago in commit:b719ef57055ba2fd06c7a1377cc0d47ee5df935e)