Bug #10120

[Crunch] crunch-dispatch log throttling should not apply to its own stderr

Added by Tom Clegg over 4 years ago. Updated over 3 years ago.


The premise of log throttling is that an unruly job (producing too much log) shouldn't cause crunch-dispatch or postgres to stop doing other work properly. See #3769.

In the current implementation, when logs are being throttled, they also stop appearing in crunch-dispatch's own stderr. As a result, even a sysadmin cannot see what a throttled job is doing. This is especially annoying when the "maximum bytes per job" limit is reached: no feedback is available anywhere until the job finishes.

Throttling crunch-dispatch's stderr seems undesirable:
  • crunch-dispatch's stderr is already assumed to be processed and rotated efficiently by some external process like runit's svlogd, so throttling it saves little
  • the sysadmin's ability to see logs during busy times is important

The only real benefit of throttling logs seems to be avoiding the cost of splitting chunks of stderr into lines and prepending the job uuid to each line as needed.

Perhaps we can make the processing more efficient without losing the logs entirely -- e.g., skip the "prepend job uuid to each line" part, and dump many lines at once to stderr when throttled?
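The proposal above could be sketched roughly as follows. This is an illustrative sketch only, not the actual crunch-dispatch implementation; the class name, byte limit, and method are all hypothetical. When the per-job limit is reached, the sketch skips the per-line uuid-prefixing work and dumps the raw chunk to stderr in a single write, so the sysadmin keeps seeing output:

```python
import sys

class LogThrottle:
    """Hypothetical sketch: rate-limit what goes to the Logs table,
    but always pass raw output through to our own stderr."""

    def __init__(self, job_uuid, bytes_per_job=67108864):
        self.job_uuid = job_uuid
        self.bytes_per_job = bytes_per_job   # illustrative limit
        self.bytes_logged = 0

    def handle_chunk(self, chunk):
        if self.bytes_logged >= self.bytes_per_job:
            # Throttled: skip line-splitting and uuid-prefixing
            # entirely; dump the whole chunk to stderr in one write.
            # Cheap, and the logs are still visible to a sysadmin.
            sys.stderr.write(chunk)
            return []
        # Not throttled: split into lines, prefix each with the job
        # uuid, and return them for insertion into the Logs table.
        lines = ["%s %s" % (self.job_uuid, line)
                 for line in chunk.splitlines(True)]
        self.bytes_logged += len(chunk)
        for line in lines:
            sys.stderr.write(line)
        return lines
```

The key point of the sketch is that the expensive per-line processing is only done for log text that will actually be stored; stderr itself is never rate-limited.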

Related issues

Related to Arvados - Bug #3769: [API] In crunch-dispatch, throttle by bytes_per_minute or _node_minute (Resolved 10/01/2014)


#1 Updated by Tom Clegg over 4 years ago


(doesn't handle the case where the per-job throttle is closed)

#2 Updated by Tom Morris over 3 years ago

  • Target version set to Arvados Future Sprints
