Bug #5043

[Crunch] accept arbitrarily long stderr lines in crunchstat

Added by Tim Pierce almost 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Start date:
01/21/2015
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

Crunchstat uses a bufio.Scanner to read stderr lines, and as a consequence cannot process input lines longer than bufio.MaxScanTokenSize (64K). (See #4889)

Using a bufio.Reader with ReadString() or something similar would make it more robust to unexpectedly long stderr output, without killing the child task.


Subtasks

Task #5273: Add testResolvedTom Clegg

Task #5274: Avoid parsing tokens in child's stderrResolvedTom Clegg

Task #5360: Review 5043-crunchstat-long-linesResolvedPeter Amstutz

Associated revisions

Revision a04ea95e
Added by Tom Clegg over 5 years ago

Merge branch '5043-crunchstat-long-lines' closes #5043

History

#1 Updated by Tom Clegg almost 6 years ago

  • Story points set to 0.5

#2 Updated by Tom Clegg almost 6 years ago

  • Target version changed from Arvados Future Sprints to 2015-03-11 sprint

#3 Updated by Tom Clegg almost 6 years ago

  • Assigned To set to Tom Clegg

#4 Updated by Tom Clegg over 5 years ago

  • Status changed from New to In Progress

#5 Updated by Tom Clegg over 5 years ago

5043-crunchstat-long-lines at 508bcab includes

Accept arbitrarily long lines on stderr. Split long stderr lines into <= 16KiB segments, emitting "firstpart[...]" then "[...]nextpart[...]" then "[...]lastpart". Splitting provides a few benefits:
  1. crunchstat's caller (and, typically, Workbench) can see some stderr even though a newline might not arrive for a long time.
  2. programs trying to display the resulting log are less likely to behave badly (e.g., less(1) gets very slow when displaying long lines)
  3. crunchstat doesn't consume arbitrary amounts of RAM trying to buffer long lines.

Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.

#6 Updated by Peter Amstutz over 5 years ago

I took some time to beat on this a little bit. Tested this way:

./crunchstat -cgroup-root=/sys/fs/cgroup -poll=1 bash -c "cat ~/ff4a131626aa0883c715d3c5abf394a6 1>&2" 2> out

(ff4a131626aa0883c715d3c5abf394a6 is 28 MiB of manifest on a single line I had lying around from something else..)

Works great!

Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.

You're creating separate "statLog" and "childLog" Loggers that both write to Stderr, and (from looking at the actual log.go code) each one has its own mutex, which means writes to "statLog" are serialized, but this has no effect on writes to "childLog". In practice this doesn't seem to be an issue, but I think we're just getting lucky that Write() on the underlying stderr stream is effectively atomic for small writes.

Just one thing puzzled me in the tests:

t.Fatalf("sent %d bytes, got %d different bytes", len(sentBytes)+1, len(receivedBytes))

Why does it report len(sentBytes)+1 when the test is bytes.Compare(receivedBytes, sentBytes) ?

#7 Updated by Tom Clegg over 5 years ago

Peter Amstutz wrote:

Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.

You're creating separate "statLog" and "childLog" Loggers that both write to Stderr, and (from looking at the actual log.go code) each one has its own mutex, which means writes to "statLog" are serialized, but this has no effect on writes to "childLog". In practice this doesn't seem to be an issue, but I think we're just getting lucky that Write() on the underlying stderr stream is effectively atomic for small writes.

It's not just luck: the Go log library also guarantees "each logging operation makes a single call to the Writer's Write method." The mutex just controls access to the logger's output buffer; os.Stderr.Write() itself is already safe to use from multiple goroutines.

You can exercise this with something like

./crunchstat -cgroup-root=/sys/fs/cgroup -poll=1 sh -c 'yes c------------------------------------------------------------------------ >&2' 2>&1 |egrep -v ^c

Why does it report len(sentBytes)+1 when the test is bytes.Compare(receivedBytes, sentBytes) ?

Oops, fixed @ f1589a6. (This was left over from an uglier version in which sentBytes didn't end with a newline.)

#8 Updated by Peter Amstutz over 5 years ago

Tom Clegg wrote:

It's not just luck: the Go log library also guarantees "each logging operation makes a single call to the Writer's Write method." The mutex just controls access to the logger's output buffer; os.Stderr.Write() itself is already safe to use from multiple goroutines.

Citation? I believe you, but I'm curious where it actually says that in the Go documentation.

In my own test I didn't see any interference, so it does seem to be working in practice.

LGTM.

#9 Updated by Anonymous over 5 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:a04ea95e79c60ed2a54eaec5b5c2e235fe39ef9a.

Also available in: Atom PDF