Bug #4266
[Crunch] crunch-dispatch should process logs efficiently, particularly while throwing away excessive log data.
Status: Closed
Description
Excerpt from top(1), showing crunch-dispatch's ruby process consuming most of a CPU while the job's arv-put sits nearly idle:

  PID USER   PR NI VIRT  RES  SHR  S %CPU %MEM TIME+     COMMAND
29768 root   20  0 432m  241m 6656 R 74.8  1.6 946:59.33 ruby
16756 crunch 20  0 1447m 1.2g 4472 S  0.3  8.4   0:34.07 arv-put
Updated by Tom Clegg about 10 years ago
- Description updated (diff)
- Category set to Crunch
- Assigned To set to Tom Clegg
- Possibly relevant: http://serverfault.com/questions/452833/how-to-resolve-high-cpu-excessive-stat-etc-localtime-and-clock-gettimeclo
- Either way, "time1 - time0 > n_sec" could be significantly slower than precomputing "time2 = time0 + n_sec" and then testing "time1 > time2" (see the sketch after this list).
- crunch-dispatch does a lot of string manipulation per line during throttled periods.
- We seem to have distinct limiting mechanisms for "number of rows stored in the logs table" and "number of lines per minute".
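A sketch of the precomputation idea from the second bullet (names and values are made up for illustration, not taken from the branch):

throttle_period = 60           # seconds (illustrative value)
interval_start  = Time.now

# The slow way: a Time subtraction (and Float comparison) on every line.
expired = (Time.now - interval_start > throttle_period)

# Faster: compute the deadline once per interval; each per-line check
# is then a single Time comparison with no arithmetic.
interval_end = interval_start + throttle_period
expired = (Time.now > interval_end)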
Updated by Tom Clegg about 10 years ago
4266-log-performance @ 2e76e6a
Laptop tests show crunch-dispatch.rb accepting ~200MiB/s using 80% CPU while the throttle is closed, with yes(1) using 20% CPU to send 100-byte lines. Much better than the previous figures of 100% and 3% respectively.
Updated by Brett Smith about 10 years ago
Reviewing 2e76e6a. I mostly have questions.
- There's a lot of code cleanup here, which provides small improvements like avoiding duplicate method calls and conditional checks, but it looks like most of the performance improvement comes from increasing the maximum read buffer size. Is that about right? (Thank you for commenting that so clearly.)
- Why is this branch removing crunch_limit_log_events_per_job?
- As a suggestion in the spirit of your code cleanup, the loop in read_pipes that reads lines from streambuf has a number of redundant checks. I think it could be both simplified and optimized along these lines:

bufend = ''
streambuf.each_line do |line|
  if not line.end_with? "\n"
    bufend = line
    break
  # rate_limit returns true or false as to whether to actually log
  # the line or not. It also modifies "line" in place to replace
  # it with an error if a logging limit is tripped.
  elsif rate_limit j, line
    $stderr.print "#{job_uuid} ! " unless line.index(job_uuid)
    $stderr.puts line
    pub_msg = "#{Time.now.ctime.to_s} #{line.strip}\n"
    j[:stderr_buf_to_flush] << pub_msg
  end
end
streambuf.replace bufend
Thanks.
Updated by Tom Clegg about 10 years ago
Brett Smith wrote:
Reviewing 2e76e6a. I mostly have questions.
- There's a lot of code cleanup here, which provides small improvements like avoiding duplicate method calls and conditional checks, but it looks like most of the performance improvement comes from increasing the maximum read buffer size. Is that about right? (Thank you for commenting that so clearly.)
There were a bunch of little things (like having one flag saying "not logging now"). I think the other big improvement (~3x?) was the "short circuit": skipping the "split buffer into N lines and call rate_limit() and write_log(j) N times" stuff during intervals when we're refusing to output anything anyway.
I ended up making a variety of readability and bug fixes too though.
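For illustration, the short circuit could look something like this (a sketch only: the 1 MiB read size and the hash keys are assumptions from this thread, pipe stands in for the job's stderr IO, and this is not the actual branch code):

j = {log_throttle_is_open: false, log_throttle_bytes_skipped: 0}
streambuf = ''
begin
  while buf = pipe.readpartial(2**20)   # read up to 1 MiB per pass
    if not j[:log_throttle_is_open]
      # Throttle closed: just account for the bytes, skipping the
      # per-line rate_limit()/write_log() work entirely.
      j[:log_throttle_bytes_skipped] += buf.size
    end
    streambuf << buf
    # ...per-line processing of streambuf goes here, when open...
  end
rescue EOFError
  # stderr pipe closed; the job is finishing
end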
- Why is this branch removing crunch_limit_log_events_per_job?
I noticed a bug (which had been fixed before, but reintroduced when read_pipes got refactored): hitting crunch_limit_log_events_per_job would actually result in the silenced logs being buffered in RAM. While trying to identify a scenario where this could happen, I noticed that the relationship between the old events_per_job limit and the new bytes_per_job, bytes_per_interval, and lines_per_interval limits is hard to pin down, and unpredictable at best. I figured (and Ward didn't disagree) that the new limits alone give enough control, and are easy to understand.
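For a rough picture of how those three remaining limits could fit together (the constant names and values here are illustrative, echoing the limit names above rather than the actual configuration keys):

LOG_BYTES_PER_JOB      = 64 << 20  # total log bytes kept per job
LOG_BYTES_PER_INTERVAL = 1 << 20   # bytes allowed per throttle interval
LOG_LINES_PER_INTERVAL = 1024      # lines allowed per throttle interval

# The throttle stays open only while every limit still has headroom.
def log_throttle_open?(j)
  j[:bytes_logged_for_job] < LOG_BYTES_PER_JOB and
    j[:bytes_this_interval] < LOG_BYTES_PER_INTERVAL and
    j[:lines_this_interval] < LOG_LINES_PER_INTERVAL
end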
- As a suggestion in the spirit of your code cleanup, the loop in read_pipes that reads lines from streambuf has a number of redundant checks. I think it could be both simplified and optimized along these lines:
[...]
Yes, that's easier to follow, thanks. Done.
Also noticed & fixed an OOM opportunity (no newline → infinite buffering).
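The fix might be sketched along these lines (hypothetical: the cap, the truncation marker, and the reuse of the bufend name from the loop above are all assumptions, not the committed code):

MAX_PARTIAL_LINE = 1 << 20   # 1 MiB cap, illustrative
# If a partial line never gets its newline, don't buffer it forever:
# flush a truncated copy with some indication that it is incomplete.
if bufend.size > MAX_PARTIAL_LINE
  line = bufend[0, MAX_PARTIAL_LINE] + " [...]\n"
  bufend = ''
  # ...send line through the normal rate_limit/logging path...
end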
Updated by Brett Smith about 10 years ago
Reviewing de79b73
Tom Clegg wrote:
Brett Smith wrote:
- There's a lot of code cleanup here, which provides small improvements like avoiding duplicate method calls and conditional checks, but it looks like most of the performance improvement comes from increasing the maximum read buffer size. Is that about right? (Thank you for commenting that so clearly.)
There were a bunch of little things (like having one flag saying "not logging now"). I think the other big improvement (~3x?) was the "short circuit": skipping the "split buffer into N lines and call rate_limit() and write_log(j) N times" stuff during intervals when we're refusing to output anything anyway.
Thanks. Looking at the short-circuit condition again, is it possible we'll double-count some buf bytes as skipped? It looks to me like if the throttle is closed, the "each buffer" loop will read a new buf, tally its bytes as skipped under "if not j[:log_throttle_is_open]", then append it to streambuf. Anything in there that isn't line-delimited will stay in streambuf, where it will be counted again under "if not j[:log_throttle_is_open]" next time. Am I missing something?
It feels like there should be a "next" under "if not j[:log_throttle_is_open]", although that might not actually be correct.
- As a suggestion in the spirit of your code cleanup, the loop in read_pipes that reads lines from streambuf has a number of redundant checks. I think it could be both simplified and optimized along these lines:
[...]

Yes, that's easier to follow, thanks. Done.
Sure. I wouldn't bring this up if the point of this branch wasn't optimization, but according to the docs, lines is just short for each_line.to_a. Why add the overhead of building an array that we only enumerate? If you want to specify the line separator, you can do that with each_line too.
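For illustration, in plain Ruby:

s = "a\nb\nc\n"
s.lines                       # => ["a\n", "b\n", "c\n"] (allocates an Array)
s.each_line { |l| p l }       # enumerates lines with no intermediate Array
s.each_line("\n") { |l| p l } # a custom separator works with each_line too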
Also noticed & fixed an OOM opportunity (no newline → infinite buffering).
There's a small typo at the end of the comments you added here: "some indication that it [sic] incomplete."
Updated by Tom Clegg about 10 years ago
Brett Smith wrote:
Thanks. Looking at the short-circuit condition again, is it possible we'll double-count some buf bytes as skipped? It looks to me like if the throttle is closed, the "each buffer" loop will read a new buf, tally its bytes as skipped under "if not j[:log_throttle_is_open]", then append it to streambuf. Anything in there that isn't line-delimited will stay in streambuf, where it will be counted again under "if not j[:log_throttle_is_open]" next time. Am I missing something?
It feels like there should be a "next" under "if not j[:log_throttle_is_open]", although that might not actually be correct.
You're right, there should be a "next" there. Not only does that fix the accounting, it actually gives the kind of performance I was hoping for when I put it there: now crunch-dispatch uses about 1.5× as much CPU as yes in my test case (was 30× before this branch).
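In outline, the fixed accounting might look like this (a sketch of the body of the "each buffer" loop, using the same assumed names as the sketch earlier in this thread, not the committed diff):

if not j[:log_throttle_is_open]
  # Count the bytes as skipped exactly once, then skip the per-line
  # work; nothing is appended to streambuf, so nothing left over can
  # be re-counted on the next pass.
  j[:log_throttle_bytes_skipped] += streambuf.size + buf.size
  streambuf.replace ''
  next
end
streambuf << buf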
Sure. I wouldn't bring this up if the point of this branch wasn't optimization, but according to the docs, lines is just short for each_line.to_a. Why add the overhead of building an array that we only enumerate? If you want to specify the line separator, you can do that with each_line too.
Oops, I just missed copying that from your example. Fixed now. Looping with each_line is certainly better than constructing an array. I also used $/ instead of "\n" in the following line, since that's what each_line uses by default and is, in fact, what we mean by "\n". (Right?)
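In plain Ruby, for the record:

$/                              # => "\n": Ruby's input record separator
"a\nb\n".each_line.to_a ==
  "a\nb\n".each_line($/).to_a   # => true; each_line defaults to $/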
There's a small typo at the end of the comments you added here: "some indication that it [sic] incomplete."
Fixed.
Now at 6ad9adc
Updated by Brett Smith about 10 years ago
Tom Clegg wrote:
Looping with each_line is certainly better than constructing an array. I also used $/ instead of "\n" in the following line, since that's what each_line uses by default and is, in fact, what we mean by "\n". (Right?)
Yeah, that makes sense to me. 6ad9adc1 is good to merge, thanks.
Updated by Tom Clegg about 10 years ago
- Status changed from In Progress to Resolved