Bug #3769

[API] In crunch-dispatch, throttle by bytes_per_minute or _node_minute

Added by Ward Vandewege over 5 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Start date:
10/01/2014
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
1.0

Description

crunch_limit_log_event_bytes_per_job is currently a hard-coded number in the server config. It would be much more useful to have a number that is proportional to the number of task hours in the job. That way, long-running or big jobs are allowed more log output.

If it's easier, adding a crunch_limit_log_event_bytes_per_task field could be good enough; I could then leave crunch_limit_log_event_bytes_per_job unlimited (or very high).

Implementation:
  • Add a crunch_limit_log_event_bytes_per_minute config to services/api/config/application.default.yml
  • In services/api/script/crunch-dispatch.rb, temporarily stop propagating log messages to the logs table when the per-minute threshold is exceeded.
    • This doesn't have to be perfect: dividing the job time into 60-second chunks and throttling each chunk independently should be good enough, even though it means a job can exceed the threshold between t=30s and t=90s. (At worst, a job can exceed the threshold by a factor of 2 this way, and can't exceed it continuously.)
    • When the limit is hit, emit "[log messages suppressed for N seconds]" to the logs table (N = seconds until next 60-second interval starts).
    • When resuming logging, emit "[N bytes of log messages skipped]" to the logs table.
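
The chunked throttling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was merged: the `LogThrottle` class, its `handle` method, and the numeric timestamps are hypothetical names chosen for the example, and `bytes_per_minute` stands in for the proposed crunch_limit_log_event_bytes_per_minute setting.

```ruby
# Hypothetical sketch of per-minute log throttling; the class and method
# names are illustrative and do not come from crunch-dispatch.rb.
class LogThrottle
  INTERVAL = 60 # length of each independently throttled chunk, in seconds

  # bytes_per_minute: assumed to come from the proposed
  # crunch_limit_log_event_bytes_per_minute config value.
  # start_time: job start, as a number of seconds (a Time also works).
  def initialize(bytes_per_minute, start_time)
    @limit = bytes_per_minute
    @window_start = start_time
    @bytes_in_window = 0
    @skipped_bytes = 0
    @suppressed = false
  end

  # Returns the lines that should be written to the logs table for this
  # input line: possibly none, possibly a notice, possibly the line itself.
  def handle(line, now)
    out = []

    # Roll over to the current 60-second chunk if one or more have elapsed.
    if now - @window_start >= INTERVAL
      elapsed_chunks = ((now - @window_start) / INTERVAL).floor
      @window_start += elapsed_chunks * INTERVAL
      @bytes_in_window = 0
      if @suppressed
        out << "[#{@skipped_bytes} bytes of log messages skipped]"
        @skipped_bytes = 0
        @suppressed = false
      end
    end

    # While suppressed, only count what is being dropped.
    if @suppressed
      @skipped_bytes += line.bytesize
      return out
    end

    @bytes_in_window += line.bytesize
    if @bytes_in_window > @limit
      # Threshold exceeded: drop this line and announce the suppression.
      @suppressed = true
      @skipped_bytes = line.bytesize
      remaining = (@window_start + INTERVAL - now).ceil
      out << "[log messages suppressed for #{remaining} seconds]"
    else
      out << line
    end
    out
  end
end
```

In this sketch the "skipped" notice is only produced when the next line arrives after the chunk boundary, so a job that goes quiet would not report its skipped bytes until it logs again. A timer-driven variant could emit the notice at the boundary itself.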

Subtasks

Task #4061: Review 3769-throttle-logs (Resolved, Ward Vandewege)

Task #3909: Add throttle logic (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Bug #10120: [Crunch] crunch-dispatch log throttling should not apply to its own stderr (New, 09/22/2016)

Associated revisions

Revision 60998a38
Added by Peter Amstutz about 5 years ago

Merge branch '3769-throttle-logs' closes #3769

History

#1 Updated by Tim Pierce over 5 years ago

  • Description updated (diff)

#2 Updated by Peter Amstutz about 5 years ago

  • Target version set to Arvados Future Sprints

#3 Updated by Tom Clegg about 5 years ago

  • Subject changed from [API] crunch_limit_log_event_bytes_per_job is not smart enough to [API] In crunch-dispatch, throttle by bytes_per_minute or _node_minute
  • Description updated (diff)
  • Category set to Crunch

#4 Updated by Tom Clegg about 5 years ago

  • Target version changed from Arvados Future Sprints to 2014-10-08 sprint

#5 Updated by Peter Amstutz about 5 years ago

  • Assigned To set to Peter Amstutz

#6 Updated by Peter Amstutz about 5 years ago

  • Story points changed from 0.5 to 1.0

#7 Updated by Peter Amstutz about 5 years ago

Should it print "log messages silenced" every 60 seconds as long as the rate is being exceeded, or only at the start?

#8 Updated by Peter Amstutz about 5 years ago

  • Status changed from New to In Progress

#9 Updated by Ward Vandewege about 5 years ago

services/api/config/application.default.yml

+ # database. Logs lines are buffered until either crunch_log_bytes_per_event

Make that "Log lines".

+ # has been reached or crunch_log_seconds_between_events has ellapsed since

One l in elapsed.

In services/api/script/crunch-dispatch.rb:

+ #puts "Handle line at #{now - running_job[:log_throttle_timestamp]}, buf bytes #{line.size}, so far #{running_job[:log_throttle_bytes_so_far]}, throttled #{running_job[:log_throttl
+

That line can go I think?

Other than that, the code looks like it should do what the description says. I assume you've tested this locally?

#10 Updated by Anonymous about 5 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:60998a3875f79482533976e6e0ee0f99a9589c46.
