Bug #9996

closed

[keep-balance] Stop retrieving collections from API if the run is going to be aborted anyway

Added by Tom Clegg about 8 years ago. Updated about 8 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
-
Category:
Keep
Target version:
-
Story points:
-

Description

Background

Currently, if one of the srv.Index() goroutines encounters an error, GetCurrentState returns, but its other goroutines keep running. This is wasteful (it puts load on the API server, and the results will never be used) and makes logs confusing (you can get interleaved "collections: x/y" messages from the doomed run and a subsequent run).

The EachCollection loop already checks len(errs)>0, but that condition holds only for a very short time after the first error, because "return <-errs" consumes the error from the channel. Therefore, if only one error happens, the EachCollection loop probably won't notice that it should stop.

Proposed fix

At the end of GetCurrentState, don't call wg.Wait() from a goroutine and rely on errs to decide when to return. Instead, call wg.Wait() directly, then check len(errs) to decide whether to return <-errs or nil.


Related issues

Related to Arvados - Bug #9918: keep-balance fails with "Malformed index line" error (Resolved, Tom Clegg, 09/01/2016)

#1

Updated by Tom Clegg about 8 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:7846843453df9846c346f85c20a8d6d051066f52.

#2

Updated by Joshua Randall about 8 years ago

Thanks, Tom!
