Bug #9996
closed [keep-balance] Stop retrieving collections from API if the run is going to be aborted anyway
Description
Background
Currently, if one of the srv.Index() goroutines encounters an error, GetCurrentState returns, but its other goroutines keep running. This is wasteful (it puts load on the API server, and the results will never be used) and makes logs confusing (you can get interleaved "collections: x/y" messages from the doomed run and a subsequent run).
The EachCollection loop already checks len(errs)>0, but that condition holds only for a very short time after the first error, because "return <-errs" consumes the error. Therefore, if only one error occurs, the EachCollection loop probably won't notice that it should stop.
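The effect described above can be reproduced in isolation. The following is a minimal sketch, not the keep-balance code: observeLen and the "index failed" error are hypothetical names, and the only assumption is that errs is a buffered error channel.

```go
package main

import "errors"

// observeLen is a hypothetical helper showing why len(errs) > 0 is only a
// transient signal: once the error has been received from the channel, a
// later len(errs) check no longer sees it.
func observeLen() (before, after bool) {
	errs := make(chan error, 2)

	// A worker goroutine would do this when it fails.
	errs <- errors.New("index failed")

	// A loop that checks at this moment would notice the error.
	before = len(errs) > 0

	// The equivalent of "return <-errs": the error is consumed.
	<-errs

	// A loop that checks from now on sees an empty channel again.
	after = len(errs) > 0
	return before, after
}
```

Calling observeLen() yields before == true and after == false, so a loop that polls len(errs) only after the receive has happened keeps running.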
Proposed fix
At the end of GetCurrentState, stop calling wg.Wait() from a separate goroutine and relying on errs to decide when to return. Instead, call wg.Wait() directly, then check len(errs) to decide whether to return <-errs or nil.
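A rough sketch of the proposed pattern follows. It is an illustration under stated assumptions, not the applied changeset: task, runAll, and demo are hypothetical names, runAll stands in for the tail of GetCurrentState, and the paging loop stands in for the EachCollection loop.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// task is a hypothetical unit of work (for example, one index or paging
// loop). It should poll aborted() and return early once it reports true.
type task func(aborted func() bool) error

// runAll starts every task, waits for all of them to finish, and only then
// decides what to return.
func runAll(tasks []task) error {
	// One buffer slot per task, so a failing task never blocks on send.
	errs := make(chan error, len(tasks))
	aborted := func() bool { return len(errs) > 0 }

	var wg sync.WaitGroup
	for _, t := range tasks {
		wg.Add(1)
		go func(t task) {
			defer wg.Done()
			if err := t(aborted); err != nil {
				errs <- err
			}
		}(t)
	}

	// Nothing receives from errs until every goroutine has finished, so
	// len(errs) > 0 stays true from the first error onward.
	wg.Wait()
	if len(errs) > 0 {
		return <-errs
	}
	return nil
}

// demo runs one task that fails immediately and one paging loop that stops
// as soon as it sees the abort signal.
func demo() (pages int, err error) {
	failing := func(aborted func() bool) error {
		return errors.New("index failed")
	}
	paging := func(aborted func() bool) error {
		for !aborted() {
			pages++ // stands in for fetching one page of collections
			time.Sleep(time.Millisecond)
		}
		return nil
	}
	err = runAll([]task{failing, paging})
	return pages, err
}
```

Because the error stays in the channel until wg.Wait() returns, every loop that polls len(errs) gets a chance to see it and stop. The trade-off is that runAll returns only after the slowest task has noticed the abort, not at the moment of the first error.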
Updated by Tom Clegg over 8 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:7846843453df9846c346f85c20a8d6d051066f52.