Bug #14398

[keep-balance] deadlock on index retrieval error

Added by Tom Clegg 11 months ago. Updated 10 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Keep
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Story points: -
Release:
Release relationship: Auto

Description

The error handling in keep-balance's GetCurrentState func depends on the "errs" channel being large enough to hold every possible error from its goroutines. This was correct when the number of goroutines was 2+nServers, but the number of goroutines later changed to 2+nMounts. GetCurrentState now deadlocks if 2+nMounts > 2+nServers (i.e., there are more mounts than servers) and all mounts return errors (e.g., wrong auth token): once the channel is full, the remaining goroutines block on their sends.

Rather than relying on the channel size, fix this by using channel size 1 and ignoring subsequent errors once the channel is full (they won't be reported anyway).
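
As a rough illustration of the approach described above (not the actual keep-balance code), the following Go sketch uses an error channel with capacity 1 and a non-blocking send, so a goroutine never blocks when an error has already been recorded. The names firstError, errIndex, and the work callback are hypothetical, chosen only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errIndex is a hypothetical sentinel error standing in for an index
// retrieval failure (e.g., wrong auth token).
var errIndex = errors.New("index retrieval failed")

// firstError runs work(i) in nWorkers goroutines and returns one of the
// errors they produce, or nil if none failed. This is an illustrative
// helper, not the GetCurrentState implementation.
func firstError(nWorkers int, work func(i int) error) error {
	// Capacity 1: room for exactly one error, however many workers run.
	errs := make(chan error, 1)

	var wg sync.WaitGroup
	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := work(i); err != nil {
				select {
				case errs <- err:
					// This error filled the single slot.
				default:
					// Slot already taken: drop the error
					// instead of blocking on the send.
				}
			}
		}(i)
	}
	wg.Wait()

	// Non-blocking receive: report the stored error, if any.
	select {
	case err := <-errs:
		return err
	default:
		return nil
	}
}

func main() {
	// Every worker fails, mirroring the "all mounts return errors" case.
	err := firstError(10, func(i int) error { return errIndex })
	fmt.Println("error reported:", err)
}
```

Because no send can block, the function returns regardless of how the number of goroutines relates to the channel size, which is the property the fix relies on.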

Associated revisions

Revision dd5efa11
Added by Tom Clegg 11 months ago

Merge branch '14398-error-deadlock'

fixes #14398

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Clegg 11 months ago

#2 Updated by Tom Clegg 11 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

#3 Updated by Tom Morris 10 months ago

  • Release set to 14
