Bug #13959

closed

crunch-dispatch-slurm / Go SDK Dispatcher can block indefinitely on d.Arv.List("containers", params, &list)

Added by Joshua Randall over 5 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: -
Release:
Release relationship: Auto

Description

In the main loop of the Dispatcher's checkForUpdates function, the API List request that fetches a batch of matching containers sometimes appears to block forever. I'm not sure why this happens, but it is probably due to a network or API server issue. In any case, there should be a client-side timeout that prevents this loop from hanging.

Our current workaround is a cron job that runs `systemctl restart crunch-dispatch-slurm` hourly, so that a hung dispatcher recovers at the top of the next hour.


Subtasks 1 (0 open, 1 closed)

Task #14055: Review 13959-timeouts-and-logging (Resolved, Tom Clegg, 08/21/2018)