Bug #17770
closed[a-d-c] while in throttle mode, the logs are full of "creating new instance" lines
Description
When a-d-c gets a cloud error that indicates throttling, it uses a backoff algorithm. This works as expected, but the logs are cluttered with "creating new instance" lines, e.g.:
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-egcge6nlcyxnhsf","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598610198Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-aexvhfezlpd2u71","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598616679Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-xhpx06i0ao54a3a","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598623987Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-p8o0tdro3r7p2y2","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598630714Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-91mie2u01zeushw","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598637687Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-596f3cnlaxlgo7k","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705535475Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-oyclj45rigf588i","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705662410Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-30jr9xd0qpazpgb","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705698025Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-91mie2u01zeushw","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705722414Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-ka4mirn7xrx7kng","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705735563Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-sgeierm62ho9isk","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705746712Z"} Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-r0xkpp8ijfkunsx","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705758343Z"}
Updated by Peter Amstutz over 3 years ago
- Target version deleted (
To Be Groomed)
Updated by Tom Clegg over 3 years ago
In source:lib/dispatchcloud/scheduler/run_queue.go:
} else if logger.Info("creating new instance"); sch.pool.Create(it) {
// Success. (Note pool.Create works
// asynchronously and does its own
// logging, so we don't need to.)
Solution is probably be to call sch.pool.Create() first, then log "creating new instance" only if it returns true, i.e., the worker pool is trying to create a new instance in the background.
Updated by Ward Vandewege over 3 years ago
Tom Clegg wrote:
In source:lib/dispatchcloud/scheduler/run_queue.go:
[...]
Solution is probably be to call sch.pool.Create() first, then log "creating new instance" only if it returns true, i.e., the worker pool is trying to create a new instance in the background.
Oh, nice. I did that in 9792171cd97a576c34d85291ff64cbc29061fc08 and am running tests at developer-run-tests: #2693 . Tests passed, I talked to Tom and am merging.
Updated by Ward Vandewege over 3 years ago
- Target version set to 2021-09-29 sprint
- Assigned To set to Ward Vandewege
- Status changed from New to In Progress
Updated by Ward Vandewege over 3 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|dee20da037920692508446dc5f10d71ee3ab1524.