Bug #17770

[a-d-c] while in throttle mode, the logs are full of "creating new instance" lines

Added by Ward Vandewege 3 months ago. Updated about 16 hours ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
Story points:
-

Description

When a-d-c gets a cloud error that indicates throttling, it uses a backoff algorithm. This works as expected, but the logs are cluttered with "creating new instance" lines, e.g.:

Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-egcge6nlcyxnhsf","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598610198Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-aexvhfezlpd2u71","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598616679Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-xhpx06i0ao54a3a","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598623987Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-p8o0tdro3r7p2y2","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598630714Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-91mie2u01zeushw","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.598637687Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-596f3cnlaxlgo7k","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705535475Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-oyclj45rigf588i","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705662410Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-30jr9xd0qpazpgb","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705698025Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-91mie2u01zeushw","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705722414Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-ka4mirn7xrx7kng","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705735563Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-sgeierm62ho9isk","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705746712Z"}
Jun 08 17:08:13 2xpu4.arvadosapi.com arvados-dispatch-cloud[22656]: {"ContainerUUID":"2xpu4-dz642-r0xkpp8ijfkunsx","InstanceType":"m58xlarge.spot","PID":22656,"level":"info","msg":"creating new instance","time":"2021-06-08T17:08:13.705758343Z"}

Associated revisions

Revision dee20da0
Added by Ward Vandewege about 16 hours ago

Merge branch '17770-fix-logging-while-throttled'

closes #17770

Arvados-DCO-1.1-Signed-off-by: Ward Vandewege <>

History

#1 Updated by Ward Vandewege 3 months ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 2 months ago

  • Target version deleted (To Be Groomed)

#3 Updated by Tom Clegg about 18 hours ago

In source:lib/dispatchcloud/scheduler/run_queue.go:

                        } else if logger.Info("creating new instance"); sch.pool.Create(it) {
                                // Success. (Note pool.Create works                                                                           
                                // asynchronously and does its own                                                                            
                                // logging, so we don't need to.)                                                                             

Solution is probably be to call sch.pool.Create() first, then log "creating new instance" only if it returns true, i.e., the worker pool is trying to create a new instance in the background.

#4 Updated by Ward Vandewege about 17 hours ago

Tom Clegg wrote:

In source:lib/dispatchcloud/scheduler/run_queue.go:

[...]

Solution is probably be to call sch.pool.Create() first, then log "creating new instance" only if it returns true, i.e., the worker pool is trying to create a new instance in the background.

Oh, nice. I did that in 9792171cd97a576c34d85291ff64cbc29061fc08 and am running tests at https://ci.arvados.org/view/Developer/job/developer-run-tests/2693. Tests passed, I talked to Tom and am merging.

#5 Updated by Ward Vandewege about 16 hours ago

  • Target version set to 2021-09-29 sprint
  • Assigned To set to Ward Vandewege
  • Status changed from New to In Progress

#6 Updated by Ward Vandewege about 16 hours ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved

Also available in: Atom PDF