Bug #14920

Updated by Tom Clegg almost 3 years ago

Currently, when a-d-c uses the Azure driver, new instances have state=unknown (instead of the expected state=booting) until the boot/run probes pass.

The "unknown" state is intended to cover the case where the "list instances" call returns a previously unseen instance ID. In the Azure case, the "create VM" call does not even return the ID of the newly created instance until the instance has finished booting. Until then, the dispatcher's worker pool sees the new instance in "list instances" results but cannot tell that it corresponds to an outstanding "create" call.

Some different ways to address this:
* In the Azure driver, return as soon as the instance ID is known, instead of waiting for it to boot. This is how the driver is expected to work, but the Azure client library might not make this easy.
* In the worker pool, when an unexpected instance ID appears, check whether its "secret token" tag matches an outstanding Create call. This would also cover the "list returns before create" race, which applies to all drivers.
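A rough sketch of the second option, to make the idea concrete. All type, function, and tag names here (Pool, StartCreate, Observe, tagKeySecret) are hypothetical and not taken from the a-d-c source; the sketch only assumes that each Create call tags the new instance with a secret token that the pool already knows.

```go
package main

import "fmt"

// All names in this sketch are hypothetical, not the real a-d-c API.

type InstanceID string

type State string

const (
	StateUnknown State = "unknown"
	StateBooting State = "booting"
)

// tagKeySecret is an assumed name for the "secret token" tag.
const tagKeySecret = "InstanceSecret"

// Instance is what a "list instances" call is assumed to return.
type Instance struct {
	ID   InstanceID
	Tags map[string]string
}

// Pool is a stripped-down stand-in for the dispatcher's worker pool.
type Pool struct {
	// creating holds the secret of every Create call that has not
	// yet been matched to an instance ID.
	creating map[string]bool
	// workers maps known instance IDs to their state.
	workers map[InstanceID]State
}

func NewPool() *Pool {
	return &Pool{
		creating: map[string]bool{},
		workers:  map[InstanceID]State{},
	}
}

// StartCreate records the secret that will be tagged onto the
// instance requested by an outstanding Create call.
func (p *Pool) StartCreate(secret string) {
	p.creating[secret] = true
}

// Observe handles one entry from a "list instances" result and
// returns the state the pool assigns to it.
func (p *Pool) Observe(inst Instance) State {
	if st, ok := p.workers[inst.ID]; ok {
		return st // already known
	}
	st := StateUnknown
	// Unexpected ID: check whether its secret matches an
	// outstanding Create call before falling back to "unknown".
	if secret := inst.Tags[tagKeySecret]; secret != "" && p.creating[secret] {
		delete(p.creating, secret)
		st = StateBooting
	}
	p.workers[inst.ID] = st
	return st
}

func main() {
	p := NewPool()
	p.StartCreate("s3cr3t")
	ours := Instance{ID: "vm-1", Tags: map[string]string{tagKeySecret: "s3cr3t"}}
	other := Instance{ID: "vm-2"}
	fmt.Println(p.Observe(ours), p.Observe(other))
}
```

Because the match is made on the tag rather than on the return value of Create, the same check would apply when a list result arrives before Create returns, whichever driver is in use.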

The second option seems better.

It would also be worth documenting the expected driver behavior in the driver interface definition: Create() should generally return as soon as the new instance's ID is known, but must not return so early that a subsequent call to Instances() might not include the new instance.
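As a starting point, the documentation could take the form of doc comments along these lines. This is only a sketch: the Driver interface below is a simplified, hypothetical stand-in for the real driver interface, and fakeDriver exists only to show an implementation that satisfies the stated contract.

```go
package main

import (
	"fmt"
	"sync"
)

// InstanceID and Tags are simplified, hypothetical types.
type InstanceID string
type Tags map[string]string

// Driver is a hypothetical, reduced version of the driver interface.
type Driver interface {
	// Create requests a new instance with the given tags.
	//
	// Create should generally return as soon as the new
	// instance's ID is known, without waiting for the instance
	// to boot. It must not return so early that a subsequent
	// call to Instances might not include the new instance.
	Create(tags Tags) (InstanceID, error)

	// Instances returns the IDs of all existing instances,
	// including any instance for which Create has returned.
	Instances() ([]InstanceID, error)
}

// fakeDriver is an in-memory driver used only for illustration.
type fakeDriver struct {
	mtx  sync.Mutex
	next int
	ids  []InstanceID
}

func (d *fakeDriver) Create(tags Tags) (InstanceID, error) {
	d.mtx.Lock()
	defer d.mtx.Unlock()
	d.next++
	id := InstanceID(fmt.Sprintf("fake-%d", d.next))
	// Record the instance before returning, so Instances
	// already includes it by the time the caller has the ID.
	d.ids = append(d.ids, id)
	return id, nil
}

func (d *fakeDriver) Instances() ([]InstanceID, error) {
	d.mtx.Lock()
	defer d.mtx.Unlock()
	// Return a copy so callers cannot modify the driver's list.
	return append([]InstanceID(nil), d.ids...), nil
}

func main() {
	var d Driver = &fakeDriver{}
	id, _ := d.Create(Tags{"InstanceSecret": "s3cr3t"})
	ids, _ := d.Instances()
	fmt.Println(id, ids)
}
```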