Bug #14911
closed[dispatch-cloud] Azure driver panic at startup
Description
arvados-dispatch-cloud 1.3.1.20190301145524-1 seems unable to start on c97qk. Systemd unit is in a restart loop.
arvados-dispatch-cloud.service: Service hold-off time over, scheduling restart. Stopped arvados-dispatch-cloud. Starting arvados-dispatch-cloud... panic: nil context goroutine 1 [running]: net/http.(*Request).WithContext(0xc4201ca820, 0x0, 0x0, 0x9e7440) /usr/local/go/src/net/http/request.go:331 +0x1ac git.curoverse.com/arvados.git/vendor/github.com/Azure/azure-sdk-for-go/services/storage/mgmt/2018-02-01/storage.AccountsClient.ListKeysPreparer(0x9e6460, 0xc420203bc0, 0x9e6a80, 0xc42020be30, 0x0, 0x0, 0xdf8475800, 0xd18c2e2800, 0x3, 0x6fc23ac00, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/vendor/github.com/Azure/azure-sdk-for-go/services/storage/mgmt/2018-02-01/storage/accounts.go:676 +0x594 git.curoverse.com/arvados.git/vendor/github.com/Azure/azure-sdk-for-go/services/storage/mgmt/2018-02-01/storage.AccountsClient.ListKeys(0x9e6460, 0xc420203bc0, 0x9e6a80, 0xc42020be30, 0x0, 0x0, 0xdf8475800, 0xd18c2e2800, 0x3, 0x6fc23ac00, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/vendor/github.com/Azure/azure-sdk-for-go/services/storage/mgmt/2018-02-01/storage/accounts.go:637 +0x507 git.curoverse.com/arvados.git/lib/cloud/azure.(*azureInstanceSet).setup(0xc4200e78c0, 0xc42008ee10, 0x24, 0xc42008edb0, 0x24, 0xc42008ede0, 0x2c, 0xc42008ee40, 0x24, 0xc420224150, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/cloud/azure/azure.go:260 +0x498 git.curoverse.com/arvados.git/lib/cloud/azure.newAzureInstanceSet(0xc420182780, 0x1d5, 0x1e0, 0xc420222000, 0x20, 0x9f2fc0, 0xc420099270, 0x7ad373, 0xc4200b3560, 0xc4200b3568, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/cloud/azure/azure.go:224 +0x196 git.curoverse.com/arvados.git/lib/cloud.driverFunc.InstanceSet(0x9a7280, 0xc420182780, 0x1d5, 0x1e0, 0xc420222000, 0x20, 0x9f2fc0, 0xc420099270, 0xc420216c00, 0x9e8c40, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/cloud/interfaces.go:197 +0x7f git.curoverse.com/arvados.git/lib/dispatchcloud.newInstanceSet(0xc4201b32c0, 0xc420222000, 0x20, 0x9f2fc0, 0xc420099270, 0x0, 0x0, 0xffffffffffffffff, 0x0) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/driver.go:25 +0xe7 git.curoverse.com/arvados.git/lib/dispatchcloud.(*dispatcher).initialize(0xc4200d6dc0) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/dispatcher.go:130 +0x2e4 git.curoverse.com/arvados.git/lib/dispatchcloud.(*dispatcher).setup(0xc4200d6dc0) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/dispatcher.go:105 +0x2b git.curoverse.com/arvados.git/lib/dispatchcloud.(*dispatcher).(git.curoverse.com/arvados.git/lib/dispatchcloud.setup)-fm() /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/dispatcher.go:66 +0x2a sync.(*Once).Do(0xc4200d6e50, 0xc4201cbc70) /usr/local/go/src/sync/once.go:44 +0xbe git.curoverse.com/arvados.git/lib/dispatchcloud.(*dispatcher).Start(0xc4200d6dc0) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/dispatcher.go:66 +0x4e git.curoverse.com/arvados.git/lib/dispatchcloud.(*dispatcher).CheckHealth(0xc4200d6dc0, 0xc42020bb90, 0xc4201b32c0) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/dispatchcloud/dispatcher.go:77 +0x2b git.curoverse.com/arvados.git/lib/service.(*command).RunCommand(0xc4201b4a60, 0x7ffd8d024eb9, 0x1f, 0xc420086170, 0x0, 0x0, 0x9e6d60, 0xc420096000, 0x9e6d80, 0xc420096008, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/service/cmd.go:95 +0xa63 git.curoverse.com/arvados.git/lib/cmd.Multi.RunCommand(0xc420170cc0, 0x7ffd8d024eb9, 0x1f, 0xc420086170, 0x0, 0x0, 0x9e6d60, 0xc420096000, 0x9e6d80, 0xc420096008, ...) /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/lib/cmd/cmd.go:60 +0x55e main.main() /tmp/tmp.13cJ0hE7QZ/src/git.curoverse.com/arvados.git/cmd/arvados-server/cmd.go:28 +0xc9 arvados-dispatch-cloud.service: Main process exited, code=exited, status=2/INVALIDARGUMENT Failed to start arvados-dispatch-cloud. arvados-dispatch-cloud.service: Unit entered failed state. arvados-dispatch-cloud.service: Failed with result 'exit-code'.
Updated by Tom Clegg almost 6 years ago
Looks like #14844 moved an Azure API call into the driver initialization func which, as a result, uses az.ctx before it's assigned.
With az.ctx initialization moved up, dispatcher starts up successfully on c97qk.
14911-azure-panic @ f1d38ded6991f7061e2e4d9f61db3501c9c90b2b
Updated by Eric Biagiotti almost 6 years ago
Tom Clegg wrote:
Looks like #14844 moved an Azure API call into the driver initialization func which, as a result, uses az.ctx before it's assigned.
With az.ctx initialization moved up, dispatcher starts up successfully on c97qk.
14911-azure-panic @ f1d38ded6991f7061e2e4d9f61db3501c9c90b2b
LGTM, but while figuring out what context
does, I came across the following on https://golang.org/pkg/context/
Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx
...
Obviously there are exceptions to recommendations like this and I haven't figured out exactly how we use context
, but I just wanted to point this out.
Updated by Tom Clegg almost 6 years ago
Eric Biagiotti wrote:
LGTM, but while figuring out what
context
does, I came across the following on https://golang.org/pkg/context/
Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx
...Obviously there are exceptions to recommendations like this and I haven't figured out exactly how we use
context
, but I just wanted to point this out.
Yes -- stdlib breaks this rule in http.Request, but adding a context to a struct is surely a flag that you might be doing it wrong. Here, arguably the driver should accept a context in List/Create/Destroy calls, leaving the caller responsible for cancelling when it wants to shut everything down.
Updated by Tom Clegg almost 6 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|b20862479a05afec430f10067b8b417418b11eee.