Bug #14844
Closed: [dispatch-cloud] Azure driver bugs discovered in trial run
Status: Resolved
Priority: Normal
Assigned To:
Category: Crunch
Target version:
Story points: 1.0
Release:
Release relationship: Auto
Description
- If creating a VM fails, an attempt should be made to delete the VM's dependent resources (nic/blob) before returning the error to Create()'s caller. As it stands, an unbounded number of new unused nics and blobs pile up during times when VMs can't be created and the dispatcher keeps retrying.
- nil pointer panic in (*AzureInstance).Address(), perhaps on a newly created instance that has no IP address assigned yet (see note)
Updated by Tom Clegg almost 6 years ago
- Related to Idea #13908: [Epic] Replace SLURM for cloud job scheduling/dispatching added
Updated by Tom Clegg almost 6 years ago
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x83aab5]

goroutine 102 [running]:
git.curoverse.com/arvados.git/lib/cloud.(*AzureInstance).Address(0xc420478500, 0x7f16da9a9628, 0xc420478500)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/cloud/azure.go:633 +0x15
git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor.(*Executor).setupSSHClient(0xc420368ea0, 0xc42061a6e7, 0xc420368e01, 0xc4204b88a0)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor/executor.go:178 +0x61
git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor.(*Executor).sshClient(0xc420368ea0, 0x1, 0x0, 0x0, 0x0)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor/executor.go:153 +0x10f
git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor.(*Executor).newSession.func1(0x8f7c01, 0x0, 0x9ebaa0, 0xc4204b88b0)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor/executor.go:128 +0x37
git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor.(*Executor).newSession(0xc420368ea0, 0x0, 0x8e5c40, 0xc420253710)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor/executor.go:136 +0xa0
git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor.(*Executor).Execute(0xc420368ea0, 0x0, 0xc4201df740, 0x19, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/ssh_executor/executor.go:92 +0x73
git.curoverse.com/arvados.git/lib/dispatchcloud/worker.(*worker).probeBooted(0xc4203b0b00, 0x989064, 0xa, 0x97c340, 0xc4204ed6e0)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/worker/worker.go:349 +0x91
git.curoverse.com/arvados.git/lib/dispatchcloud/worker.(*worker).probeAndUpdate(0xc4203b0b00)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/worker/worker.go:192 +0x1394
git.curoverse.com/arvados.git/lib/dispatchcloud/worker.(*worker).ProbeAndUpdate(0xc4203b0b00)
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/worker/worker.go:141 +0x57
created by git.curoverse.com/arvados.git/lib/dispatchcloud/worker.(*Pool).runProbes
	/GOPATH/src/git.curoverse.com/arvados.git/lib/dispatchcloud/worker/pool.go:636 +0x378
Evidently either IPConfigurations or PrivateIPAddress can be nil here:
func (ai *AzureInstance) Address() string {
	return *(*ai.nic.IPConfigurations)[0].PrivateIPAddress
}
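For illustration, a nil-safe variant of that accessor might look roughly like the sketch below. The types `ipConfiguration`, `networkInterface` and `instance` are simplified stand-ins invented for this example, not the real Azure SDK or Arvados types; only the field names `IPConfigurations` and `PrivateIPAddress` come from the snippet above. It assumes that returning an empty string is an acceptable signal for "no address yet".

```go
package main

import "fmt"

// Hypothetical stand-ins for the SDK types; only the fields used here.
type ipConfiguration struct {
	PrivateIPAddress *string
}

type networkInterface struct {
	IPConfigurations *[]ipConfiguration
}

type instance struct {
	nic *networkInterface
}

// Address returns the first private IP address, or "" if any link in
// the pointer chain is nil or the configuration list is empty.
func (in *instance) Address() string {
	if in == nil || in.nic == nil || in.nic.IPConfigurations == nil {
		return ""
	}
	cfgs := *in.nic.IPConfigurations
	if len(cfgs) == 0 || cfgs[0].PrivateIPAddress == nil {
		return ""
	}
	return *cfgs[0].PrivateIPAddress
}

func main() {
	addr := "10.0.0.5"
	ready := &instance{nic: &networkInterface{
		IPConfigurations: &[]ipConfiguration{{PrivateIPAddress: &addr}},
	}}
	pending := &instance{nic: &networkInterface{}} // no IP assigned yet
	fmt.Printf("%q %q\n", ready.Address(), pending.Address())
}
```

Checking the slice length as well as the pointers also covers the case where the configuration list exists but is empty, which would otherwise panic with an index-out-of-range error instead of a nil dereference.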
Updated by Tom Morris almost 6 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 1.0
Updated by Tom Clegg almost 6 years ago
- Related to Idea #14807: [arvados-dispatch-cloud] Features/fixes needed before first production deploy added
Updated by Tom Morris almost 6 years ago
- Target version changed from Arvados Future Sprints to 2019-03-13 Sprint
Updated by Peter Amstutz almost 6 years ago
14844-cdc-azure-fixes @ 8c4fb97b1d34b5f8fc50d239698a08c35a63dac3
- If PrivateIPAddress somehow isn't defined, return empty string (don't panic)
- If VM create fails, attempt to immediately clean up the VHD and NIC corresponding to that VM (if that doesn't work, the regular cleanup processes should still get around to it).
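The cleanup-on-failure behavior described in the second item could be sketched as follows. This is not the code from the branch: the `cloudAPI` interface, its method names, the `-nic` and `-os.vhd` naming scheme, and the `fakeAPI` type are all hypothetical, chosen only to show the control flow (best-effort deletes, original error returned to the caller).

```go
package main

import (
	"errors"
	"fmt"
)

// cloudAPI is a hypothetical abstraction over the provider calls.
type cloudAPI interface {
	CreateNIC(name string) error
	CreateVM(name string) error
	DeleteNIC(name string) error
	DeleteBlob(name string) error
}

// createInstance creates a NIC and then a VM. If VM creation fails it
// makes one best-effort attempt to delete the NIC and the VHD blob, and
// returns the original creation error. Cleanup errors are ignored here
// on the assumption that a periodic cleanup pass catches leftovers.
func createInstance(api cloudAPI, name string) error {
	nicName := name + "-nic"     // hypothetical naming scheme
	blobName := name + "-os.vhd" // hypothetical naming scheme
	if err := api.CreateNIC(nicName); err != nil {
		return err
	}
	if err := api.CreateVM(name); err != nil {
		_ = api.DeleteNIC(nicName)
		_ = api.DeleteBlob(blobName)
		return err
	}
	return nil
}

// fakeAPI simulates a provider whose VM creation can be made to fail.
type fakeAPI struct {
	failVM  bool
	deleted []string
}

func (f *fakeAPI) CreateNIC(name string) error { return nil }

func (f *fakeAPI) CreateVM(name string) error {
	if f.failVM {
		return errors.New("quota exceeded")
	}
	return nil
}

func (f *fakeAPI) DeleteNIC(name string) error {
	f.deleted = append(f.deleted, name)
	return nil
}

func (f *fakeAPI) DeleteBlob(name string) error {
	f.deleted = append(f.deleted, name)
	return nil
}

func main() {
	api := &fakeAPI{failVM: true}
	err := createInstance(api, "vm1")
	fmt.Println(err, api.deleted)
}
```

Returning the creation error unchanged keeps the dispatcher's retry logic intact, while the immediate deletes stop unused NICs and blobs from accumulating across retries.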
Updated by Peter Amstutz almost 6 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|a310d114bdc06b20cd007e6aff14b409e1c11e32.