Idea #7454
closed
[NodeManager] Use CustomData to provision compute nodes on Azure instead of CustomScriptExtension
Added by Peter Amstutz about 9 years ago.
Updated almost 9 years ago.
Description
Currently we use CustomScriptForLinux to provision compute nodes. However, for some reason (we have not been able to get to the root cause) it fails to run the script reliably. To get around this we are using a cron job hack which re-runs CustomScriptForLinux until it succeeds. It would be better to solve the problem a different way.
It turns out there is a simpler way to put a small file onto a newly provisioned node than what we have been trying to do with CustomScriptForLinux. Somehow I overlooked this feature before, or I would have implemented it this way originally.
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-how-to-inject-custom-data/
1) Add custom data support to libcloud (done)
2) Update node manager to put the ping URL command into custom data
3) in /etc/waagent.conf:
Provisioning.Enabled=y
Provisioning.DecodeCustomData=y
Provisioning.ExecuteCustomData=y
4) Build a new image.
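A minimal sketch of how the waagent.conf settings from step 3 could be checked before building the image. The function names and the REQUIRED table are hypothetical helpers for illustration, not part of node manager or the Azure agent; the sketch only assumes the KEY=VALUE layout with '#' comments shown above.

```python
# Hypothetical helper: verify that a waagent.conf text enables the
# provisioning options that custom data execution depends on.
REQUIRED = {
    "Provisioning.Enabled": "y",
    "Provisioning.DecodeCustomData": "y",
    "Provisioning.ExecuteCustomData": "y",
}


def parse_waagent_conf(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(text, required=REQUIRED):
    """Return the sorted names of required settings that are absent or wrong."""
    settings = parse_waagent_conf(text)
    return sorted(k for k, v in required.items() if settings.get(k) != v)
```

Running missing_settings() on the contents of /etc/waagent.conf in the image build would flag a forgotten option before any node is booted from the image.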
- Tracker changed from Bug to Idea
- Description updated (diff)
- Description updated (diff)
- Target version set to Arvados Future Sprints
I've tested a new compute image with these specs on c97qk, and this change works fine. I've run a diagnostics job successfully on c97qk with the new image.
- Status changed from New to In Progress
- Target version changed from Arvados Future Sprints to 2015-12-16 sprint
- Assigned To set to Peter Amstutz
The docstring for BaseComputeNodeDriver.arvados_create_kwargs should be updated to document the new size argument.
Nit: Perhaps it would be a little easier to read if arvados_create_kwargs() used the same argument order as create_node(), instead of swapping them here?
    def create_node(self, size, arvados_node):
        ...
        kwargs.update(self.arvados_create_kwargs(arvados_node, size))
Was this dropped deliberately?
Side note / existing smell: I wish we were using real shellescape instead of "knowing" there won't be any backslashes, quotation marks, dollar signs, etc. in instance_id, instance_type, or arv-ping-url... but at least we're doing a little better now without the bash -c '...' layer.
Tom Clegg wrote:
The docstring for BaseComputeNodeDriver.arvados_create_kwargs should be updated to document the new size argument.
Nit: Perhaps it would be a little easier to read if arvados_create_kwargs() used the same argument order as create_node(), instead of swapping them here?
Fixed.
Was this dropped deliberately?
Yes. Previously the bash command was constructed after the node was created, so we had access to the IP address. The new command is constructed before the node is created, so we don't know the address yet. Since discovering the IP address is trivial it's not something that needs to be recorded as metadata anyway.
Side note / existing smell: I wish we were using real shellescape instead of "knowing" there won't be any backslashes, quotation marks, dollar signs, etc. in instance_id, instance_type, or arv-ping-url... but at least we're doing a little better now without the bash -c '...' layer.
Fixed to use pipes.quote() (which is noted as deprecated in the Python documentation, but the recommended replacement requires either an additional package dependency or Python 3.3).
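For illustration, a rough sketch of building the custom data script with proper shell quoting. The function name build_custom_data, the data_dir default, and the file names are assumptions made for this example and are not taken from the changeset. The import falls back to pipes.quote so the same code runs on Python 2.7.

```python
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2.7


def build_custom_data(ping_url, instance_type, data_dir="/var/tmp/arv-node-data"):
    """Return a shell script that records the ping URL and instance type.

    Every interpolated value is passed through quote(), so characters such
    as spaces, '$', or quotation marks cannot change what the script does.
    The paths used here are hypothetical.
    """
    lines = [
        "#!/bin/sh",
        "mkdir -p {}".format(quote(data_dir)),
        "echo {} > {}".format(quote(ping_url), quote(data_dir + "/arv-ping-url")),
        "echo {} > {}".format(quote(instance_type), quote(data_dir + "/instance-type")),
    ]
    return "\n".join(lines) + "\n"
```

Because the script is assembled before the node exists, it can only use values known in advance (the ping URL and the requested size), which matches the reason given above for dropping the IP address.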
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:39ccab11524517c101fad39eab02603022f15a99.