Bug #6156

Updated by Tom Clegg almost 9 years ago

h2. Goals 

Make the process documented in our "compute node install guide":http://doc.arvados.org/install/install-compute-node.html work out of the box. Right now it doesn't, because when the node first pings the API server, Node#ping assigns a slot_number to the node and then overwrites the hostname based on that slot_number.

 We can't just have the sysadmin assign a slot_number in the install process, because it needs to be unique and it's hard to figure out a general way to accommodate that constraint. 

h2. Implementation

* Introduce an API server configuration variable, @assign_node_hostname@, holding a format string (similar to the dns configs) used to generate a compute node hostname from a slot_number. It can also be set to false if the API server is not expected to generate hostnames.
* Update Node.hostname_for_slot to return a result based on the configured format string, or nil if the setting is false or unset.
* In Node#ping, assign a hostname if hostname is nil and @assign_node_hostname@ is set. Move this code out of the @if self.slot_number.nil?@ block and place it below that block, so that slot_number and hostname are each set independently when missing from the existing record.
* Only run the Node DNS checking code (below hostname_for_slot) when @assign_node_hostname@ is set.
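
The steps above could look roughly like the following. This is only a sketch under stated assumptions: @NodeSketch@ is a hypothetical plain-Ruby stand-in for the real Node model, the class-level @assign_node_hostname@ accessor stands in for the API server configuration lookup, and the @next_free_slot@ argument replaces the real slot allocation logic. The DNS checking code is left out.

```ruby
# Hypothetical stand-in for the API server's Node model. Only the names
# hostname_for_slot, ping, slot_number, hostname and assign_node_hostname
# come from the ticket; everything else is an assumption for illustration.
class NodeSketch
  attr_accessor :slot_number, :hostname

  class << self
    # Stand-in for the assign_node_hostname setting: a format string,
    # or false/nil when the API server should not generate hostnames.
    attr_accessor :assign_node_hostname
  end

  # Return a hostname built from the configured format string,
  # or nil when the setting is false or unset.
  def self.hostname_for_slot(slot_number)
    template = assign_node_hostname
    return nil unless template
    format(template, { slot_number: slot_number })
  end

  # next_free_slot is a hypothetical argument; the real #ping finds a
  # unique slot_number itself.
  def ping(next_free_slot)
    # Assign a slot only if the record does not have one yet.
    self.slot_number = next_free_slot if slot_number.nil?

    # Independently, assign a hostname only if the record has none
    # and hostname generation is enabled.
    if hostname.nil? && self.class.assign_node_hostname
      self.hostname = self.class.hostname_for_slot(slot_number)
    end
    self
  end
end
```

With this shape, a node whose hostname was already assigned by some other mechanism keeps it after the first ping, while still receiving a slot_number.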

In application.default.yml:
 * <pre> 
   # Hostname to assign to a compute node when it sends a "ping" and the 
   # hostname in its Node record is nil. 
   # During bootstrapping, the "ping" script is expected to notice the 
   # hostname given in the ping response, and update its unix hostname 
   # accordingly. 
   # If false, leave the hostname alone (this is appropriate if your compute 
   # nodes' hostnames are already assigned by some other mechanism). 
   # 
   # One way or another, the hostnames of your node records should agree 
   # with your DNS records and your /etc/slurm-llnl/slurm.conf files. 
   # 
   # Example for compute0000, compute0001, ....: "compute%<slot_number>04d" 
   # (See http://ruby-doc.org/core-2.2.2/Kernel.html#method-i-format for more.) 
   assign_node_hostname: "compute%{slot_number}"
 </pre> 
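
For reference, here is a small illustration of how Ruby's @format@ expands named references with a hash argument. The slot value 1 is an arbitrary example, not something taken from a real cluster.

```ruby
# %<name>04d formats the value as an integer zero-padded to four digits;
# %{name} substitutes the value without any extra formatting.
padded = format("compute%<slot_number>04d", { slot_number: 1 })
plain  = format("compute%{slot_number}", { slot_number: 1 })

puts padded  # compute0001
puts plain   # compute1
```

The padded form keeps hostnames the same width and sorting lexically, which may make it easier to keep DNS records and slurm.conf entries aligned.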
