Bug #9120

closed

[Node Manager] AttributeError: 'ComputeNodeDriver' object has no attribute 'ex_list_networks'

Added by Nico César over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -

Description

This happened in qr2hi (GCP) with version 0.1.20160421180254-1:

2016-04-29_21:05:08.35533 2016-04-29 21:05:08 root[2764] ERROR: Uncaught exception during setup
2016-04-29_21:05:08.35536 Traceback (most recent call last):
2016-04-29_21:05:08.35536   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/launcher.py", line 110, in main
2016-04-29_21:05:08.35537     server_calculator = build_server_calculator(config)
2016-04-29_21:05:08.35537   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/launcher.py", line 61, in build_server_calculator
2016-04-29_21:05:08.35537     cloud_size_list = config.node_sizes(config.new_cloud_client().list_sizes())
2016-04-29_21:05:08.35538   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/config.py", line 107, in new_cloud_client
2016-04-29_21:05:08.35538     self.get_section('Cloud Create'))
2016-04-29_21:05:08.35538   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/gce.py", line 36, in __init__
2016-04-29_21:05:08.35539     driver_class)
2016-04-29_21:05:08.35539   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/__init__.py", line 68, in __init__
2016-04-29_21:05:08.35539     new_pair = init_method(self.create_kwargs.pop(key))
2016-04-29_21:05:08.35539   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/gce.py", line 51, in _init_network
2016-04-29_21:05:08.35540     network_name, 'ex_list_networks', self._name_key)
2016-04-29_21:05:08.35540   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/__init__.py", line 113, in search_for
2016-04-29_21:05:08.35541     term, list_method, key, **kwargs)
2016-04-29_21:05:08.35541   File "/usr/local/lib/python2.7/dist-packages/arvnodeman/computenode/driver/__init__.py", line 95, in search_for_now
2016-04-29_21:05:08.35541     items = getattr(self, list_method)(**kwargs)
2016-04-29_21:05:08.35542 AttributeError: 'ComputeNodeDriver' object has no attribute 'ex_list_networks'
2016-04-29_21:05:08.36902 Stopping arvados-node-manager
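
The last frame of the traceback looks the list method up on the wrapper object itself (`getattr(self, list_method)`), and the wrapper has no `ex_list_networks`. The following is a minimal sketch of that failure mode and of one possible remedy, falling back to the wrapped cloud driver. Every name here (`FakeCloudDriver`, `WrapperDriver`, `real`, `search_broken`, `search_with_fallback`) is hypothetical and chosen for illustration; this is not the Node Manager code and not necessarily what the eventual fix does.

```python
class FakeCloudDriver:
    """Hypothetical stand-in for the underlying cloud driver."""

    def ex_list_networks(self):
        return [{"name": "default"}, {"name": "arvados"}]


class WrapperDriver:
    """Hypothetical wrapper that delegates cloud calls to self.real."""

    def __init__(self, real):
        self.real = real

    def _find(self, list_func, term, key):
        # Return the single item whose key matches the search term.
        matches = [item for item in list_func() if key(item) == term]
        if len(matches) != 1:
            raise ValueError("expected 1 match for %r, got %d" % (term, len(matches)))
        return matches[0]

    def search_broken(self, term, list_method, key):
        # Mirrors the failing lookup: only the wrapper itself is searched,
        # so driver-specific extension methods are never found.
        return self._find(getattr(self, list_method), term, key)

    def search_with_fallback(self, term, list_method, key):
        # Assumed remedy: try the wrapper first, then the wrapped driver.
        try:
            list_func = getattr(self, list_method)
        except AttributeError:
            list_func = getattr(self.real, list_method)
        return self._find(list_func, term, key)


def name_key(item):
    return item["name"]


driver = WrapperDriver(FakeCloudDriver())
try:
    driver.search_broken("arvados", "ex_list_networks", name_key)
except AttributeError as error:
    print("broken lookup:", error)
print("fallback lookup:", driver.search_with_fallback("arvados", "ex_list_networks", name_key))
```

With the fallback in place, the wrapper can still override a list method when it needs to, while extension methods that exist only on the wrapped driver remain reachable.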

Subtasks 1 (0 open, 1 closed)

Task #9124: Review 9120-node-manager-search-ex-methods-wip | Resolved | Peter Amstutz | 05/02/2016
Actions #1

Updated by Nico César over 8 years ago

  • Project changed from 40 to Arvados
Actions #3

Updated by Brett Smith over 8 years ago

  • Subject changed from AttributeError: 'ComputeNodeDriver' object has no attribute 'ex_list_networks' to [Node Manager] AttributeError: 'ComputeNodeDriver' object has no attribute 'ex_list_networks'
  • Status changed from New to In Progress
  • Assigned To set to Brett Smith
  • Target version set to 2016-05-11 sprint
Actions #4

Updated by Peter Amstutz over 8 years ago

LGTM

Actions #5

Updated by Brett Smith over 8 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:497fdb2505efa9a3231c39ec696da6b749d30af2.

Actions #6

Updated by Nico César over 8 years ago

Deployed in qr2hi. Works as expected: it doesn't blow up.

But it brought up 2 nodes when only 1 was needed: https://workbench.qr2hi.arvadosapi.com/pipeline_instances/qr2hi-d1hrv-78qs9xv7ycr2j6s

New bug?

Actions #7

Updated by Brett Smith over 8 years ago

Nico Cesar wrote:

But it brought up 2 nodes when only 1 was needed: https://workbench.qr2hi.arvadosapi.com/pipeline_instances/qr2hi-d1hrv-78qs9xv7ycr2j6s

New bug?

Node Manager can be in a state where it gets an updated job queue before it gets an updated node list. If the timing is just right, it can see that there's a new job in the queue but still see the node as busy with the previous job in the pipeline (which just finished). In that case, it will boot a new node, even though a complete snapshot of all the system states would show it's not necessary.

This has been true forever, so it's not a "new" bug, no.
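
The timing issue described above can be illustrated with a small sketch. It assumes a deliberately naive sizing rule (boot one node per queued job that has no idle node available); the function name `nodes_to_boot`, the node name `compute0`, and the state strings are all hypothetical and are not taken from Node Manager.

```python
def nodes_to_boot(queued_jobs, node_states):
    """Naive rule: boot one node for each queued job without an idle node."""
    idle_nodes = sum(1 for state in node_states.values() if state == "idle")
    return max(0, queued_jobs - idle_nodes)


# The job queue has already been refreshed: the next pipeline job is waiting.
queued_jobs = 1

# The node list has not been refreshed yet, so the node still looks busy
# with the job that just finished.
stale_node_list = {"compute0": "busy"}

# A consistent snapshot would show the same node as idle.
fresh_node_list = {"compute0": "idle"}

print(nodes_to_boot(queued_jobs, stale_node_list))  # 1: an extra node is booted
print(nodes_to_boot(queued_jobs, fresh_node_list))  # 0: the existing node suffices
```

The two calls differ only in how fresh the node list is, which is why the extra boot depends on timing rather than on the job itself.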

Actions #8

Updated by Nico César over 8 years ago

Brett Smith wrote:

Node Manager can be in a state where it gets an updated job queue before it gets an updated node list. If the timing is just right, it can see that there's a new job in the queue but still see the node as busy with the previous job in the pipeline (which just finished). In that case, it will boot a new node, even though a complete snapshot of all the system states would show it's not necessary.

This has been true forever, so it's not a "new" bug, no.

Mhh... it happened 2 out of 2 times with the new version. I will test a couple more times and reopen #9161 if this is the case.
