Bug #9018
Closed: [Node manager] exception handler should not kill parent process
Story points: -
Description
A race condition in test_fatal_error (tests.test_failure.ActorUnhandledExceptionTest) causes os.killpg() to be called after it has been unstubbed. This kills the test suite and run-tests.sh.
There are two problems here:
- The test should not have a race condition.
- The exception handler should only kill node manager itself, not other processes.
Proposed fix for overkill
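Use os._exit() or os.kill(os.getpid(), 9) instead of os.killpg(). Note that a pid of 0 in os.kill() addresses the caller's entire process group, so os.kill(0, 9) would not narrow the kill.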
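A minimal sketch of the os._exit() option, assuming a POSIX-like environment. The name fatal_error_handler and the exit code 3 are hypothetical stand-ins, not node manager's actual code. The handler runs in a child interpreter that shares the parent's process group, and the parent (standing in for the test suite or run-tests.sh) keeps running afterwards.

```python
import subprocess
import sys
import textwrap

# Hypothetical stand-in for the node manager's fatal-error handler.
CHILD_SOURCE = textwrap.dedent("""
    import os

    def fatal_error_handler(exit_code=1):
        # os._exit() ends only the calling process, immediately,
        # without signalling anything else in the process group.
        os._exit(exit_code)

    try:
        raise RuntimeError("simulated unhandled actor exception")
    except RuntimeError:
        fatal_error_handler(exit_code=3)
""")


def run_child():
    # Run the handler in a separate interpreter so that this script plays
    # the role of the parent process.
    return subprocess.run([sys.executable, "-c", CHILD_SOURCE]).returncode


if __name__ == "__main__":
    code = run_child()
    print("child exited with", code, "and the parent is still running")
```

Because os._exit() skips atexit handlers and buffered-output flushing, it suits a last-resort handler better than a normal shutdown path.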
Proposed fix for test race
TBD?
Updated by Tom Clegg almost 9 years ago
- Description updated (diff)
- Category set to Node Manager
Updated by Brett Smith almost 9 years ago
- Target version set to Arvados Future Sprints
Updated by Peter Amstutz almost 9 years ago
- Target version changed from Arvados Future Sprints to 2016-05-25 sprint
Updated by Peter Amstutz almost 9 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:aea5300167770beb3cca6ad90e5ebb04da961416.
Updated by Tom Clegg almost 9 years ago
The test race might still exist. However, it hasn't been seen recently, so maybe some other changes have fixed it by accident.
(11:07:12) tetron_: I haven't seen the race condition happen
(11:07:59) tetron_: and I haven't been able to work out a sequence that would cause it to happen
(11:10:51) tetron_: I believe the race only happens if the test also fails for some other reason and it's unable to wait for the actor to stop
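The condition described in that last message could be guarded against roughly as follows. This is only a sketch under assumptions, not the project's test code: fatal_error_handler and run_test are hypothetical names, a plain thread stands in for the actor, and the handler sends signal 0 (an existence check) so that even an unstubbed call would be harmless. The point is that the wait for the handler sits in a finally clause inside the stub's scope, so os.killpg() stays stubbed until the handler has finished, even when the test fails for an unrelated reason.

```python
import os
import threading
import time
from unittest import mock


def fatal_error_handler():
    # Hypothetical stand-in for the handler under test. It runs slightly
    # late, as an actor thread might, and uses signal 0 so that a real
    # call would only check that the process group exists.
    time.sleep(0.05)
    os.killpg(os.getpgid(0), 0)


def run_test(calls, fail_early=False):
    # Record stubbed calls in the caller's list so they can be inspected
    # even if the test body raises.
    record = lambda pgid, sig: calls.append(sig)
    with mock.patch.object(os, "killpg", side_effect=record):
        worker = threading.Thread(target=fatal_error_handler)
        worker.start()
        try:
            if fail_early:
                raise AssertionError("simulated unrelated test failure")
        finally:
            # Join before leaving the with block, i.e. before os.killpg
            # is unstubbed, whether or not the test body failed.
            worker.join(timeout=5)


if __name__ == "__main__":
    calls = []
    run_test(calls)
    print("stubbed os.killpg calls:", calls)
```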