Idea #8543

closed

[NodeManager] Don't use Futures when not expecting a reply

Added by Peter Amstutz about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Node Manager
Target version:
Start date: 03/04/2016
Due date:
Story points: 1.0

Description

Quoting #8437

on_failure is only called when there is no future associated with the message. As it turns out, all calls that use ActorProxy have an associated Future object, and all messaging between actors in node manager uses ActorProxy. This means unhandled exceptions are stored in a Future object to be returned to the caller. However, if the caller never calls get() on the Future object (because it never stored it), this means the exception is silently ignored.
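The behaviour quoted above can be illustrated with a small standard-library sketch. This is not Pykka's implementation: ToyActor, its inbox, and the failures list are hypothetical names invented for the example, and concurrent.futures.Future stands in for the actor library's Future. It only shows the shape of the problem: an exception raised while handling an ask()-style message is parked in the Future, while the same exception on a tell()-style message reaches on_failure.

```python
import queue
import threading
from concurrent.futures import Future


class ToyActor:
    """Hypothetical stand-in for an actor; not the real library code."""

    def __init__(self):
        self.inbox = queue.Queue()
        self.failures = []  # exceptions that reached on_failure
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def on_receive(self, message):
        # Every message fails, to make the difference visible.
        raise ValueError("cannot handle %r" % (message,))

    def on_failure(self, exc):
        self.failures.append(exc)

    def _loop(self):
        while True:
            message, future = self.inbox.get()
            if message is None:  # sentinel used by stop()
                break
            try:
                result = self.on_receive(message)
                if future is not None:
                    future.set_result(result)
            except Exception as exc:
                if future is not None:
                    # Parked for the caller; lost if nobody reads the future.
                    future.set_exception(exc)
                else:
                    self.on_failure(exc)

    def ask(self, message):
        future = Future()
        self.inbox.put((message, future))
        return future

    def tell(self, message):
        self.inbox.put((message, None))

    def stop(self):
        self.inbox.put((None, None))
        self._thread.join()


def demo():
    actor = ToyActor()
    actor.ask("ping")   # future discarded: the ValueError is never seen
    actor.tell("ping")  # no future: on_failure is called
    actor.stop()
    return len(actor.failures)


print(demo())
```

Only the tell() failure is recorded. The ask() failure would surface only if the caller kept the returned future and called result() on it.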

These lingering Future objects may also be creating circular references that are causing the memory leak.

Wherever a message is sent between Actors and no response is required (for example, subscriptions), use tell() on the ActorRef object instead of ActorProxy (which uses ask()). It may be worth writing a helper similar to ActorProxy, such as TellActorProxy.
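One possible shape for such a helper is sketched below. It is an illustration only, not the code that was eventually merged: the message dictionary keys ("method", "args", "kwargs") and the RecordingRef test double are assumptions made for the example, and the only thing required of the wrapped object is a tell(message) method. A real helper would have to build whatever message format the actor library's own proxy uses.

```python
class TellCallable:
    """Callable that forwards a method call as a fire-and-forget message."""

    def __init__(self, ref, method_name):
        self._ref = ref
        self._method_name = method_name

    def __call__(self, *args, **kwargs):
        # Hypothetical message layout; no Future is created or returned.
        self._ref.tell({
            "method": self._method_name,
            "args": args,
            "kwargs": kwargs,
        })


class TellActorProxy:
    """Proxy whose method calls use tell() instead of ask()."""

    def __init__(self, ref):
        self._ref = ref

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. actor methods.
        if name.startswith("_"):
            raise AttributeError(name)
        return TellCallable(self._ref, name)


class RecordingRef:
    """Test double that records messages instead of delivering them."""

    def __init__(self):
        self.sent = []

    def tell(self, message):
        self.sent.append(message)


ref = RecordingRef()
proxy = TellActorProxy(ref)
proxy.subscribe("node-1", retries=2)  # returns None, nothing to forget
print(ref.sent)
```

Because each call returns None rather than a Future, there is nothing for the caller to drop, so a failure in the receiving actor cannot be hidden in an unread Future.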

Anticipated benefits:

  • Unhandled exceptions will be logged, and the on_failure method of the Actor will be called as intended
  • Possibly uncover previously unknown bugs (previously hidden due to exceptions being lost)
  • May mitigate longstanding memory leak bug by avoiding creating Future objects that are not needed (and which may be actively harmful)

Subtasks 4 (0 open, 4 closed)

Task #8604: Use TellActorProxy when future is not going to be used (Resolved, Peter Amstutz, 03/04/2016)
Task #8602: Review 8543-nodemanager-fewer-futures (Resolved, Peter Amstutz, 03/07/2016)
Task #8605: Fix tests (Resolved, Peter Amstutz, 03/04/2016)
Task #8601: Implement TellActorProxy (Resolved, Peter Amstutz, 03/04/2016)

Related issues

Related to Arvados - Bug #8541: [NodeManager] Use sys.exc_clear() to release exception tracebacks (Closed)
Related to Arvados - Bug #7026: [Node Manager] Mishandles stop signals (Closed, 08/19/2015)