Story #7711

[Node Manager] Record a node's cloud cost information in the Arvados node record's properties

Added by Brett Smith almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Node Manager
Target version:
Start date:
11/03/2015
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

When Node Manager creates/updates an Arvados node record for a booting cloud node, save information about the node size's cloud cost in the node's properties field. Other Crunch components can use this to make cost-efficient scheduling decisions.

This is done in the corresponding methods of arvnodeman.computenode.dispatch.ComputeNodeSetupActor.


Subtasks

Task #7812: Record node priceResolvedTom Clegg

Task #7781: Review 7711-record-node-priceResolvedTom Clegg


Related issues

Related to Arvados - Story #7676: [Crunch] crunch-dispatch reserves the cheapest possible nodes for a jobResolved11/23/2015

Related to Arvados - Feature #10217: [Crunch1] Log the node properties at the start of a jobNew

Related to Arvados - Feature #10218: [Crunch2] Gather and record cloud/physical node information for each containerResolved03/20/2017

Associated revisions

Revision 055ac1cc
Added by Tom Clegg almost 4 years ago

Merge branch '7711-record-node-price' closes #7711

History

#1 Updated by Brett Smith almost 4 years ago

  • Description updated (diff)

#2 Updated by Brett Smith almost 4 years ago

  • Subject changed from [Node Manager] Record a node's cloud cost information in the Arvados node record's properties to [Node Manager] Record a node's cloud cost information in the Arvados node record's info

#3 Updated by Brett Smith almost 4 years ago

  • Target version changed from Arvados Future Sprints to 2015-12-02 sprint

#4 Updated by Brett Smith almost 4 years ago

  • Assigned To set to Tom Clegg

#5 Updated by Tom Clegg almost 4 years ago

  • Description updated (diff)
  • Subject changed from [Node Manager] Record a node's cloud cost information in the Arvados node record's info to [Node Manager] Record a node's cloud cost information in the Arvados node record's properties

#6 Updated by Tom Clegg almost 4 years ago

  • Status changed from New to In Progress

#7 Updated by Tom Clegg almost 4 years ago

7711-record-node-price @ d48dda4

#8 Updated by Brett Smith almost 4 years ago

Reviewing d48dda4. This is good to merge. My thoughts here are more philosophical, but there's nothing to block a merge.

  • What do you think about rolling this update into the Arvados Node API call we already do just before we create the cloud node, in create_arvados_node/prepare_arvados_node? My feeling on this is that remote API calls are the main opportunities for things to go wrong, and so the fewer of them we do, the better. And it seems like we have all the information we need at the time. But I realize there's a certain information purity in waiting until the cloud node is created.
  • I see the value in recording the size ID as well, for easier debugging and the use of other components. I worry about this a little bit, because I feel like Node Manager itself should get this information from libcloud whenever possible, and I'm worried a future hacker is going to see this code and use the information in the Arvados record because it's more discoverable. Maybe a comment to this effect in the update code would be nice?
  • In the places where you added self.make_mocks() in the tests, it looks to me like make_actor() is just buggy. When mocks aren't already made and it gets no arguments, it ends up calling self.make_mocks([None]), which I don't think ever makes sense. If you're feeling up for fixing that (changing the argument from [None] to just None), I think it would be a better long-term change, but no big deal if not.

Thanks.

#9 Updated by Tom Clegg almost 4 years ago

Brett Smith wrote:

Reviewing d48dda4. This is good to merge. My thoughts here are more philosophical, but there's nothing to block a merge.

  • What do you think about rolling this update into the Arvados Node API call we already do just before we create the cloud node, in create_arvados_node/prepare_arvados_node? My feeling on this is that remote API calls are the main opportunities for things to go wrong, and so the fewer of them we do, the better. And it seems like we have all the information we need at the time. But I realize there's a certain information purity in waiting until the cloud node is created.

I started out with that approach, but it was a bit verbose, and I figured we'll soon enough want to add some more information that we don't have until after create_cloud_node() so this seemed better structurally. I've added a comment to that effect, so that opportunity doesn't get optimized away unnoticed.

  • I see the value in recording the size ID as well, for easier debugging and the use of other components. I worry about this a little bit, because I feel like Node Manager itself should get this information from libcloud whenever possible, and I'm worried a future hacker is going to see this code and use the information in the Arvados record because it's more discoverable. Maybe a comment to this effect in the update code would be nice?

Yes. Added comment.

  • In the places where you added self.make_mocks() in the tests, it looks to me like make_actor() is just buggy. When mocks aren't already made and it gets no arguments, it ends up calling self.make_mocks([None]), which I don't think ever makes sense. If you're feeling up for fixing that (changing the argument from [None] to just None), I think it would be a better long-term change, but no big deal if not.

Yes, you're right. Fixed.

Now at 743341f

#10 Updated by Tom Clegg almost 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:055ac1cc91121d3f27405d4f45455a8de5a0e57e.

Also available in: Atom PDF