Idea #7711
closed
[Node Manager] Record a node's cloud cost information in the Arvados node record's properties
Added by Brett Smith about 9 years ago.
Updated about 9 years ago.
Description
When Node Manager creates/updates an Arvados node record for a booting cloud node, save information about the node size's cloud cost in the node's properties field. Other Crunch components can use this to make cost-efficient scheduling decisions.
This is done in the corresponding methods of arvnodeman.computenode.dispatch.ComputeNodeSetupActor.
- Description updated (diff)
- Subject changed from [Node Manager] Record a node's cloud cost information in the Arvados node record's properties to [Node Manager] Record a node's cloud cost information in the Arvados node record's info
- Target version changed from Arvados Future Sprints to 2015-12-02 sprint
- Assigned To set to Tom Clegg
- Description updated (diff)
- Subject changed from [Node Manager] Record a node's cloud cost information in the Arvados node record's info to [Node Manager] Record a node's cloud cost information in the Arvados node record's properties
- Status changed from New to In Progress
Reviewing d48dda4. This is good to merge. My thoughts here are more philosophical, but there's nothing to block a merge.
- What do you think about rolling this update into the Arvados Node API call we already do just before we create the cloud node, in create_arvados_node/prepare_arvados_node? My feeling on this is that remote API calls are the main opportunities for things to go wrong, and so the fewer of them we do, the better. And it seems like we have all the information we need at the time. But I realize there's a certain information purity in waiting until the cloud node is created.
- I see the value in recording the size ID as well, for easier debugging and the use of other components. I worry about this a little bit, because I feel like Node Manager itself should get this information from libcloud whenever possible, and I'm worried a future hacker is going to see this code and use the information in the Arvados record because it's more discoverable. Maybe a comment to this effect in the update code would be nice?
- In the places where you added
self.make_mocks()
in the tests, it looks to me like make_actor()
is just buggy. When mocks aren't already made and it gets no arguments, it ends up calling self.make_mocks([None])
, which I don't think ever makes sense. If you're feeling up for fixing that (changing the argument from [None]
to just None
), I think it would be a better long-term change, but no big deal if not.
Thanks.
Brett Smith wrote:
Reviewing d48dda4. This is good to merge. My thoughts here are more philosophical, but there's nothing to block a merge.
- What do you think about rolling this update into the Arvados Node API call we already do just before we create the cloud node, in create_arvados_node/prepare_arvados_node? My feeling on this is that remote API calls are the main opportunities for things to go wrong, and so the fewer of them we do, the better. And it seems like we have all the information we need at the time. But I realize there's a certain information purity in waiting until the cloud node is created.
I started out with that approach, but it was a bit verbose, and I figured we'll soon enough want to add some more information that we don't have until after create_cloud_node() so this seemed better structurally. I've added a comment to that effect, so that opportunity doesn't get optimized away unnoticed.
- I see the value in recording the size ID as well, for easier debugging and the use of other components. I worry about this a little bit, because I feel like Node Manager itself should get this information from libcloud whenever possible, and I'm worried a future hacker is going to see this code and use the information in the Arvados record because it's more discoverable. Maybe a comment to this effect in the update code would be nice?
Yes. Added comment.
- In the places where you added
self.make_mocks()
in the tests, it looks to me like make_actor()
is just buggy. When mocks aren't already made and it gets no arguments, it ends up calling self.make_mocks([None])
, which I don't think ever makes sense. If you're feeling up for fixing that (changing the argument from [None]
to just None
), I think it would be a better long-term change, but no big deal if not.
Yes, you're right. Fixed.
Now at 743341f
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:055ac1cc91121d3f27405d4f45455a8de5a0e57e.
Also available in: Atom
PDF