Bug #8084
closed
[Node Manager] Doesn't convert units when determining if a job's scratch request is satisfiable
Added by Brett Smith over 8 years ago.
Updated over 5 years ago.
Description
In libcloud, the disk attribute of a NodeSize is a number of gigabytes. Because our own scratch attribute is based on this by default, we've decided that it's also a number of gigabytes. See the comments about scratch in, e.g., doc/ec2.example.cfg.
Crunch jobs can have a min_scratch_mb_per_node runtime constraint. Node Manager looks at this constraint to determine satisfiability, but it compares this raw number of MB with the NodeSize's number of GB of scratch without any unit conversion.
Convert units as appropriate when determining satisfiability.
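A minimal sketch of the kind of conversion this asks for, not the actual Node Manager code. The function names and the MB_PER_GB constant are hypothetical, and the factor of 1000 MB per GB is an assumption (the ticket does not say whether decimal or binary units are intended):

```python
# Hypothetical helper names; this is an illustration, not Node Manager's API.
# Assumption: 1 GB of scratch is treated as 1000 MB.
MB_PER_GB = 1000


def scratch_gb_to_mb(scratch_gb):
    """Convert a NodeSize scratch value (GB) to MB."""
    return scratch_gb * MB_PER_GB


def size_satisfies_scratch(size_scratch_gb, runtime_constraints):
    """Return True if a size's scratch (GB) covers the job's request (MB).

    A job without min_scratch_mb_per_node is treated as requesting 0 MB.
    """
    wanted_mb = int(runtime_constraints.get('min_scratch_mb_per_node', 0))
    return scratch_gb_to_mb(size_scratch_gb) >= wanted_mb


# A size with 50 GB of scratch can serve a 40000 MB request.
ok = size_satisfies_scratch(50, {'min_scratch_mb_per_node': 40000})
# Without conversion, 50 >= 40000 would wrongly report "unsatisfiable".
too_big = size_satisfies_scratch(50, {'min_scratch_mb_per_node': 60000})
```

Converting the size's value up to MB, rather than the job's request down to GB, avoids rounding a request such as 40500 MB.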
- Description updated (diff)
- Category set to Node Manager
- Story points set to 0.5
- Status changed from New to In Progress
- Assigned To set to Nico César
This is a temporary patch. I unblocked the work with the following change:
$ git diff 713a40d..75cf3cc
diff --git a/hieradata/manage.qr1hi.arvadosapi.com/common.yaml b/hieradata/manage.qr1hi.arvadosapi.com/common.yaml
index 0acb186..a9e2c0a 100644
--- a/hieradata/manage.qr1hi.arvadosapi.com/common.yaml
+++ b/hieradata/manage.qr1hi.arvadosapi.com/common.yaml
@@ -71,22 +71,26 @@ arvados-node-manager:
ping_host: "%{hiera('api_hostname')}"
Cloud List:
ex_resource_group: "%{hiera('uuid_cluster')}"
+ ## 2015-12-29 nico
+ ## I changed scratch space to be represented in MB to overcome
+ ## issue #8084, and unlock Bryan's work
+ ## once this is done should be back to MB
Size Standard_D1_v2:
cores: 1
price: 0.074
- scratch: 50
+ scratch: 50000
Size Standard_D2_v2:
cores: 2
price: 0.149
- scratch: 100
+ scratch: 100000
Size Standard_D3_v2:
cores: 4
price: 0.297
- scratch: 200
+ scratch: 200000
Size Standard_D4_v2:
cores: 8
price: 0.595
- scratch: 400
+ scratch: 400000
# You can define any number of Size sections to list Azure sizes you're
# willing to use. The Node Manager should boot the cheapest size(s) that
# can run jobs in the queue (N.B.: defining more than one size has not been
This created the machines with:
/dev/mapper/tmp 200G 9.5G 191G 5% /tmp
none 200G 9.5G 191G 5% /tmp/docker/aufs/mnt/1491488de6638fe009eea00949de2412dce4cb5b21b9ebcb6ca9c5581a7728b7
and the disk is attached correctly:
[ 9.248445] sd 3:0:1:0: Attached scsi generic sg2 type 0
[ 9.265847] sd 3:0:1:0: [sdb] 419430400 512-byte logical blocks: (214 GB/200 GiB)
[ 9.286436] sd 3:0:1:0: [sdb] 4096-byte physical blocks
[ 9.325022] sd 3:0:1:0: [sdb] Write Protect is off
[ 9.340681] sd 3:0:1:0: [sdb] Mode Sense: 0f 00 10 00
[ 9.341887] sd 3:0:1:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 9.382054] sdb: sdb1
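As a sanity check on the kernel log above, the reported block count can be converted to bytes and compared with the sizes the kernel prints. This is plain arithmetic on the numbers in the log, with illustrative variable names:

```python
# Values taken from the sdb line in the kernel log.
logical_blocks = 419430400
block_size_bytes = 512

total_bytes = logical_blocks * block_size_bytes

# Binary units (GiB) versus decimal units (GB).
size_gib = total_bytes / 2**30
size_gb = total_bytes / 10**9

print(total_bytes, size_gib, size_gb)
```

The result is exactly 200 GiB (about 214.7 GB in decimal units), which matches the 200G that df reports for /tmp on a Standard_D3_v2 node.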
- Target version set to Arvados Future Sprints
- Status changed from In Progress to Resolved
- Target version deleted (Arvados Future Sprints)