Bug #14026
Error response from daemon: Range of CPUs is from 0.01 to 1.00, as there are only 1 CPUs available
Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -
Description
https://workbench.e51c5.arvadosapi.com/container_requests/e51c5-xvhdp-ugp7a5guqc4brhh#Log
2018-08-14T10:11:28.088035478Z crunch-run 1.1.4 started
2018-08-14T10:11:28.088079978Z Executing container 'e51c5-dz642-uc8ccn1nzs9q9u0'
2018-08-14T10:11:28.088104778Z Executing on host 'compute62.e51c5.arvadosapi.com'
2018-08-14T10:11:28.176213953Z Fetching Docker image from collection 'be17bc91682c86583461bf461858492b+426'
2018-08-14T10:11:28.188391149Z Using Docker image id 'sha256:d849cf08d27d02f19c5ae1ea0a5d49dc4b000e495a727762983342f7168a4199'
2018-08-14T10:11:28.190780848Z Docker image is available
2018-08-14T10:11:28.190948348Z Running [arv-mount --foreground --allow-other --read-write --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id /tmp/crunch-run.e51c5-dz642-uc8ccn1nzs9q9u0.460952586/keep648490465]
2018-08-14T10:11:28.176360052Z notice: reading stats from /sys/fs/cgroup/cpuacct/cgroup.procs
2018-08-14T10:11:28.176402052Z notice: reading stats from /sys/fs/cgroup/memory/memory.stat
2018-08-14T10:11:28.176727852Z mem 20295680 cache 88 pgmajfault 536576 rss
2018-08-14T10:11:28.176742552Z notice: reading stats from /sys/fs/cgroup/cpuacct/cpuacct.stat
2018-08-14T10:11:28.176781152Z notice: reading stats from /sys/fs/cgroup/cpuset/cpuset.cpus
2018-08-14T10:11:28.176800152Z cpu 25.6000 user 22.7300 sys 1 cpus
2018-08-14T10:11:28.176925952Z net:docker0 0 tx 0 rx
2018-08-14T10:11:28.176936152Z net:eth0 1141280 tx 565449770 rx
2018-08-14T10:11:29.097092488Z Creating Docker container
2018-08-14T10:11:29.100320488Z While creating container: Error response from daemon: Range of CPUs is from 0.01 to 1.00, as there are only 1 CPUs available
2018-08-14T10:11:29.145217775Z Running [arv-mount --unmount-timeout=8 --unmount /tmp/crunch-run.e51c5-dz642-uc8ccn1nzs9q9u0.460952586/keep648490465]
2018-08-14T10:11:29.432344492Z fusermount: failed to unmount /tmp/crunch-run.e51c5-dz642-uc8ccn1nzs9q9u0.460952586/keep648490465: Invalid argument
2018-08-14T10:11:29.482877378Z crunch-run finished
2018-08-14T10:15:40.149987687Z crunch-run 1.1.4 started
2018-08-14T10:15:40.150028988Z Executing container 'e51c5-dz642-uc8ccn1nzs9q9u0'
2018-08-14T10:15:40.150051888Z Executing on host 'compute29.e51c5.arvadosapi.com'
2018-08-14T10:15:40.217954197Z Fetching Docker image from collection 'be17bc91682c86583461bf461858492b+426'
2018-08-14T10:15:40.231564039Z Using Docker image id 'sha256:d849cf08d27d02f19c5ae1ea0a5d49dc4b000e495a727762983342f7168a4199'
2018-08-14T10:15:40.233583245Z Docker image is available
2018-08-14T10:15:40.233748646Z Running [arv-mount --foreground --allow-other --read-write --crunchstat-interval=10 --file-cache 268435456 --mount-by-pdh by_id /tmp/crunch-run.e51c5-dz642-uc8ccn1nzs9q9u0.545151285/keep410107664]
2018-08-14T10:15:41.002139315Z Creating Docker container
2018-08-14T10:15:41.005905626Z While creating container: Error response from daemon: Range of CPUs is from 0.01 to 1.00, as there are only 1 CPUs available
Related issues
Updated by Peter Amstutz over 5 years ago
It appears a scheduling error put a job requesting 2 cores on a 1-core machine, which probably means slurm was acting on bad information when deciding how to schedule the job.
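For illustration only, the mismatch described here (2 cores requested, 1 available) could be caught before container creation with a check along these lines. This is a rough sketch, not code from crunch-run or Docker; the function name `fits_host` and its parameters are hypothetical, and `os.cpu_count()` is assumed to be a fair stand-in for the CPU count Docker reports.

```python
import os


def fits_host(requested_vcpus, host_cpus=None):
    """Return True if the host has at least as many CPUs as requested.

    Hypothetical helper: host_cpus defaults to what Python sees on the
    current machine, which is an assumption about how the host would be
    inspected.
    """
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1
    return 0 < requested_vcpus <= host_cpus


# The situation in this ticket: a 2-core request landing on a 1-CPU node.
if not fits_host(2, host_cpus=1):
    print("request exceeds host CPUs; container creation would be rejected")
```

A check like this would only surface the problem earlier; it would not address why the job was scheduled onto an undersized node in the first place.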
Updated by Tom Clegg over 5 years ago
The log indicates we tried to run it on the wrong instance type 3 times before choosing one with 2 CPUs: compute62 and compute29 (x2) failed with that error, then compute68 succeeded. Unfortunately the host info logs aren't saved for the earlier attempts, so I'm just assuming docker is counting the host's CPUs correctly.
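As a partial cross-check, the crunchstat output in the description for the compute62 attempt includes the line `cpu 25.6000 user 22.7300 sys 1 cpus`. A small sketch for pulling that count out of a log line follows. The regular expression is based only on that one sample line, so the format is an assumption, and the helper name `reported_cpus` is hypothetical. The number reflects what crunchstat read from `cpuset.cpus`, which may not be exactly what Docker counts.

```python
import re

# Pattern inferred from the single sample line in this ticket (an assumption).
CPUS_RE = re.compile(r"\bcpu \S+ user \S+ sys (\d+) cpus\b")


def reported_cpus(log_line):
    """Return the CPU count from a crunchstat 'cpu' line, or None if absent."""
    match = CPUS_RE.search(log_line)
    return int(match.group(1)) if match else None


line = "2018-08-14T10:11:28.176800152Z cpu 25.6000 user 22.7300 sys 1 cpus"
print(reported_cpus(line))
```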
Updated by Tom Clegg over 5 years ago
- Is duplicate of Bug #14036: a-n-m spawned a container on the wrong sized node added