Bug #4121
[Crunch] cancelled job did not get cancelled at the slurm level
Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: 1.0
Updated by Ward Vandewege over 8 years ago
Job 9tee4-8i9sb-z5mxjnqgda5di0z was cancelled, but slurm never got the message:
squeue_long
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
87 compute 9tee4-8i9sb-z5mxjnqgda5di0z crunch R 16:22:31 1 compute1
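To find leftover allocations like the one above, a listing in this format can be matched against the UUID of a cancelled job. The following is a minimal sketch, not part of crunch-dispatch: the helper name `find_slurm_job_ids` is hypothetical, and it assumes whitespace-separated columns with the job name in the third column and untruncated, as in the listing above.

```python
def find_slurm_job_ids(squeue_output, job_name):
    """Return the slurm job IDs whose NAME column equals job_name.

    Hypothetical helper: assumes whitespace-separated columns in the
    order JOBID PARTITION NAME ..., as in the listing above.
    """
    ids = []
    for line in squeue_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows too short to hold a name.
        if len(fields) < 3 or fields[0] == "JOBID":
            continue
        if fields[2] == job_name:
            ids.append(fields[0])
    return ids


sample = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
87 compute 9tee4-8i9sb-z5mxjnqgda5di0z crunch R 16:22:31 1 compute1
"""

# Job UUID taken from this report; prints ['87']
print(find_slurm_job_ids(sample, "9tee4-8i9sb-z5mxjnqgda5di0z"))
```

Each returned ID could then be passed to `scancel` by hand to release the allocation. Whether crunch-dispatch should do this itself is a separate question that this sketch does not address.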
Crunch-dispatch logs don't say a lot:
@4000000054331716018b706c.s:2014-10-06_21:58:52.06739 git --git-dir=/var/lib/arvados/internal.git tag 9tee4-8i9sb-z5mxjnqgda5di0z 3985ead6428cf6d847e107a4f449609a47b1f25b
@4000000054331716018b706c.s:2014-10-06_21:58:52.14675 dispatch: sudo -E -u crunch PATH=/var/www/9tee4.arvadosapi.com/releases/20141006150429/vendor/bundle/ruby/2.1.0/bin:/usr/local/rvm/gems/ruby-2.1.2/bin:/usr/local/rvm/gems/ruby-2.1.2@global/bin:/usr/local/rvm/rubies/ruby-2.1.2/bin:/usr/local/rvm/bin:/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin:/usr/local/arvados/src/services/crunch PERLLIB=/usr/local/arvados/src/sdk/perl/lib PYTHONPATH= RUBYLIB=/usr/local/rvm/gems/ruby-2.1.2@global/gems/bundler-1.6.2/lib GEM_PATH= salloc --chdir=/ --immediate --exclusive --no-kill --job-name=9tee4-8i9sb-z5mxjnqgda5di0z --nodelist=compute1 /usr/local/arvados/src/services/crunch/crunch-job --job-api-token 1rt2034l6xkz1k9mrb5kt13hjdyehklif12f5um8mr5x9e6oyc --job 9tee4-8i9sb-z5mxjnqgda5di0z --git-dir /var/lib/arvados/internal.git
@4000000054331716018b706c.s:2014-10-06_21:58:52.22979 dispatch: job 9tee4-8i9sb-z5mxjnqgda5di0z
@4000000054331716018b706c.s:2014-10-06_21:58:52.44741 dispatch: update compute1 state to {:state=>"alloc", :job=>"9tee4-8i9sb-z5mxjnqgda5di0z"}
@4000000054331716018b706c.s:2014-10-06_21:58:52.52334 9tee4-8i9sb-z5mxjnqgda5di0z ! salloc: Granted job allocation 87
@4000000054331716018b706c.s:2014-10-06_21:58:53.00822 9tee4-8i9sb-z5mxjnqgda5di0z 23395 check slurm allocation
@4000000054331716018b706c.s:2014-10-06_21:58:53.00838 9tee4-8i9sb-z5mxjnqgda5di0z 23395 node compute1 - 20 slots
@4000000054331716018b706c.s:2014-10-06_21:58:53.24730 9tee4-8i9sb-z5mxjnqgda5di0z 23395 start
@4000000054331716018b706c.s:2014-10-06_21:58:53.48397 9tee4-8i9sb-z5mxjnqgda5di0z 23395 Install revision 3985ead6428cf6d847e107a4f449609a47b1f25b
@4000000054331716018b706c.s:2014-10-06_21:58:53.48406 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/0.12178.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.48419 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/11.20175.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.48425 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/12.20190.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.72786 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/14.20214.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.75874 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/18.20264.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.98341 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/7.20114.keep not found in /etc/mtab
@4000000054331716018b706c.s:2014-10-06_21:58:53.98349 9tee4-8i9sb-z5mxjnqgda5di0z ! /bin/fusermount: entry for /tmp/crunch-job/work/8.20128.keep not found in /etc/mtab
Updated by Ward Vandewege over 8 years ago
- Target version changed from Bug Triage to Arvados Future Sprints
Updated by Tom Morris over 6 years ago
- Target version deleted (Arvados Future Sprints)