Project

General

Profile

Actions

Bug #10808

closed

[Crunch] stuck job and pipeline instances on c97qk

Added by Ward Vandewege about 7 years ago. Updated about 7 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Story points:
-

Description

Job c97qk-8i9sb-bj9c3ojdng85osz appears to be unkillable via the cancel button on workbench.

There are several pipeline instances waiting on it:

2017-01-04_18:27:11.93074 2017-01-04 18:27:11 +0000 -- pipeline_instance c97qk-d1hrv-n6pik83zizjk5hn
2017-01-04_18:27:11.93074 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0}
2017-01-04_18:27:13.03719 
2017-01-04_18:27:13.03721 2017-01-04 18:27:12 +0000 -- pipeline_instance c97qk-d1hrv-0thxn81rmpaedyo
2017-01-04_18:27:13.03721 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0}
2017-01-04_18:27:14.53057 
2017-01-04_18:27:14.53060 2017-01-04 18:27:14 +0000 -- pipeline_instance c97qk-d1hrv-frf2e4vls4gq22v
2017-01-04_18:27:14.53062 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0}
2017-01-04_18:27:15.74509 
2017-01-04_18:27:15.74511 2017-01-04 18:27:15 +0000 -- pipeline_instance c97qk-d1hrv-5dzt55sa9wlq495
2017-01-04_18:27:15.74512 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0}
2017-01-04_18:27:16.69833 
2017-01-04_18:27:16.69834 2017-01-04 18:27:16 +0000 -- pipeline_instance c97qk-d1hrv-1uwxdzktqgl8hr6
2017-01-04_18:27:16.69835 cwl-runner c97qk-8i9sb-bj9c3ojdng85osz {:running=>1, :done=>0, :failed=>0, :todo=>0}
2017-01-04_18:27:20.34010 

It is not actually running:

c97qk:/etc/service# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up   infinite      7 drain* compute[3-9]
compute*     up   infinite    249  down* compute[0-2,10-255]
crypto       up   infinite      7 drain* compute[3-9]
crypto       up   infinite    249  down* compute[0-2,10-255]
c97qk:/etc/service# squeue_long 
  JOBID PARTITION NAME     USER ST       TIME  NODES NODELIST(REASON)

I ran the stale jobs script which cleaned up two stale jobs but not c97qk-8i9sb-bj9c3ojdng85osz:

c97qk:/var/www/arvados-api/current/script# RAILS_ENV=production bundle exec ./fail-jobs.rb --before reboot
Called 'load' without the :safe option -- defaulting to safe mode.
You can avoid this warning in the future by setting the SafeYAML::OPTIONS[:default_mode] option (to :safe or :unsafe).
dispatch: c97qk-8i9sb-ip0ts72w6z9nty9: cleaned up stale job: started before 2016-11-18 02:31:23 +0000
dispatch: c97qk-8i9sb-7wzv76a2p7jbi9d: cleaned up stale job: started before 2016-11-18 02:31:23 +0000

The workbench logs show this when I press the cancel button:

#<ArvadosApiClient::AccessForbiddenException: #<ArvadosModel::PermissionDeniedError: ArvadosModel::PermissionDeniedError> [API: 403]>
/var/www/arvados-workbench/current/app/models/arvados_api_client.rb:171:in `api'
/var/www/arvados-workbench/current/app/models/job.rb:30:in `cancel'
/var/www/arvados-workbench/current/app/controllers/jobs_controller.rb:50:in `cancel'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/implicit_render.rb:4:in `send_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/abstract_controller/base.rb:189:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/rendering.rb:10:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/abstract_controller/callbacks.rb:20:in `block in process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:113:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:552:in `block (2 levels) in compile'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:502:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:495:in `block (2 levels) in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:286:in `block (2 levels) in halting_and_conditional'
/var/www/arvados-workbench/current/app/controllers/application_controller.rb:649:in `require_thread_api_token'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:429:in `block in make_lambda'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:285:in `block in halting_and_conditional'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:494:in `block in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:502:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:495:in `block (2 levels) in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:306:in `block (2 levels) in halting'
/var/www/arvados-workbench/current/app/controllers/application_controller.rb:631:in `set_thread_api_token'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:429:in `block in make_lambda'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:305:in `block in halting'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:494:in `block in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:502:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:495:in `block (2 levels) in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:306:in `block (2 levels) in halting'
/var/www/arvados-workbench/current/app/controllers/application_controller.rb:561:in `thread_clear'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:429:in `block in make_lambda'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:305:in `block in halting'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:494:in `block in around'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:502:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:86:in `run_callbacks'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/abstract_controller/callbacks.rb:19:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/rescue.rb:29:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/instrumentation.rb:32:in `block in process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/notifications.rb:159:in `block in instrument'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/notifications.rb:159:in `instrument'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/instrumentation.rb:30:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/params_wrapper.rb:250:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activerecord-4.1.12/lib/active_record/railties/controller_runtime.rb:18:in `process_action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/abstract_controller/base.rb:136:in `process'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionview-4.1.12/lib/action_view/rendering.rb:30:in `process'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal.rb:196:in `dispatch'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal/rack_delegation.rb:13:in `dispatch'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_controller/metal.rb:232:in `block in action'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/routing/route_set.rb:82:in `dispatch'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/routing/route_set.rb:50:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/journey/router.rb:73:in `block in call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/journey/router.rb:59:in `each'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/journey/router.rb:59:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/routing/route_set.rb:692:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/etag.rb:23:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/conditionalget.rb:35:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/head.rb:11:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/params_parser.rb:27:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/flash.rb:254:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/session/abstract/id.rb:225:in `context'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/session/abstract/id.rb:220:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/cookies.rb:562:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activerecord-4.1.12/lib/active_record/query_cache.rb:36:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activerecord-4.1.12/lib/active_record/connection_adapters/abstract/connection_pool.rb:621:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/callbacks.rb:29:in `block in call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:82:in `run_callbacks'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/callbacks.rb:27:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/remote_ip.rb:76:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/debug_exceptions.rb:17:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/show_exceptions.rb:30:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/lograge-0.3.6/lib/lograge/rails_ext/rack/logger.rb:15:in `call_app'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/rack/logger.rb:20:in `block in call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/tagged_logging.rb:68:in `block in tagged'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/tagged_logging.rb:26:in `tagged'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/tagged_logging.rb:68:in `tagged'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/rack/logger.rb:20:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/request_id.rb:21:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/methodoverride.rb:21:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/runtime.rb:17:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/activesupport-4.1.12/lib/active_support/cache/strategy/local_cache_middleware.rb:26:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/rack-1.5.5/lib/rack/sendfile.rb:112:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/actionpack-4.1.12/lib/action_dispatch/middleware/ssl.rb:24:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/engine.rb:514:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/application.rb:144:in `call'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/railtie.rb:194:in `public_send'
/var/www/arvados-workbench/shared/vendor_bundle/ruby/2.3.0/gems/railties-4.1.12/lib/rails/railtie.rb:194:in `method_missing'
/usr/lib/ruby/vendor_ruby/phusion_passenger/rack/thread_handler_extension.rb:94:in `process_request'
/usr/lib/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:157:in `accept_and_process_next_request'
/usr/lib/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
/usr/lib/ruby/vendor_ruby/phusion_passenger/request_handler.rb:415:in `block (3 levels) in start_threads'
/usr/lib/ruby/vendor_ruby/phusion_passenger/utils.rb:111:in `block in create_thread_and_abort_on_exception'

Subtasks 1 (0 open1 closed)

Task #10828: ReviewResolvedRadhika Chippada01/05/2017Actions
Actions

Also available in: Atom PDF