Project

General

Profile

Actions

Bug #13630

open

locking api_client_authorizations for update can deadlock ContainersController#update

Added by Joshua Randall almost 6 years ago. Updated about 2 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
API
Target version:
Story points:
-
Release:
Release relationship:
Auto

Description

We've observed an apparent deadlock (though not detected by postgres) when a containers update request (ContainersController#update ) calls assign_auth which results in an update_attributes call on api_client_authorizations.

When this happened, the database ended up with a number of queries of the form:

UPDATE "api_client_authorizations" SET "expires_at" = $1, "updated_at" = $2 WHERE "api_client_authorizations"."id" = $3

All of which were waiting on row and/or tuple exclusive locks (i.e. none of those queries were actively running - they were ALL waiting). We let that go for over 45m and saw every new query get stuck in the same situation before we manually killed all of the waiting postgres processes to clear them out.

Partial stack trace (from new relic) leading up to the stuck query:

…/active_record/mass_assignment_security/
persistence.rb:  65:in `block in update'
…/active_record/mass_assignment_security/
persistence.rb:  63:in `update'
…api-db/arvados-api/current/app/models/
arvados_model.rb: 125:in `update_attributes'
…dos-api-db/arvados-api/current/app/models/
container.rb: 492:in `assign_auth'
…model-4.2.10/lib/active_model/validations/
callbacks.rb: 113:in `run_validations!'
…ems/activemodel-4.2.10/lib/active_model/
validations.rb: 338:in `valid?'
…/active_record/mass_assignment_security/
persistence.rb:  77:in `block in update!'
…/active_record/mass_assignment_security/
persistence.rb:  75:in `update!'
…-api/current/app/controllers/
application_controller.rb: 119:in `update'
…nt/app/controllers/arvados/v1/
containers_controller.rb:  30:in `block in update'
…nt/app/controllers/arvados/v1/
containers_controller.rb:  27:in `update'
…-api/current/app/controllers/
application_controller.rb: 398:in `block in set_current_request_id'
…-api/current/app/controllers/
application_controller.rb: 397:in `set_current_request_id'
….0/gems/actionview-4.2.10/lib/action_view/
rendering.rb:  30:in `process'
…uby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
strategy.rb: 186:in `call!'
…uby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
strategy.rb: 164:in `call'
…ruby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
builder.rb:  63:in `call'
…or_bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
etag.rb:  24:in `call'
…ruby/2.3.0/gems/rack-1.6.10/lib/rack/
conditionalget.rb:  38:in `call'
…or_bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
head.rb:  13:in `call'
…2.3.0/gems/rack-1.6.10/lib/rack/session/abstract/
id.rb: 225:in `context'
…2.3.0/gems/rack-1.6.10/lib/rack/session/abstract/
id.rb: 220:in `call'
…ms/lograge-0.10.0/lib/lograge/rails_ext/rack/
logger.rb:  15:in `call_app'
…ms/request_store-1.4.1/lib/request_store/
middleware.rb:  19:in `call'
…ruby/2.3.0/gems/rack-1.6.10/lib/rack/
methodoverride.rb:  22:in `call'
…bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
runtime.rb:  18:in `call'
…undle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
sendfile.rb: 113:in `call'
…vados-api/current/app/middlewares/
arvados_api_token.rb:  63:in `call'
…uby/phusion_passenger/rack/
thread_handler_extension.rb:  97:in `process_request'
…by/phusion_passenger/request_handler/
thread_handler.rb: 157:in `accept_and_process_next_request'
…by/phusion_passenger/request_handler/
thread_handler.rb: 110:in `main_loop'
…/ruby/vendor_ruby/phusion_passenger/
request_handler.rb: 415:in `block (3 levels) in start_threads'
   /usr/lib/ruby/vendor_ruby/phusion_passenger/
utils.rb: 113:in `block in create_thread_and_abort_on_exception'

Actions #1

Updated by Joshua Randall almost 6 years ago

We have mitigated this issue by adding a timeout on all postgres statements - there is no sense in the API server waiting for postgres statements for longer than the request timeout!

We modified the `initialize` method in `models/arvados_model.rb` so that it looks like this:

  def initialize raw_params={}, *args
    ActiveRecord::Base.connection.execute('SET statement_timeout = 300000')
    super(self.class.permit_attribute_params(raw_params), *args)
  end

Actions #4

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
Actions #5

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future
Actions

Also available in: Atom PDF