Bug #21540


occasional container_requests deadlock

Added by Peter Amstutz 2 months ago. Updated 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: API
Target version:
Story points: -

Description

arvados.cwl-runner: [container example] error submitting container

<HttpError 422 when requesting https://zzzzz.example.com/arvados/v1/container_requests/zzzzz-xvhdp-u047nzgnc3jdkd4?alt=json returned

"//railsapi.internal/arvados/v1/container_requests/zzzzz-xvhdp-u047nzgnc3jdkd4: 422 Unprocessable Entity: #<ActiveRecord::Deadlocked: PG::TRDeadlockDetected: ERROR: deadlock detected

DETAIL: Process 13792 waits for AccessExclusiveLock on tuple (801533,5) of relation 16562 of database 16400; blocked by process 31312.

Process 31312 waits for ShareLock on transaction 508183640; blocked by process 19685.

Process 19685 waits for ShareLock on transaction 508183565; blocked by process 13792.

HINT: See server log for query details. : select 1 from containers where containers.uuid in ( select pri_container_uuid from container_tree($1) UNION select container_requests.requesting_container_uuid from container_requests where container_requests.container_uuid = $1 and container_requests.state = 'Committed' and container_requests.requesting_container_uuid is not NULL ) order by containers.uuid for update > (req-4d3u0781f6rol0q7xux1)">

This is almost certainly a lock ordering issue. We should:

  1. Try to figure out under what circumstances it happens and whether it is fixable/avoidable
  2. Handle ActiveRecord::Deadlocked exceptions as 500 Internal Server Error so they are retried by the client
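
To illustrate why item 2 matters, here is a minimal client-side sketch, assuming a client that retries 5xx responses but treats 4xx responses as permanent failures. `HttpStatusError` and `call_with_retries` are hypothetical names made up for this example; they are not part of the Arvados SDK or of any HTTP library. Under that assumption, a deadlock reported as 422 fails immediately, while the same deadlock reported as 500 would be retried.

```python
import time


class HttpStatusError(Exception):
    """Hypothetical stand-in for an HTTP client error carrying a status code."""

    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def call_with_retries(request, max_attempts=4, base_delay=0.0):
    """Call request(); retry only on 5xx responses, with exponential backoff.

    4xx responses (including 422) are re-raised immediately, because the
    client assumes that repeating the same request cannot succeed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except HttpStatusError as err:
            retryable = 500 <= err.status < 600
            if not retryable or attempt == max_attempts:
                raise
            # Back off before the next attempt: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A deadlock is a transient condition, so retrying the whole request after a short delay is usually enough for one of the competing transactions to finish first. This only masks the symptom; the lock ordering question in item 1 still applies.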

Related issues

Related to Arvados - Bug #21547: return certain database errors as 500 so they can be retried (New, Peter Amstutz)
Actions #1

Updated by Peter Amstutz 2 months ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz 2 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 2 months ago

  • Target version changed from Development 2024-03-27 sprint to Development 2024-03-13 sprint
Actions #4

Updated by Peter Amstutz 2 months ago

  • Target version changed from Development 2024-03-13 sprint to Future
  • Subject changed from occasional container_requests deadlock to convert
Actions #5

Updated by Peter Amstutz 2 months ago

  • Subject changed from convert to occasional container_requests deadlock
Actions #6

Updated by Peter Amstutz 2 months ago

  • Related to Bug #21547: return certain database errors as 500 so they can be retried added