Bug #20447

closed

Container table lock contention

Added by Peter Amstutz 12 months ago. Updated 12 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Story points:
-
Release relationship:
Auto

Description

I need to look at postgres status to see what is going on, but I have a theory:

  1. We put a "big lock" around the containers table (#20240): all write operations have to take an exclusive lock on the whole table (a rough sketch of this pattern follows the list). Unfortunately this includes container operations that don't affect priorities, though it may be possible to narrow that.
  2. This means all container operations now have to wait to get the lock
  3. We also added a feature whereby each time a "running containers probe" happens, it updates the container's "cost" field on the API server (#19967)
  4. This means write operations on containers are now happening much much more frequently than just when containers change state
  5. As a result, requests involving containers are forced to wait in line, filling up the request queue and making everything slow.
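
To make the contention concrete, here is a rough SQL sketch of the pattern from step 1. These are not the literal statements the API server issues, and the UUID and cost value are made up:

    BEGIN;
    -- Blocks until no other transaction holds a conflicting lock on the table.
    LOCK TABLE containers IN EXCLUSIVE MODE;
    -- Even a write that doesn't affect priority, such as a cost update,
    -- has to wait its turn (UUID and value are hypothetical).
    UPDATE containers SET cost = 1.23 WHERE uuid = 'zzzzz-dz642-0123456789abcde';
    COMMIT;

In Postgres, EXCLUSIVE mode conflicts with the ROW EXCLUSIVE lock taken by ordinary UPDATE/INSERT and with itself, so all writers serialize behind each other, while plain SELECTs can still proceed.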

On the plus side, the behavior of the dispatcher to back off when it sees 500 errors seems to be successfully keeping the system load from spiraling out of control.

This also suggests a short-term fix for system load: increase ProbeInterval.
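
For reference, the mitigation would look something like this in the cluster config, assuming the relevant knob is the cloud dispatcher's Containers.CloudVMs.ProbeInterval (the exact path and values here are my guess, not tested):

    # Hypothetical excerpt of the cluster config (config.yml); adjust for
    # the dispatcher actually in use.
    Clusters:
      zzzzz:
        Containers:
          CloudVMs:
            ProbeInterval: 30s   # assumed default is on the order of 10s

A longer interval means each running container is probed, and its cost written back, less often, at the price of reacting more slowly to container state changes.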

Update:

Some supporting evidence:

  1. After Lucas adjusted ProbeInterval this morning, the concurrent requests are down.
  2. I was able to connect to the database and look at active queries (roughly the query sketched after this list). Even after changing ProbeInterval, about 30%-40% of pending queries are still "LOCK TABLE containers IN EXCLUSIVE mode"
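
Roughly the kind of thing I was looking at, using only the standard pg_stat_activity view (a sketch; adjust to taste):

    -- Count non-idle backends grouped by what they are waiting on and the
    -- start of their query text; a pile-up of "LOCK TABLE containers ..."
    -- statements shows up as one large group.
    SELECT wait_event_type, state, left(query, 60) AS query_prefix, count(*)
      FROM pg_stat_activity
     WHERE state <> 'idle'
     GROUP BY wait_event_type, state, left(query, 60)
     ORDER BY count(*) DESC;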

Subtasks 1 (0 open, 1 closed)

Task #20460: Review 20447-less-table-locking (Resolved, Peter Amstutz, 05/01/2023)
