Actions
Bug #15902
closedVery slow query for large workflows
Story points:
-
Release:
Release relationship:
Auto
Description
A customer reports that the query below takes 32 seconds for a ~3,000 container workflow:
duration: 32570.970 ms execute <unnamed>: SELECT container_requests."uuid", container_requests."owner_uuid", container_requests."created_at", container_requests."modified_at", container_requests."modified_by_client_uuid", container_requests."modified_by_user_uuid", container_requests."name", container_requests."description", container_requests."properties", container_requests."state", container_requests."requesting_container_uuid", container_requests."container_uuid", container_requests."container_count_max", container_requests."runtime_constraints", container_requests."container_image", container_requests."environment", container_requests."cwd", container_requests."command", container_requests."output_path", container_requests."priority", container_requests."expires_at", container_requests."filters", container_requests."container_count", container_requests."use_existing", container_requests."scheduling_parameters", container_requests."output_uuid", container_requests."log_uuid", container_requests."output_name", container_requests."output_ttl"
FROM "container_requests"
WHERE (container_requests.owner_uuid NOT IN (SELECT target_uuid FROM materialized_permission_view WHERE trashed = 1) ) AND (container_requests.requesting_container_uuid in ('zzzzz-dz642-r9n91uyk20xowyf',...[~3000 UUIDs omitted]
) ORDER BY container_requests.modified_at desc, container_requests.uuid LIMIT $1 OFFSET $2
< 2019-11-26 10:50:27.162 EST >DETAIL: parameters: $1 = '1000', $2 = '0'
Here's the EXPLAIN ANALYZE for the original query:
Limit (cost=2385.09..16840.05 rows=1000 width=791) (actual time=32385.658..32385.658 rows=0 loops=1)
-> Index Scan using index_container_requests_on_modified_at_uuid on container_requests (cost=2385.09..1995593.21 rows=137891 width=791) (actual time=32385.656..32385.656 rows=0 loops=1)
Filter: ((NOT (hashed SubPlan 1)) AND ((requesting_container_uuid)::text = ANY ('{zzzzz-dz642-r9n91uyk20xowyf,...}'::text[])))
Rows Removed by Filter: 376597
SubPlan 1
-> Index Only Scan using permission_target_trashed on materialized_permission_view (cost=0.42..2380.40 rows=1709 width=28) (actual time=0.050..0.351 rows=95 loops=1)
Index Cond: (trashed = 1)
Heap Fetches: 95
Planning time: 29.160 ms
Execution time: 32385.751 ms
Deleting the index that the query planner incorrectly chose, index_container_requests_on_modified_at_uuid, dropped the query time from 30+ seconds to ~50 msec.
Limit (cost=83535.57..83538.07 rows=1000 width=780) (actual time=54.419..54.419 rows=0 loops=1)
-> Sort (cost=83535.57..83877.22 rows=136657 width=780) (actual time=54.417..54.417 rows=0 loops=1)
Sort Key: container_requests.modified_at DESC, container_requests.uuid
Sort Method: quicksort Memory: 25kB
-> Index Scan using index_container_requests_on_requesting_container_uuid on container_requests (cost=635.83..76042.82 rows=136657 width=780) (actual time=54.323..54.323 rows=0 loops=1)
Before this was done, a full VACUUM ANALYZE had no affect on performance, but dropping the ORDER BY clause from the query also dramatically improved the query times.
One thing that seems strange is that the query is ordering by not only modified_at
, but also UUID which doesn't really seem to add any value.
We should see if we can adjust the indexes and/or query in a way which the query planner can optimize correctly.
Actions