Bug #20858
Updated by Peter Amstutz about 1 year ago
Observed on user cluster, the query used by @http_to_keep@ to search for a URL to see if it was previously imported into Keep or not, takes 10s - 20s to respond. It ought to be answered by an index and only take 100ms. Specifically: @r1 r1 = api.collections().list(filters=[["properties", "exists", url]]).execute()@ <pre> arvados_development=# explain analyze select uuid, properties from collections where jsonb_exists(properties, 'type'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Seq Scan on collections (cost=0.00..5616.45 rows=14732 width=112) (actual time=0.056..54.770 rows=33740 loops=1) Filter: jsonb_exists(properties, 'type'::text) Rows Removed by Filter: 10437 Planning Time: 0.249 ms Execution Time: 56.583 ms (5 rows) arvados_development=# explain analyze select uuid, properties from collections where properties ? 'type'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on collections (cost=12.34..176.59 rows=44 width=112) (actual time=12.918..43.863 rows=33740 loops=1) Recheck Cond: (properties ? 'type'::text) Heap Blocks: exact=4571 -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..12.33 rows=44 width=0) (actual time=10.725..10.726 rows=33759 loops=1) Index Cond: (properties ? 'type'::text) Planning Time: 0.252 ms Execution Time: 46.099 ms (7 rows) </pre> Apparently @?@ and @jsonb_exists()@ don't get the same treatment from the query optimizer: https://dba.stackexchange.com/questions/90002/postgresql-operator-uses-index-but-underlying-function-does-not We have the same problem, where using @?@ conflicts with the parameter replacement. Fortunately, this stack overflow ticket also provides a workaround by defining a postgres function that uses the operator, and can be inlined. <pre> arvados_development=# CREATE OR REPLACE FUNCTION jb_contains(jsonb, text) arvados_development-# RETURNS bool AS arvados_development-# 'SELECT $1 ? $2' LANGUAGE sql IMMUTABLE; CREATE FUNCTION arvados_development=# explain analyze select uuid, properties from collections where jb_contains(properties, 'type'); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on collections (cost=12.34..176.59 rows=44 width=112) (actual time=17.727..44.716 rows=33740 loops=1) Recheck Cond: (properties ? 'type'::text) Heap Blocks: exact=4571 -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..12.33 rows=44 width=0) (actual time=14.972..14.972 rows=33759 loops=1) Index Cond: (properties ? 'type'::text) Planning Time: 0.276 ms Execution Time: 46.536 ms (7 rows) </pre> url]]).execute()