Project

General

Profile

Actions

Bug #14983

closed

[API] searching for "test" on c97qk takes a long time

Added by Peter Amstutz about 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
-
Story points:
-

Description

This query:

https://c97qk.arvadosapi.com/arvados/v1/groups/contents?limit=10&offset=0&include_trash=false&filters=[[%22any%22,%22@@%22,%22test%22],[%22uuid%22,%22is_a%22,[%22arvados%23group%22,%22arvados%23collection%22,%22arvados%23containerRequest%22]],[%22uuid%22,%22is_a%22,[%22arvados%23group%22,%22arvados%23containerRequest%22,%22arvados%23collection%22]]]&order=collections.last_modified+desc,container_requests.last_modified+desc,groups.last_modified+desc

takes 37.6 seconds to return.


Related issues

Related to Arvados - Bug #14560: [1.3.0] error: ERROR: string is too long for tsvector (2299194 bytes, max 1048575 bytes)ResolvedTom CleggActions
Actions #1

Updated by Peter Amstutz about 5 years ago

  • Status changed from New to In Progress
Actions #2

Updated by Tom Morris about 5 years ago

  • Description updated (diff)
Actions #3

Updated by Tom Clegg about 5 years ago

Looks like it's mostly waiting for postgres to figure out that 787 of 484K collections match.

Takes <1s if you add &count=none.

Actions #4

Updated by Tom Clegg about 5 years ago

  • Status changed from In Progress to New
Actions #5

Updated by Tom Clegg about 5 years ago

SELECT COUNT("collections"."id") FROM "collections" WHERE (collections.owner_uuid NOT IN (SELECT target_uuid FROM materialized_permission_view WHERE trashed = 1) AND collections.is_trashed = false AND collections.uuid = collections.current_version_uuid) AND ((to_tsvector('english', substr(coalesce(owner_uuid,'') || ' ' || coalesce(modified_by_client_uuid,'') || ' ' || coalesce(modified_by_user_uuid,'') || ' ' || coalesce(portable_data_hash,'') || ' ' || coalesce(uuid,'') || ' ' || coalesce(name,'') || ' ' || coalesce(description,'') || ' ' || coalesce(properties::text,'') || ' ' || coalesce(file_names,''), 0, 8000)) @@ to_tsquery('test')) AND (1=0 OR 1=1 OR 1=0) AND (1=0 OR 1=0 OR 1=1))
Actions #6

Updated by Tom Clegg about 5 years ago

explain analyze SELECT COUNT("collections"."id") FROM "collections" WHERE (collections.owner_uuid NOT IN (SELECT target_uuid FROM materialized_permission_view WHERE trashed = 1) AND collections.is_trashed = false AND collections.uuid = collections.current_version_uuid) AND ((to_tsvector('english', substr(coalesce(owner_uuid,'') || ' ' || coalesce(modified_by_client_uuid,'') || ' ' || coalesce(modified_by_user_uuid,'') || ' ' || coalesce(portable_data_hash,'') || ' ' || coalesce(uuid,'') || ' ' || coalesce(name,'') || ' ' || coalesce(description,'') || ' ' || coalesce(properties::text,'') || ' ' || coalesce(file_names,''), 0, 8000)) @@ to_tsquery('test')) AND (1=0 OR 1=1 OR 1=0) AND (1=0 OR 1=0 OR 1=1))

 Aggregate  (cost=5510.48..5510.49 rows=1 width=4) (actual time=24189.179..24189.179 rows=1 loops=1)
   ->  Index Scan using index_collections_on_owner_uuid_and_name on collections  (cost=6.47..5510.46 rows=5 width=4) (actual time=45.039..24189.151 rows=32 loops=1)
         Filter: ((NOT (hashed SubPlan 1)) AND (to_tsvector('english'::regconfig, substr((((((((((((((((((COALESCE(owner_uuid, ''::character varying))::text || ' '::text) || (COALESCE(modified_by_client_uuid, ''::character varying))::text) || ' '::text) || (COALESCE(modified_by_user_uuid, ''::character varying))::text) || ' '::text) || (COALESCE(portable_data_hash, ''::character varying))::text) || ' '::text) || (COALESCE(uuid, ''::character varying))::text) || ' '::text) || (COALESCE(name, ''::character varying))::text) || ' '::text) || (COALESCE(description, ''::character varying))::text) || ' '::text) || COALESCE((properties)::text, ''::text)) || ' '::text) || COALESCE(file_names, ''::text)), 0, 8000)) @@ to_tsquery('test'::text)))
         Rows Removed by Filter: 479437
         SubPlan 1
           ->  Index Only Scan using permission_target_trashed on materialized_permission_view  (cost=0.28..6.04 rows=1 width=28) (actual time=0.017..0.048 rows=34 loops=1)
                 Index Cond: (trashed = 1)
                 Heap Fetches: 34
 Planning time: 0.278 ms
 Execution time: 24189.271 ms
Actions #7

Updated by Tom Clegg about 5 years ago

The index uses substr(..., 1000000) but the query uses substr(..., 8000).

"explain analyze" execution time drops from 24 s to 0.3 s with the substr() argument changed to 1000000.

Actions #8

Updated by Tom Clegg about 5 years ago

  • Related to Bug #14560: [1.3.0] error: ERROR: string is too long for tsvector (2299194 bytes, max 1048575 bytes) added
Actions #10

Updated by Peter Amstutz over 4 years ago

  • Target version deleted (To Be Groomed)
  • Status changed from New to Closed
Actions

Also available in: Atom PDF