Project

General

Profile

Bug #13972

Updated by Tom Morris over 6 years ago

Requests to list collections by PDH and name (I think these are coming from a-c-r) sometimes take a very long time (90s+): 
 <pre> 
 { 
   "method": "GET", 
   "path": "/arvados/v1/collections", 
   "format": "json", 
   "controller": "Arvados::V1::CollectionsController", 
   "action": "index", 
   "status": 200, 
   "duration": 104537.05, 
   "view": 0.33, 
   "db": 104532.55, 
   "request_id": "req-mgxr3wrwt09hytd2jjsc", 
   "client_ipaddr": "10.101.0.7", 
   "client_auth": "eglyx-gj3su-u157eevzet2a0gt", 
   "params": { 
     "alt": "json", 
     "limit": "1", 
     "filters": "[[\"portable_data_hash\", \"=\", \"d41d8cd98f00b204e9800998ecf8427e+0\"], [\"name\", \"like\", \"gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%\"]]" 
   }, 
   "@timestamp": "2018-08-06T17:46:29.775676929Z", 
   "@version": "1", 
   "message": "[200] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)" 
 } 
 </pre> 

 <pre> 
 root@arvados-api-backend-eglyx-01:/var/www/arvados-api/current/log# grep req-mgxr3wrwt09hytd2jjsc production.log 
 [req-mgxr3wrwt09hytd2jjsc]     Collection Load (49993.9ms)    SELECT    collections."uuid", collections."owner_uuid", collections."created_at", collections."modified_by_client_uuid", collections."modified_by_user_uuid", collections."modified_at", collections."name", collections."description", collections."properties", collections."portable_data_hash", collections."replication_desired", collections."replication_confirmed", collections."replication_confirmed_at", collections."storage_classes_desired", collections."storage_classes_confirmed", collections."storage_classes_confirmed_at", collections."delete_at", collections."trash_at", collections."is_trashed" FROM "collections" WHERE (NOT EXISTS(SELECT 1 FROM materialized_permission_view WHERE trashed = 1 AND (collections.owner_uuid = target_uuid)) AND collections.is_trashed = false) AND ((collections.portable_data_hash = 'd41d8cd98f00b204e9800998ecf8427e+0') AND (collections.name like 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%'))    ORDER BY collections.modified_at desc, collections.uuid LIMIT 1 OFFSET 0 
 [req-mgxr3wrwt09hytd2jjsc]      (54538.7ms)    SELECT COUNT("collections"."id") FROM "collections" WHERE (NOT EXISTS(SELECT 1 FROM materialized_permission_view WHERE trashed = 1 AND (collections.owner_uuid = target_uuid)) AND collections.is_trashed = false) AND ((collections.portable_data_hash = 'd41d8cd98f00b204e9800998ecf8427e+0') AND (collections.name like 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%')) 
 {"method":"GET","path":"/arvados/v1/collections","format":"json","controller":"Arvados::V1::CollectionsController","action":"index","status":200,"duration":104537.05,"view":0.33,"db":104532.55,"request_id":"req-mgxr3wrwt09hytd2jjsc","client_ipaddr":"10.101.0.7","client_auth":"eglyx-gj3su-u157eevzet2a0gt","params":{"alt":"json","limit":"1","filters":"[[\"portable_data_hash\", \"=\", \"d41d8cd98f00b204e9800998ecf8427e+0\"], [\"name\", \"like\", \"gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%\"]]"},"@timestamp":"2018-08-06T17:46:29.775676929Z","@version":"1","message":"[200] GET /arvados/v1/collections (Arvados::V1::CollectionsController#index)"} 
 </pre> 

 Here is how the database explains it: 
 <pre> 
 arvados_api_production=# explain analyze SELECT    collections."uuid", collections."owner_uuid", collections."created_at", collections."modified_by_client_uuid", collections."modified_by_user_uuid", collections."modified_at", collections."name", collections."description", collections."properties", collections."portable_data_hash", collections."replication_desired", collections."replication_confirmed", collections."replication_confirmed_at", collections."storage_classes_desired", collections."storage_classes_confirmed", collections."storage_classes_confirmed_at", collections."delete_at", collections."trash_at", collections."is_trashed" COUNT("collections"."id") FROM "collections" WHERE (NOT EXISTS(SELECT 1 FROM materialized_permission_view WHERE trashed = 1 AND (collections.owner_uuid = target_uuid)) AND collections.is_trashed = false) AND ((collections.portable_data_hash = 'd41d8cd98f00b204e9800998ecf8427e+0') AND (collections.name like 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%'))    ORDER BY collections.modified_at desc, collections.uuid LIMIT 1 OFFSET 0; 
                                                                                    dependencies%')); 
                                                                                 QUERY PLAN 
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  Limit Aggregate    (cost=110280.76..110280.76 (cost=110280.74..110280.75 rows=1 width=894) width=4) (actual time=33782.855..33782.855 rows=0 time=38089.806..38089.806 rows=1 loops=1) 
    ->    Sort    (cost=110280.76..110280.77 rows=4 width=894) (actual time=33782.855..33782.855 rows=0 loops=1) 
          Sort Key: collections.modified_at DESC, collections.uuid 
          Sort Method: quicksort    Memory: 25kB 
          ->    Nested Loop Hash Anti Join    (cost=1590.26..110280.74 (cost=1591.75..110280.73 rows=4 width=894) width=4) (actual time=33782.847..33782.847 time=38089.802..38089.802 rows=0 loops=1) 
                Join Filter: 
          Hash Cond: ((collections.owner_uuid)::text = (materialized_permission_view.target_uuid)::text) 
                
          ->    Bitmap Heap Scan on collections    (cost=1590.26..110279.19 rows=5 width=894) width=32) (actual time=33782.846..33782.846 time=38089.800..38089.800 rows=0 loops=1) 
                      
                Recheck Cond: ((portable_data_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e+0'::text) 
                      
                Filter: ((NOT is_trashed) AND ((name)::text ~~ 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%'::text)) 
                      
                Rows Removed by Filter: 65732 
                      
                Heap Blocks: exact=46329 
                      
                ->    Bitmap Index Scan on index_collections_on_portable_data_hash    (cost=0.00..1590.26 rows=61293 width=0) (actual time=352.125..352.125 time=350.871..350.871 rows=65732 loops=1) 
                            
                      Index Cond: ((portable_data_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e+0'::text) 
                
          ->    Materialize Hash    (cost=0.00..1.48 (cost=1.48..1.48 rows=1 width=28) (never executed) 
                      
                ->    Seq Scan on materialized_permission_view    (cost=0.00..1.48 rows=1 width=28) (never executed) 
                            
                      Filter: (trashed = 1) 
  Planning time: 18.993 0.332 ms 
  Execution time: 33782.934 38089.866 ms 
 (18 (15 rows) 

 arvados_api_production=# explain analyze SELECT COUNT("collections"."id") FROM "collections" WHERE (NOT EXISTS(SELECT 1 FROM materialized_permission_view WHERE trashed = 1 AND (collections.owner_uuid = target_uuid)) AND collections.is_trashed = false) AND ((collections.portable_data_hash = 'd41d8cd98f00b204e9800998ecf8427e+0') AND (collections.name like 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%')); 
                                                                                 QUERY PLAN 
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  Aggregate    (cost=110280.74..110280.75 rows=1 width=4) (actual time=36774.915..36774.915 rows=1 loops=1) 
    ->    Hash Anti Join    (cost=1591.75..110280.73 rows=4 width=4) (actual time=36774.911..36774.911 rows=0 loops=1) 
          Hash Cond: ((collections.owner_uuid)::text = (materialized_permission_view.target_uuid)::text) 
          ->    Bitmap Heap Scan on collections    (cost=1590.26..110279.19 rows=5 width=32) (actual time=36774.909..36774.909 rows=0 loops=1) 
                Recheck Cond: ((portable_data_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e+0'::text) 
                Filter: ((NOT is_trashed) AND ((name)::text ~~ 'gatk-4.0.0.0-library-cram-to-gvcfs.cwl dependencies%'::text)) 
                Rows Removed by Filter: 65732 
                Heap Blocks: exact=46329 
                ->    Bitmap Index Scan on index_collections_on_portable_data_hash    (cost=0.00..1590.26 rows=61293 width=0) (actual time=580.662..580.662 rows=65732 loops=1) 
                      Index Cond: ((portable_data_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e+0'::text) 
          ->    Hash    (cost=1.48..1.48 rows=1 width=28) (never executed) 
                ->    Seq Scan on materialized_permission_view    (cost=0.00..1.48 rows=1 width=28) (never executed) 
                      Filter: (trashed = 1) 
  Planning time: 0.388 ms 
  Execution time: 36775.002 ms 
 (15 rows) 
 </pre> 

Back