Bug #14009
closedcontainer request creation very slow when there are many potentially reusable containers
Description
Container request creation on our system was typically taking 90-120s (N.B. this is after disabling all audit logging so it has already been sped up dramatically from stock). We found this to be due to the queries that look up the reusable containers were the query conditions were pretty much all on columns that have no indices of any kind.
Our interim solution has been to add a hash-based index (to keep index size down) on the `command` column:
arvados_api_production=# create index index_containers_on_command_hash on containers using hash (command);
After adding this index, container request creation was sped up by ~20x (to around 5s each).
However, note that hash indices are not recommended for postgres versions < 10 as they are not included in WAL replication. An alternative would be to have explicit hash columns included in the table (which are btree indexed) that can be used to query the exact-match fields for container reuse.
In addition to making for faster reuse queries, this should also pave the way to enable formulation of client-side container queries that don't need to transmit the large text values (such as mounts, command, etc) to the API server.