Bug #21030
closedPermission table slowness
Description
Reported that permission updates on a large / busy cluster get slower and slower over time. Doing a "VACUUM FULL" fixes it but the slowness comes back after a few days.
Currently speculating that it may be due to the fact that every request that updates permission, trashed projects, or frozen projects currently creates a temporary table.
https://stackoverflow.com/questions/50366509/temporary-tables-bloating-pg-attribute
I think the queries that use temporary tables can be rewritten to use a CTE fairly easily, so that's worth a try.
Updated by Peter Amstutz about 1 year ago
- Status changed from New to In Progress
Updated by Peter Amstutz about 1 year ago
- Category deleted (
API) - Description updated (diff)
Updated by Peter Amstutz about 1 year ago
- Category set to API
- Description updated (diff)
Updated by Peter Amstutz about 1 year ago
21030-update-perm-cte @ 26510a8ee080eac922abd6c981e2f077fe1a2f58
Updated by Peter Amstutz about 1 year ago
- Target version changed from Development 2023-10-11 sprint to Development 2023-10-25 sprint
Updated by Peter Amstutz about 1 year ago
Winner winner chicken dinner
create index index_materialized_permissions_target_is_not_user3 on materialized_permissions (target_uuid, traverse_owned, (user_uuid = target_uuid or target_uuid not like '_____-tpzed-_______________'));
Updated by Peter Amstutz about 1 year ago
I'm abandoning this branch but recording it here just for history
21030-target-is-user 2e14158bd0066efa48cb971cde7f8bf69de44651
Updated by Peter Amstutz about 1 year ago
21030-update-perm-cte @ 4727334c7ef28cdb7fb2df89211c09eb5d51bef4
Note each item completed with additional detail if necessary. If an item is irrelevant to a specific branch, briefly explain why.
- All agreed upon points are implemented / addressed.
- The main bottleneck on permission updates has been identified, and the solution has been confirmed to improve performance on real-world user clusters.
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- n/a
- Code is tested and passing, both automated and manual, what manual testing was done is described
- The changes to remove temporary tables are pass all tests
- The new index has been manually confirmed to improve performance by looking at query plans & query execution times
- Documentation has been updated.
- n/a
- Behaves appropriately at the intended scale (describe intended scale).
- By avoiding a sequential scan, this both greatly improves performance and should hopefully also avoid the performance deterioration of accumulated dead rows between table vacuums.
- Considered backwards and forwards compatibility issues between client and server.
- n/a
- Follows our coding standards and GUI style guidelines.
- yes
This branch has two improvements in response to reported performance problems.
The first improvement is to eliminate the creation of temporary tables. This turned out to not be the main source of performance problems, however research indicated that it is not best practice to use temporary tables for frequently executed queries, and confirmed that the pg_attributes table was accumulating dead rows due to the use of temporary tables. The temporary tables have been replaced by Computed Table Expressions (CTEs) which behave equivalently for our purposes but apparently have less overhead.
The second improvement, which does make a large performance improvement, is a new index on materialized permissions. Here's the comment in the code:
# The inner part of compute_permission_subgraph has a query clause like this: # # where u.perm_origin_uuid = m.target_uuid AND m.traverse_owned # AND (m.user_uuid = m.target_uuid or m.target_uuid not like '_____-tpzed-_______________') # # This will end up doing a sequential scan on # materialized_permissions, which can easily have millions of # rows, unless we fully index the table for this query. In one test, # this brought the compute_permission_subgraph query from over 6 # seconds down to 250ms.
A third detail is that we confirmed that the "SET LOCAL enable_mergejoin to false;" workaround is still needed on Postgres 14, so you'll see in the branch history that I took it out and then put it back.
Updated by Brett Smith about 1 year ago
Peter Amstutz wrote in #note-11:
21030-update-perm-cte @ 4727334c7ef28cdb7fb2df89211c09eb5d51bef4
LGTM, thanks.
Updated by Peter Amstutz about 1 year ago
- Status changed from In Progress to Resolved