Bug #14966
closed[API] Fix hanging test - suspect permission changes
Description
test console https://ci.curoverse.com/job/run-tests-services-api/2605/console
... 12:55:16 Arvados::V1::KeepDisksControllerTest#test_refuse_to_add_keep_disk_without_admin_token = 0.01 s = . 12:55:16 Arvados::V1::KeepDisksControllerTest#test_search_keep_services_with_'any'_operator = 0.03 s = . [hangs here]
ps
postgres 15393 0.4 1.1 329040 91640 ? Ss 16:49 0:18 postgres: 9.6/main: ci_test_user arvados_test_api ::1(40670) REFRESH MATERIALIZED VIEW waiting postgres 15519 0.2 1.0 318256 82320 ? Ss 16:50 0:08 postgres: 9.6/main: ci_test_user arvados_test_api ::1(40706) idle in transaction postgres 15521 0.2 1.2 336216 92800 ? Ss 16:50 0:08 postgres: 9.6/main: ci_test_user arvados_test_api ::1(40708) PARSE waiting
Suspicious of a race condition in the async-permission-update changes recently merged in #13593.
Related issues
Updated by Tom Clegg over 5 years ago
- Related to Bug #13593: [API] Sequence of "create group" requests runs slowly, and can crash API server added
Updated by Lucas Di Pentima over 5 years ago
- Assigned To set to Lucas Di Pentima
- Target version set to 2019-03-27 Sprint
Updated by Lucas Di Pentima over 5 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 5 years ago
Updates at 0e9aa62c7 - branch 14966-api-test-hang-fix
Test run: https://ci.curoverse.com/job/developer-run-tests/1123/
- Facts
- Minitest has a feature that’s active by default called ’transactional fixtures’ that encloses every test in a transaction so it’s rolled back before running the next.
- Nested transactions share the same database connection (https://apidock.com/rails/ActiveRecord/ConnectionAdapters/DatabaseStatements/transaction)
- The permissions graph update is run inside a transaction
- Hypothesis
- When a permission-changing operation is made within a test, we’re getting nested transactions that share the same database connection. This is usually fine because both run within the same thread, but for this special case that we expect the inner transaction to run concurrently in its own thread, it doesn’t happen because the connection is already occupied and sits waiting to be released before continuing.
- Fix
- Although I already tried to manually ask for a new connection using
ActiveRecord::Base.connection_pool.with_connection do ... end
onrefresh_permission_view()
, it seems there’s no obvious way to ask for a new connection so instead the proposed fix would be to move this special test on its own test case class, disabling transactional fixtures. - The “cons” that this has is that the changes made on this test would leak to other tests, so we have to make sure to clean up before finishing.
- Although I already tried to manually ask for a new connection using
Updated by Peter Amstutz over 5 years ago
Lucas Di Pentima wrote:
Updates at 0e9aa62c7 - branch
14966-api-test-hang-fix
Test run: https://ci.curoverse.com/job/developer-run-tests/1123/
- Facts
- Minitest has a feature that’s active by default called ’transactional fixtures’ that encloses every test in a transaction so it’s rolled back before running the next.
- Nested transactions share the same database connection (https://apidock.com/rails/ActiveRecord/ConnectionAdapters/DatabaseStatements/transaction)
- The permissions graph update is run inside a transaction
- Hypothesis
- When a permission-changing operation is made within a test, we’re getting nested transactions that share the same database connection. This is usually fine because both run within the same thread, but for this special case that we expect the inner transaction to run concurrently in its own thread, it doesn’t happen because the connection is already occupied and sits waiting to be released before continuing.
- Fix
- Although I already tried to manually ask for a new connection using
ActiveRecord::Base.connection_pool.with_connection do ... end
onrefresh_permission_view()
, it seems there’s no obvious way to ask for a new connection so instead the proposed fix would be to move this special test on its own test case class, disabling transactional fixtures.- The “cons” that this has is that the changes made on this test would leak to other tests, so we have to make sure to clean up before finishing.
I'm not sure I totally understand the explanation, but agree that transactions / connection pooling could be causing trouble. Since it is an integration test it seems reasonable to run the test without the an outer transaction. Couple notes:
The test should be wrapped in a begin ... ensure .. end
.
To reset the database, suggest using the database reset endpoint:
post '/database/reset', {}, admin_auth assert_response :success
typo "accesible" -> "accessible"
Updated by Lucas Di Pentima over 5 years ago
Updates at 0145b3654
Test run: https://ci.curoverse.com/job/developer-run-tests/1126/
Instead of resetting the database inside an ensure
block, do it at test case's teardown.
Updated by Lucas Di Pentima over 5 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|dc71f08c4a78e6e53a4da6f88fd054a9e39600ad.