Project

General

Profile

Actions

Bug #14878

closed

[API] Fix unreliable test

Added by Tom Clegg almost 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Tests
Target version:
Story points:
-

Description

"Update priority" tests fail occasionally.

From https://ci.curoverse.com/job/developer-run-tests-services-api/1095/console

  1) Failure:
UpdatePriorityTest#test_priority_0_but_should_be_>0 [/home/ci-jenkins/.jenkins-slave/workspace/developer-run-tests-services-api/services/api/test/unit/update_priority_test.rb:20]:
Expected 0 to be < 0.

Possible explanation: if a background update task (UpdatePriority.run_update_thread(), started from an after_commit callback in a previous test) is still running when the test case does an explicit inline call to UpdatePriority.update_priority(), the inline call fails to get the lock, and returns immediately. But the background task's updates do not necessarily incorporate the latest priority updates, and even if they do, the resulting updates are not necessarily committed before the test case checks for them.

Possible ways to fix:
  • Don't start the background update thread in the test environment.
  • Pass a "block if needed" flag to update_priority(), so the test case works even when background tasks are running.

Subtasks 1 (0 open1 closed)

Task #14955: Review 14878-priority-raceResolvedTom Clegg05/28/2019Actions
Actions #1

Updated by Tom Clegg almost 6 years ago

  • Description updated (diff)
Actions #2

Updated by Tom Morris almost 6 years ago

  • Target version set to 2019-03-27 Sprint
Actions #3

Updated by Lucas Di Pentima almost 6 years ago

  • Assigned To set to Lucas Di Pentima
Actions #4

Updated by Tom Morris almost 6 years ago

  • Target version changed from 2019-03-27 Sprint to 2019-04-10 Sprint
Actions #5

Updated by Lucas Di Pentima almost 6 years ago

  • Target version changed from 2019-04-10 Sprint to 2019-04-24 Sprint
Actions #6

Updated by Lucas Di Pentima over 5 years ago

  • Target version changed from 2019-04-24 Sprint to 2019-05-08 Sprint
Actions #7

Updated by Lucas Di Pentima over 5 years ago

  • Target version changed from 2019-05-08 Sprint to 2019-05-22 Sprint
Actions #8

Updated by Lucas Di Pentima over 5 years ago

  • Target version changed from 2019-05-22 Sprint to 2019-06-05 Sprint
Actions #9

Updated by Tom Clegg over 5 years ago

  • Assigned To changed from Lucas Di Pentima to Tom Clegg
Actions #10

Updated by Tom Clegg over 5 years ago

On second thought, disabling the background thread (and updating synchronously instead) in test cases seems undesirable: it would prevent all tests (even other components' integration tests) from experiencing async behavior.

A blocking flock() doesn't work: the background task sometimes gets the lock, then does a postgresql statement that blocks to wait for the main thread to commit/rollback. If the main thread then waits for the lock before (instead of) committing, everyone deadlocks in a way that postgresql can't detect.

Instead, solved by just skipping the lock when running "update priority" from the test case.

14878-priority-race @ 435a5df3e505dfbf67467bd02073f97e63c4c61d -- https://ci.curoverse.com/view/Developer/job/developer-run-tests/1262/

Actions #11

Updated by Tom Clegg over 5 years ago

  • Status changed from New to In Progress
Actions #12

Updated by Lucas Di Pentima over 5 years ago

I tried to make master fail by running update_priority_test.rb repeatedly on my local machine but wasn't able to reproduce the flaky behavior, even running the entire unit test suite several times.

I think it would be useful to add a comment on update_priority() explaining noblock's use, for posterity. Apart from that, it LGTM.

Actions #13

Updated by Tom Clegg over 5 years ago

FWIW, my dev box reproduced the failure about 30% of the time on master, and passed 10 times in a row with this change.

Actions #14

Updated by Tom Clegg over 5 years ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF