Bug #15602
openWorkbench Integration tests errors
Description
We see some errors when running this tests, they randomly fail even when we run the same commit, sometimes it pass and others it fail.
these are for example, errors found on different runs:
11:19:55 3) Error:
11:19:55 AnonymousAccessTest#test_view_file:
11:19:55 Capybara::ExpectationNotMet: expected to find text "8pkt3bt0ebd9shrni6p8pda42295gsz17eo86ejof7pbiq0ozwy11z5s03q8ycg9kz976jf6nnt6c9maoipil1vgsoxrvuvqnj8" in "Internal Server Error ActionController::UnknownFormat WEBrick/1.4.2 (Ruby/2.5.5/2019-03-15) at 127.0.0.1:44151"
11:19:55 test/integration/anonymous_access_test.rb:126:in `block in <class:AnonymousAccessTest>'
11:19:55 test/test_helper.rb:285:in `run'
11:19:55
11:19:55 Error:
11:19:55 AnonymousAccessTest#test_view_file:
11:19:55 ActionController::UnknownFormat: ActionController::UnknownFormat
11:19:55 app/controllers/application_controller.rb:58:in `render_error'
11:19:55 app/controllers/application_controller.rb:106:in `render_exception'
15:46:22 2) Error:
15:46:22 ContainerRequestsTest#test_set_input_ex_float_with_12.34:
15:46:22 Capybara::ElementNotFound: Unable to find css ".editable[data-name='[mounts][/var/lib/cwl/cwl.input.json][content][ex_float]']"
15:46:22 test/integration/container_requests_test.rb:25:in `block (2 levels) in <class:ContainerRequestsTest>'
15:46:22 test/test_helper.rb:285:in `run'
15:46:22 6) Error:
15:46:22 AnonymousAccessTest#test_anonymous_user_accesses_subprojects_tab_in_shared_project:
15:46:22 AbstractController::DoubleRenderError: Render and/or redirect were called multiple times in this action. Please note that you may only call render OR redirect, and at most once per action. Also note that neither redirect nor render terminate execution of the action, so if you want to exit an action after redirecting, you need to do something like "redirect_to(...) and return".
15:46:22 app/controllers/application_controller.rb:63:in `block (2 levels) in render_error'
15:46:22 app/controllers/application_controller.rb:58:in `render_error'
15:46:22 app/controllers/application_controller.rb:106:in `render_exception'
17:35:34 5) Error:
17:35:34 ProjectsTest#test_Create_a_project_and_move_it_into_a_different_project:
17:35:34 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45155/projects?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
17:35:34 test/integration/projects_test.rb:44:in `block in <class:ProjectsTest>'
17:35:34 test/test_helper.rb:285:in `run'
17:35:34
17:35:34 6) Error:
17:35:34 ProjectsTest#test_error_while_loading_tab:
17:35:34 Capybara::ExpectationNotMet: expected to find css "tr[data-kind=\"arvados#trait\"]" but there were no matches
17:35:34 test/integration/projects_test.rb:559:in `block (3 levels) in <class:ProjectsTest>'
17:35:34 test/integration/projects_test.rb:558:in `block (2 levels) in <class:ProjectsTest>'
17:35:34 test/integration/projects_test.rb:549:in `block in <class:ProjectsTest>'
17:35:34 test/test_helper.rb:285:in `run'
17:35:34
17:35:34 7) Error:
17:35:34 ProjectsTest#test_first_tab_loads_data_when_visiting_other_tab_directly:
17:35:34 Net::ReadTimeout: Net::ReadTimeout
17:35:34 test/integration/projects_test.rb:579:in `block in <class:ProjectsTest>'
17:35:34 test/test_helper.rb:285:in `run'
17:35:34
17:35:34 Error:
17:35:34 ProjectsTest#test_first_tab_loads_data_when_visiting_other_tab_directly:
17:35:34 Net::ReadTimeout: Net::ReadTimeout
17:35:34 test/integration_helper.rb:199:in `screenshot'
17:35:34 test/integration_helper.rb:210:in `block in <class:IntegrationTest>'
17:35:34 test/test_helper.rb:295:in `after_teardown'
17:35:34 test/test_helper.rb:285:in `run'
17:35:34
17:35:34 8) Error:
17:35:34 ProjectsTest#test_move_selected_and_remove_selected_actions_not_available_when_current_user_cannot_write_to_project:
17:35:34 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45155/projects/zzzzz-j7d0g-zhxawtyetzwc5f0?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
17:35:34 test/integration/projects_test.rb:374:in `block in <class:ProjectsTest>'
17:35:34 test/test_helper.rb:285:in `run'
11:28:24 2) Error:
11:28:24 ProjectsTest#test_error_while_loading_tab:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:544:in `block in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 3) Failure:
11:28:24 ProjectsTest#test_project_owner_can_manage_sharing_for_another_group [/tmp/workspace/developer-run-tests-apps-workbench-integration/apps/workbench/test/helpers/share_object_helper.rb:56]:
11:28:24 failed to change access level on new share
11:28:24
11:28:24 4) Error:
11:28:24 ProjectsTest#test_project_owner_can_manage_sharing_for_another_user:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/groups/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/helpers/share_object_helper.rb:8:in `show_object_using'
11:28:24 test/integration/projects_test.rb:116:in `block in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 5) Error:
11:28:24 ProjectsTest#test_projects_not_publicly_sharable_when_anonymous_browsing_disabled:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:86:in `open_groups_sharing'
11:28:24 test/integration/projects_test.rb:97:in `block in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 6) Error:
11:28:24 ProjectsTest#test_projects_publicly_sharable_when_anonymous_browsing_enabled:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:86:in `open_groups_sharing'
11:28:24 test/integration/projects_test.rb:107:in `block in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 7) Error:
11:28:24 ProjectsTest#test_run_a_process_using_template_Workflow_with_input_specifications_in_a_project:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:742:in `block (2 levels) in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 8) Error:
11:28:24 ProjectsTest#test_scroll_collections_tab_for_project_with_10_collections_with_10_objects:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-0010collections?api_token=1234k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:425:in `scroll_setup'
11:28:24 test/integration/projects_test.rb:490:in `block (2 levels) in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 9) Error:
11:28:24 ProjectsTest#test_scroll_collections_tab_for_project_with_10_collections_with_10_objects_with_ascending_sort_(case_insensitive):
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-0010collections?api_token=1234k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:425:in `scroll_setup'
11:28:24 test/integration/projects_test.rb:506:in `block (2 levels) in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run'
11:28:24
11:28:24 10) Error:
11:28:24 ProjectsTest#test_scroll_collections_tab_for_project_with_201_collections_with_201_objects:
11:28:24 Capybara::Poltergeist::TimeoutError: Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:45941/projects/zzzzz-j7d0g-0201collections?api_token=1234k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
11:28:24 test/integration/projects_test.rb:425:in `scroll_setup'
11:28:24 test/integration/projects_test.rb:490:in `block (2 levels) in <class:ProjectsTest>'
11:28:24 test/test_helper.rb:285:in `run
12:01:33 4) Error:
12:01:33 CollectionUploadTest#test_Create_new_collection_using_upload_button:
12:01:33 Capybara::Poltergeist::DeadClient: PhantomJS client died while processing {"name":"visit","args":["http://127.0.0.1:37811/projects/zzzzz-j7d0g-v955i6s2oi1cbso?api_token=3kg6k6lzmp9kj5cpkcoxie963cmvjahbt2fod9zru30k1jqdmi"]}
12:01:33 test/integration/collection_upload_test.rb:34:in `block in <class:CollectionUploadTest>'
12:01:33 test/test_helper.rb:285:in `run'
Updated by Nico César over 5 years ago
I'm comparing https://ci.curoverse.com/job/run-tests-apps-workbench-integration/3058/artifact/tmp/nginx_access.log (failed) with https://ci.curoverse.com/job/run-tests-apps-workbench-integration/3060/artifact/tmp/nginx_access.log (successful)
the main difference that I see is that 3058 has several times that 3 database/reset are called at once. This is what caught my eye (note the timestamps):
17:10:16 ARVADOS_API_HOST=0.0.0.0:52073 17:10:16 ARVADOS_TEST_API_HOST=0.0.0.0:42573 (..) 17:13:18 SmokeTest#test_mithril_test_page = 1.81 s = . 17:13:19 Using port 37603 for selenium 17:13:30 Saved ./tmp/workbench-fail-2.png 17:13:31 AjaxErrorsTest#test_load_pane_with_deleted_session = 12.83 s = S 17:13:32 Saved ./tmp/workbench-fail-3.png 17:13:32 AjaxErrorsTest#test_load_pane_with_expired_token = 1.11 s = S 17:13:35 JobsTest#test_view_job_with_components_as_active_user = 2.89 s = .
[28/Aug/2019:17:13:17 +0000] "-" controller 200 196847 0.343 GET "https://0.0.0.0:52073/discovery/v1/apis/arvados/v1/rest" 127.0.0.1:56050 "http://127.0.0.1:33827/tests/mithril?" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34" [28/Aug/2019:17:13:18 +0000] "" controller 200 16 0.919 POST "https://0.0.0.0:52073/database/reset" 127.0.0.1:56054 "-" "HTTPClient/1.0 (2.8.3, ruby 2.5.5 (2019-03-15))" [28/Aug/2019:17:13:19 +0000] "" controller 200 16 0.896 POST "https://0.0.0.0:52073/database/reset" 127.0.0.1:56060 "-" "HTTPClient/1.0 (2.8.3, ruby 2.5.5 (2019-03-15))" [28/Aug/2019:17:13:32 +0000] "" controller 200 16 1.025 POST "https://0.0.0.0:52073/database/reset" 127.0.0.1:56268 "-" "HTTPClient/1.0 (2.8.3, ruby 2.5.5 (2019-03-15))" [28/Aug/2019:17:13:32 +0000] "req-bx4pocu9qao4m3fjtm8k" controller 200 778 0.015 POST "https://0.0.0.0:52073/arvados/v1/users/current" 127.0.0.1:56296 "-" "HTTPClient/1.0 (2.8.3, ruby 2.5.5 (2019-03-15))" [28/Aug/2019:17:13:32 +0000] "req-4564ce5wjk5q1b6x83c8" controller 200 778 0.015 POST "https://0.0.0.0:52073/arvados/v1/users/current" 127.0.0.1:56296 "-" "HTTPClient/1.0 (2.8.3, ruby 2.5.5 (2019-03-15))"
as you can see API server got contacted al from 127.0.0.1:56054 . 127.0.0.1:56060 and 127.0.0.1:56268 clients.
Are we running into a race condition where the database/db_engine is either locked or severally delayed to accept transaction?