Project

General

Profile

Actions

Bug #22435

closed

Ensure that DB migrations do not have request timeouts by default

Added by Peter Amstutz 14 days ago. Updated 6 days ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Story points:
-
Release relationship:
Auto

Subtasks 1 (0 open1 closed)

Task #22449: Review 22435-no-migration-timeoutsResolvedTom Clegg01/16/2025Actions
Actions #1

Updated by Peter Amstutz 14 days ago

  • Position changed from -933179 to -933178
Actions #2

Updated by Peter Amstutz 14 days ago

  • Assigned To set to Tom Clegg
Actions #3

Updated by Peter Amstutz 13 days ago

  • Subtask #22449 added
Actions #4

Updated by Tom Clegg 8 days ago

  • Status changed from New to In Progress

22435-no-migration-timeouts @ 1b9c6f10d4f9786ac574dfaa3f7e43295c0e90f1 -- developer-run-tests: #4601

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • N/A
  • Code is tested and passing, both automated and manual, what manual testing was done is described.
    • ✅ tested manually by inspecting services/api/log/test.log and by changing the new timeout to 3 milliseconds and watching migrations fail
    • StandardError: An error has occurred, this and all later migrations canceled: (StandardError)
      
      PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
      
      [stack trace]
      
  • New or changed UX/UX and has gotten feedback from stakeholders.
    • N/A
  • Documentation has been updated.
    • N/A
  • Behaves appropriately at the intended scale (describe intended scale).
    • ✅ new strategy is that we have infinite patience for running migrations, no matter how big the database is
  • Considered backwards and forwards compatibility issues between client and server.
    • N/A
  • Follows our coding standards and GUI style guidelines.
Other notes
  • Implementation consists of installing a "before running any migration rake task" hook that sets statement_timeout and lock_timeout to 0.
  • I first tried using a built-in feature for this (allegedly the pg driver can use &options=-c%20statement_timeout%3D0 in the database connection string) but that gets overridden by our config loader. We could change our config loader to modify the connection string but I decided to leave that alone. Except that I noticed our config loader missed CGI-escaping some strings that it was using to build the database connection string, so I figured I might as well fix that to avoid future problems when someone puts a % in a database template name or something.
  • My magic spell for "set all the GEM_* env vars and run rake tasks" scrolled off my bash history so I finally got around to adding a "migrate" command to run-tests.sh. Now we can do "migrate", "migrate down VERSION=12345", "migrate rollback" at the What next? prompt in interactive mode.
Actions #5

Updated by Brett Smith 7 days ago

Functionally everything looks good, I like the approach. I have pushed 794c6ba1acf27af4a0d6c0d36a6b5306b861a998 with some suggested changes to the new migrate commands to reduce their impact on global state. I don't insist on them, if they offend you we can drop them, but just pushing seemed like the easiest way to communicate the point. If you have questions about any of it, happy to answer those.

Because I have my fancy new setup that uses a custom CONFIGSRC, my first attempt to run a migrate command returned this error:

TypeError: no implicit conversion of nil into String
/home/brett/Curii/arvados/services/api/config/arvados_config.rb:282:in `escape'
/home/brett/Curii/arvados/services/api/config/arvados_config.rb:282:in `<top (required)>'
/home/brett/Curii/arvados/services/api/config/application.rb:33:in `require_relative'
/home/brett/Curii/arvados/services/api/config/application.rb:33:in `<class:Application>'
/home/brett/Curii/arvados/services/api/config/application.rb:31:in `<module:Server>'
/home/brett/Curii/arvados/services/api/config/application.rb:30:in `<top (required)>'
/home/brett/Curii/arvados/services/api/Rakefile:9:in `require'
/home/brett/Curii/arvados/services/api/Rakefile:9:in `<top (required)>'
/home/brett/.local/share/gem/ruby/3.1.0/gems/rake-13.2.1/exe/rake:27:in `<top (required)>'

This is because client_encoding is not a documented field for config.yml but the code assumes it is defined following Rails' configuration pattern. I did a quick workaround by doing a ||= assignment in arvados_config.rb, but I'm not sure that's actually the right solution. Should config.yml let this be configured? Should arvados_config.rb set it unconditionally for tests like it does for collation? What happens in production when we're not using the test environment overrides in arvados_config.rb?

Actions #6

Updated by Tom Clegg 6 days ago

I've addressed the "implicit conversion of nil into String" issue with .to_s. Good catch. I forgot those defaults come from the test config, not the default config. I considered the option of leaving missing entries out of the connection string entirely instead of setting them to blank, but it seems safer to use the same connection string we were using before this branch (except for the actual encoding thing I was trying to fix, which should be strictly an improvement). Background: we put these in the test config because some tests are sensitive to encoding/locale, and we don't want to assume the test host is set up that way.

Your run-tests.sh changes LGTM, thanks.

22435-no-migration-timeouts @ 149de0bedc7c8a6c2ae3c8e47a72f311adbc7aa0 -- developer-run-tests: #4607

Actions #7

Updated by Brett Smith 6 days ago

Tom Clegg wrote in #note-6:

22435-no-migration-timeouts @ 149de0bedc7c8a6c2ae3c8e47a72f311adbc7aa0 -- developer-run-tests: #4607

LGTM, thanks.

Actions #8

Updated by Tom Clegg 6 days ago

  • Status changed from In Progress to Resolved
Actions #9

Updated by Tom Clegg 6 days ago

  • Release set to 75
Actions

Also available in: Atom PDF