Story #11807

[API] Migrate old serialized database content from YAML to JSON

Added by Tom Clegg about 1 month ago. Updated 5 days ago.

Status: Resolved
Start date: 06/05/2017
Priority: Normal
Due date:
Assignee: Tom Clegg
% Done: 100%
Category: API
Target version: 2017-07-19 sprint
Story points: 0.5
Remaining (hours): 0.00 hour
Velocity based estimate: -

Description

Test converting the Jobs or Pipeline Instances table in a representative database (4xp or qr1hi) to JSON to smoke out any potential problems as a preliminary investigation. If this shows substantial problems, spawn separate stories for them.

The Logs table is excluded and will be dealt with separately.

Collect and provide performance information to Ops to help estimate how long the migration will take and to help determine the migration strategy.


Subtasks

Task #11921: Update "upgrading" wiki with dev cluster times (Resolved, Tom Clegg)

Task #11892: Review 11807-yaml-to-json (Resolved, Radhika Chippada)

Task #11914: Migrate jobs table (Resolved, Tom Clegg)

Task #11923: Migration for other serialized columns (Resolved, Tom Clegg)

Task #11924: Load fixtures as JSON (Closed, Tom Clegg)


Related issues

Related to Arvados - Bug #11168: [API] Use JSON instead of YAML for serialized fields in d... Resolved 02/24/2017
Blocks Arvados - Story #11908: Migrate Collections.properties to JSONB In Progress 06/27/2017

Associated revisions

Revision 55aafbb0
Added by Tom Clegg 24 days ago

Merge branch '11807-yaml-to-json'

refs #11807

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Morris about 1 month ago

  • Tracker changed from Bug to Story

#2 Updated by Tom Morris about 1 month ago

  • Description updated (diff)
  • Story points set to 2.0

#3 Updated by Tom Morris about 1 month ago

  • Target version set to 2017-07-19 sprint

#4 Updated by Tom Morris about 1 month ago

  • Target version changed from 2017-07-19 sprint to 2017-07-05 sprint

#5 Updated by Tom Morris about 1 month ago

  • Target version changed from 2017-07-05 sprint to 2017-07-19 sprint

#6 Updated by Tom Morris about 1 month ago

  • Target version changed from 2017-07-19 sprint to 2017-07-05 sprint

#7 Updated by Tom Clegg about 1 month ago

  • Assignee set to Tom Clegg

#8 Updated by Tom Clegg 26 days ago

11807-yaml-to-json @ 7a5eb1b19c698f39b7cfdaafa4b3deefe556b07e
  • migrates content in the serialized columns of the "jobs" table only
  • uses a "migrate column X of table Y" function that should make subsequent column/table migrations trivial to implement

This migration can be run on a live database while an old apiserver is still running. A sysadmin who is sufficiently motivated to avoid downtime can arrange this instead of following the usual "apt-get upgrade" procedure.
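For readers who want a concrete picture of a "migrate column X of table Y" helper, here is a minimal Ruby sketch. It is not the code from the 11807-yaml-to-json branch: the names `migrate_column` and `rows` are hypothetical, the table is modelled as an in-memory array of hashes rather than a real database, and it assumes that legacy YAML values can be recognised by their leading "---" document marker.

```ruby
require 'yaml'
require 'json'

# Hypothetical stand-in for "migrate column X of table Y".
# `rows` is an in-memory array of hashes standing in for a table;
# a real migration would issue SELECT/UPDATE statements instead.
# Assumption: legacy YAML values start with the "---" document marker,
# and anything else (JSON, nil) is left untouched.
def migrate_column(rows, column)
  changed = 0
  rows.each do |row|
    value = row[column]
    next unless value.is_a?(String) && value.start_with?('---')
    # Parse the YAML text and re-encode the same data as JSON.
    row[column] = JSON.generate(YAML.safe_load(value))
    changed += 1
  end
  changed
end

# Usage: only the YAML row is rewritten; a second run changes nothing.
jobs = [
  { 'id' => 1, 'components' => "---\nscript: foo\nparams:\n  n: 1\n" },
  { 'id' => 2, 'components' => '{"already":"json"}' },
]
puts migrate_column(jobs, 'components')
puts jobs[0]['components']
puts migrate_column(jobs, 'components')
```

Because the sketch skips values that are already JSON, running it again is harmless, so the same helper could be reused for each additional column or table.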

#9 Updated by Radhika Chippada 25 days ago

I was able to verify that the migration is working correctly in my test database.

LGTM

#10 Updated by Radhika Chippada 24 days ago

I noticed this while testing this branch: I migrated my arvados_test db, and the jobs text fields are now JSON-formatted. But when I run the tests, the jobs objects are recreated from test fixtures and they again have the YAML encoding for components etc. Would this cause a problem if we were to change the db column type to json or jsonb later? Do the object creations from fixtures need to use this new serialization strategy?
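One way to picture the mixed-encoding situation described here is a coder that always writes JSON but can still read either format. The Ruby sketch below is only an illustration: `JsonOrYamlCoder` is a hypothetical name, not something taken from the Arvados code base, and it again assumes that YAML values start with "---". Rails' `serialize` accepts any object that responds to `load` and `dump`, so an object of this shape could in principle be plugged in there.

```ruby
require 'yaml'
require 'json'

# Hypothetical coder: writes JSON, reads JSON or legacy YAML.
# Assumption: YAML values are recognisable by a leading "---".
module JsonOrYamlCoder
  # Always serialize new values as JSON.
  def self.dump(obj)
    JSON.generate(obj)
  end

  # Accept both encodings when reading a stored value.
  def self.load(text)
    return nil if text.nil?
    if text.start_with?('---')
      YAML.safe_load(text)
    else
      JSON.parse(text)
    end
  end
end

# Usage: a YAML-encoded fixture value and a JSON-encoded value
# decode to the same Ruby hash.
from_yaml = JsonOrYamlCoder.load("---\nscript: foo\n")
from_json = JsonOrYamlCoder.load('{"script":"foo"}')
puts from_yaml == from_json
```

A tolerant reader like this would not help once the column type itself becomes json or jsonb, since the database would then reject YAML text; in that case the fixtures would have to be loaded as JSON.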

#13 Updated by Tom Clegg 19 days ago

  • Status changed from New to In Progress
  • Target version changed from 2017-07-05 sprint to 2017-07-19 sprint
  • Story points changed from 2.0 to 0.5

#14 Updated by Nico César 18 days ago

This is a deploy in su92l:

su92l:~# tail -F /tmp/pupp*
Reading package lists...
Building dependency tree...
Reading state information...
The following packages will be upgraded:
  arvados-api-server
1 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
Need to get 27.5 MB of archives.
After this operation, 75.8 kB of additional disk space will be used.
Get:1 http://apt.arvados.org xenial/main amd64 arvados-api-server amd64 0.1.20170705132428.ad77601-8 [27.5 MB]
Fetched 27.5 MB in 1s (18.0 MB/s)
(Reading database ... 84880 files and directories currently installed.)
Preparing to unpack .../arvados-api-server_0.1.20170705132428.ad77601-8_amd64.deb ...
Unpacking arvados-api-server (0.1.20170705132428.ad77601-8) over (0.1.20170503164843.1885991-7) ...
Setting up arvados-api-server (0.1.20170705132428.ad77601-8) ...

Assumption: nginx is configured to serve Rails from
            /var/www/arvados-api/current
Assumption: nginx and passenger run as www-data

Creating symlinks to configuration in /etc/arvados/api ...... done.
Running bundle install... done.
Ensuring directory and file permissions ...... done.
Running db:migrate...Defaulting to memory cache, because /var/www/arvados-api/current/tmp/cache owner (uid=33) is not me (uid=0)
DEPRECATION WARNING: The configuration option `config.serve_static_assets` has been renamed to `config.serve_static_files` to clarify its role (it merely enables serving everything in the `public` folder and is unrelated to the asset pipeline). The `serve_static_assets` alias will be removed in Rails 5.0. Please migrate your configuration files accordingly. (called from block in <top (required)> at /var/www/arvados-api/current/config/environments/production.rb:12)
DEPRECATION WARNING: You did not specify a `log_level` in `production.rb`. Currently, the default value for `log_level` is `:info` for the production environment and `:debug` in all other environments. In Rails 5 the default value will be unified to `:debug` across all environments. To preserve the current setting, add the following line to your `production.rb`:

   config.log_level = :info

. (called from block in tsort_each at /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/tsort.rb:228)
Called 'load' without the :safe option -- defaulting to safe mode.
You can avoid this warning in the future by setting the SafeYAML::OPTIONS[:default_mode] option (to :safe or :unsafe).
== 20170628185847 JobsYamlToJson: migrating ===================================
== 20170628185847 JobsYamlToJson: migrated (73.9702s) =========================

 done.
Checking application.yml for completeness...Defaulting to memory cache, because /var/www/arvados-api/current/tmp/cache owner (uid=33) is not me (uid=0)
DEPRECATION WARNING: The configuration option `config.serve_static_assets` has been renamed to `config.serve_static_files` to clarify its role (it merely enables serving everything in the `public` folder and is unrelated to the asset pipeline). The `serve_static_assets` alias will be removed in Rails 5.0. Please migrate your configuration files accordingly. (called from block in <top (required)> at /var/www/arvados-api/current/config/environments/production.rb:12)
DEPRECATION WARNING: You did not specify a `log_level` in `production.rb`. Currently, the default value for `log_level` is `:info` for the production environment and `:debug` in all other environments. In Rails 5 the default value will be unified to `:debug` across all environments. To preserve the current setting, add the following line to your `production.rb`:

   config.log_level = :info

. (called from block in tsort_each at /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/tsort.rb:228)
Called 'load' without the :safe option -- defaulting to safe mode.
You can avoid this warning in the future by setting the SafeYAML::OPTIONS[:default_mode] option (to :safe or :unsafe).
AppVersion (discovered)          ad77601-8 
action_controller.perform_caching true
admin_notifier_email_from        support@curoverse.com
arvados_docsite                  https://doc.arvados.org
arvados_theme                    default
assets.compile                   false
assets.compress                  true
...

#15 Updated by Nico César 18 days ago

and qr1hi...

== 20170628185847 JobsYamlToJson: migrating ===================================
== 20170628185847 JobsYamlToJson: migrated (269.9086s) ========================

#16 Updated by Tom Clegg 5 days ago

  • Status changed from In Progress to Resolved
