Idea #11908

closed

Migrate Collections.properties to JSONB

Added by Tom Morris over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: API
Target version:
Start date: 06/27/2017
Due date:
Story points: 0.5

Description

Handle (deserialize) both YAML and JSON and convert column type to JSONB
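
A minimal sketch of what "handle both YAML and JSON" could look like while rows in both formats still exist. This is not the code from the branch: `load_properties` is a hypothetical helper, and the rule "a value starting with `{` is JSON, anything else is YAML" is an assumption made for illustration.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper (not the Arvados implementation): deserialize a
# stored properties value that may be either JSON or legacy YAML.
def load_properties(raw)
  # Treat NULL and blank values as an empty properties hash.
  return {} if raw.nil? || raw.strip.empty?

  text = raw.lstrip
  if text.start_with?('{')
    # Assumption: a serialized hash that starts with "{" is JSON.
    JSON.parse(text)
  else
    # Anything else is assumed to be legacy YAML. safe_load returns nil
    # for an empty document, so fall back to an empty hash.
    YAML.safe_load(text) || {}
  end
end

puts load_properties('{"tag":"demo"}').inspect
puts load_properties("---\ntag: demo\n").inspect
```

`YAML.safe_load` rejects YAML that carries Ruby-specific object tags, so rows serialized that way would need extra handling that this sketch leaves out.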


Subtasks: 1 (0 open, 1 closed)

Task #11933: Review 11908-properties-column-json (Resolved, 06/27/2017)

Related issues: 3 (0 open, 3 closed)

Related to Arvados - Idea #11884: Convert Collection properties column type to JSONB (Duplicate)
Blocked by Arvados - Idea #11807: [API] Migrate old serialized database content from YAML to JSON (Resolved, Tom Clegg, 06/05/2017)
Blocks Arvados - Idea #4019: [API] Support query of "properties" field on objects (Resolved, Peter Amstutz, 12/12/2017)
Actions #1

Updated by Tom Morris over 7 years ago

  • Description updated (diff)
  • Target version set to Arvados Future Sprints
  • Story points set to 2.0
Actions #2

Updated by Tom Morris over 7 years ago

  • Target version changed from Arvados Future Sprints to 2017-07-19 sprint
Actions #3

Updated by Tom Morris over 7 years ago

  • Assigned To set to Tom Clegg
Actions #4

Updated by Tom Clegg over 7 years ago

  • Status changed from New to In Progress

Changing the column type in-place is trivial ("alter table foo alter column bar type jsonb using bar::jsonb"). Unfortunately, the fulltext index includes the old column so it gets dropped in the process, and has to be rebuilt.

The good news is that I ran into the fulltext index migration bug (existing index not detected, migration fails) and fixed it.

ActiveRecord::StatementInvalid: PG::DuplicateTable: ERROR:  relation "collections_full_text_search_idx" already exists
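
For illustration, the order of operations described above can be sketched as a list of SQL statements. This is not the migration from the branch: `jsonb_migration_sql` is a hypothetical helper, the index expression is a placeholder (the real index definition is not shown in this ticket), and using `DROP INDEX IF EXISTS` to avoid the duplicate-relation error is an assumption about one way to handle it, not a description of the actual fix.

```ruby
# Hypothetical helper: returns, in order, the SQL statements for converting
# a text column to jsonb when a fulltext index depends on that column.
def jsonb_migration_sql(table:, column:, index:, index_expr:)
  [
    # Drop the index explicitly; IF EXISTS makes this safe whether or not
    # the index is already present.
    "DROP INDEX IF EXISTS #{index}",
    # In-place type change, as quoted in the comment above.
    "ALTER TABLE #{table} ALTER COLUMN #{column} TYPE jsonb USING #{column}::jsonb",
    # Rebuild the fulltext index. index_expr is a placeholder expression.
    "CREATE INDEX #{index} ON #{table} USING gin(to_tsvector('english', #{index_expr}))",
  ]
end

statements = jsonb_migration_sql(
  table: 'collections',
  column: 'properties',
  index: 'collections_full_text_search_idx',
  index_expr: "coalesce(properties::text, '')" # placeholder, not the real definition
)
statements.each { |sql| puts "#{sql};" }
```

In a Rails migration, each statement would typically be passed to `execute`. Rebuilding the index is the step whose cost grows with table size.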

11908-properties-column-json @ bb821d03eb10ddcc7822fac51a565d1a11082ebc

Actions #5

Updated by Radhika Chippada over 7 years ago

  • Do we not need to drop the full text index on collection before changing column type back in down migration?
  • Do we not want to update workflows -> definition column?
Actions #6

Updated by Tom Clegg over 7 years ago

Radhika Chippada wrote:

  • Do we not need to drop the full text index on collection before changing column type back in down migration?

On my system the down-migration worked without dropping/recreating the index. I added a comment to the up-migration with the postgresql error it avoids.

  • Do we not want to update workflows -> definition column?

Not here/now (this is just collections.properties, to support tags) and perhaps not ever (IIRC we decided to store literal YAML there, instead of using a serialized field, in order to preserve key order, formatting, comments).

11908-properties-column-json @ ebc65675cecdf25ca11a86f789bfb23b600875b8

Actions #7

Updated by Radhika Chippada over 7 years ago

Tom Clegg wrote:

On my system the down-migration worked without dropping/recreating the index. I added a comment to the up-migration with the postgresql error it avoids.

Down migration worked for me as well. I also switched to master branch after down migration and added tags to a collection and everything worked fine.

LGTM

Actions #8

Updated by Tom Clegg over 7 years ago

  • Category set to API
  • Target version changed from 2017-07-19 sprint to Arvados Future Sprints
  • Story points changed from 2.0 to 0.5

Merge is blocked on PostgreSQL 9.4 dependency.

Actions #9

Updated by Tom Morris about 7 years ago

  • Target version changed from Arvados Future Sprints to 2017-11-08 Sprint
Actions #10

Updated by Tom Clegg about 7 years ago

  • Target version changed from 2017-11-08 Sprint to 2017-11-22 Sprint
Actions #11

Updated by Tom Clegg about 7 years ago

As a reminder, in July #11807 did (what we expect to be) the slowest part of the yaml-to-json migration, i.e., the jobs table, and that took 3.5 minutes on qr1hi.

This branch does a smaller part of the yaml-to-json migration. It also changes the type of a single column (collections.properties), which is null in most rows, and then has to regenerate the fulltext index. We don't have timing estimates for that yet.

Actions #12

Updated by Tom Clegg about 7 years ago

  • Target version changed from 2017-11-22 Sprint to 2017-12-06 Sprint
Actions #13

Updated by Tom Clegg about 7 years ago

  • Target version changed from 2017-12-06 Sprint to 2017-12-20 Sprint
Actions #14

Updated by Peter Amstutz about 7 years ago

  • Related to deleted (Idea #4019: [API] Support query of "properties" field on objects)
Actions #15

Updated by Peter Amstutz about 7 years ago

  • Blocks Idea #4019: [API] Support query of "properties" field on objects added
Actions #16

Updated by Tom Clegg about 7 years ago

Rebased.

11908-properties-column-json @ 2bd1ff1d4885786bf3ed838bcefb7198fcab6bc7

Actions #17

Updated by Tom Clegg about 7 years ago

  • Status changed from In Progress to Resolved