Story #11908

Migrate Collections.properties to JSONB

Added by Tom Morris 7 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Start date:
06/27/2017
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

Handle (deserialize) both YAML and JSON and convert column type to JSONB


Subtasks

Task #11933: Review 11908-properties-column-jsonResolved


Related issues

Related to Arvados - Story #11884: Convert Collection properties column type to JSONBDuplicate

Blocked by Arvados - Story #11807: [API] Migrate old serialized database content from YAML to JSONResolved2017-06-05

Blocks Arvados - Story #4019: [API] Support query of "properties" field on objectsResolved2017-12-12

Associated revisions

Revision 8f987a92
Added by Tom Clegg about 1 month ago

Merge branch '11908-properties-column-json'

refs #11908

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Morris 7 months ago

  • Description updated (diff)
  • Target version set to Arvados Future Sprints
  • Story points set to 2.0

#2 Updated by Tom Morris 7 months ago

  • Target version changed from Arvados Future Sprints to 2017-07-19 sprint

#3 Updated by Tom Morris 7 months ago

  • Assigned To set to Tom Clegg

#4 Updated by Tom Clegg 7 months ago

  • Status changed from New to In Progress

Changing the column type in-place is trivial ("alter table foo alter column bar type jsonb using bar::jsonb"). Unfortunately, the fulltext index includes the old column so it gets dropped in the process, and has to be rebuilt.

The good news is that I ran into the fulltext index migration bug (existing index not detected, migration fails) and fixed it.

ActiveRecord::StatementInvalid: PG::DuplicateTable: ERROR:  relation "collections_full_text_search_idx" already exists

11908-properties-column-json @ bb821d03eb10ddcc7822fac51a565d1a11082ebc

#5 Updated by Radhika Chippada 7 months ago

  • Do we not need to drop the full text index on collection before changing column type back in down migration?
  • Do we not want to update workflows -> definition column?

#6 Updated by Tom Clegg 6 months ago

Radhika Chippada wrote:

  • Do we not need to drop the full text index on collection before changing column type back in down migration?

On my system the down-migration worked without dropping/recreating the index. I added a comment to the up-migration with the postgresql error it avoids.

  • Do we not want to update workflows -> definition column?

Not here/now (this is just collections.properties, to support tags) and perhaps not ever (IIRC we decided to store literal YAML there, instead of using a serialized field, in order to preserve key order, formatting, comments).

11908-properties-column-json @ ebc65675cecdf25ca11a86f789bfb23b600875b8

#7 Updated by Radhika Chippada 6 months ago

On my system the down-migration worked without dropping/recreating the index. I added a comment to the up-migration with the postgresql error it avoids.

Down migration worked for me as well. I also switched to master branch after down migration and added tags to a collection and everything worked fine.

LGTM

#8 Updated by Tom Clegg 6 months ago

  • Category set to API
  • Target version changed from 2017-07-19 sprint to Arvados Future Sprints
  • Story points changed from 2.0 to 0.5

Merge is blocked on PostgreSQL 9.4 dependency.

#9 Updated by Tom Morris 3 months ago

  • Target version changed from Arvados Future Sprints to 2017-11-08 Sprint

#10 Updated by Tom Clegg 2 months ago

  • Target version changed from 2017-11-08 Sprint to 2017-11-22 Sprint

#11 Updated by Tom Clegg 2 months ago

As a reminder, in July #11807 did (what we expect to be) the slowest part of the yaml-to-json migration, i.e., the jobs table, and that took 3.5 minutes on qr1hi.

This branch does a smaller part of the yaml-to-json migration. It also changes the column type of a single column (collections.properties) that is typically null in most rows, and then it has to regenerate the fulltext index. We don't having timing estimates for that yet.

#12 Updated by Tom Clegg about 2 months ago

  • Target version changed from 2017-11-22 Sprint to 2017-12-06 Sprint

#13 Updated by Tom Clegg about 1 month ago

  • Target version changed from 2017-12-06 Sprint to 2017-12-20 Sprint

#14 Updated by Peter Amstutz about 1 month ago

  • Related to deleted (Story #4019: [API] Support query of "properties" field on objects)

#15 Updated by Peter Amstutz about 1 month ago

  • Blocks Story #4019: [API] Support query of "properties" field on objects added

#16 Updated by Tom Clegg about 1 month ago

Rebased.

11908-properties-column-json @ 2bd1ff1d4885786bf3ed838bcefb7198fcab6bc7

#17 Updated by Tom Clegg about 1 month ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF