Collection version history

(draft)

Background: desired API features (#13494, #13561) epic ticket is #13109
  • Get current + previous versions of a collection with a single API call
  • Search all current + previous versions of collections with a single API call
  • Stable small numeric "version number" for each version of a given collection

Design

  • Store all versions in the collection table (much easier to do paging)
  • New database column "current_version_uuid" for latest version this the same as "uuid"
  • New database column indicates user-friendly version number (starting at 1) -- can be assigned safely after locking "current" row for update
  • Assign version number to the current version -- this way {uuid,version} can be used as a permanent reference to a specific version
  • Flag "include_past_versions" for "list" API to include old versions (otherwise, filter on current_version_uuid=uuid -- might need to adjust indexes to maintain performance)
  • Flag in site configuration to enable preserving old versions in database
  • Update owner_uuid of all old versions (in a transaction) when it changes in the current version
  • A permission link to the current version only implies permission to the most recent version (not the history)
  • Prohibit permission links to old versions (should be handled by check "can only make links to things that are visible to you", which should not include old versions)
  • Old versions may not be modified except for certain fields which must be synchronized with "current" version changes (owner_uuid, trash_at, delete_at). Updating owner_uuid, trash_at, delete_at does not update modified_at so modified_at shows when the version was created
  • Old versions with duplicate "name" does not conflict with (owner_uuid, name) uniqueness constraint
  • The following changes introduce a new version: manifest_text (portable_data_hash), description, properties, name
  • The following changes do not introduce a new version, and are copied to all past version records: replication_*, storage_classes_*, trash_at/delete_at/is_trashed, owner_uuid, uuid (update current_uuid to retain database consistency)
  • New collection operation "truncate history" which deletes all versions older than T (where T is a date or a version number?)
  • New database column "preserve_version" defaults to false, and can be set to true by a create or update API call. If an update request does not result in a new version, and preserve_version==true before the update, then preserve_version==true after the update (regardless of any value given in the update request).
  • New configuration option "preserve_version_if_idle": 0=immediate, 600=10 minutes, -1=never.
  • If the existing version is more than N (site-configurable) seconds old, or has preserve_version==true, then it is retained in the version history.

Example: given preserve_version_if_idle=600:

time operation versions in DB comment
1 create collection current
10000 update manifest_text 1, current
10001 update manifest_text 1, current
10002 update preserve_version=true 1, current
10003 update replication_desired=1 1, current
10004 update manifest_text 1, 2, current
10005 update manifest_text 1, 2, current
10006 update manifest_text, preserve_version=true 1, 2, current
10007 update manifest_text, preserve_version=true 1, 2, 3, current
10007 update preserve_version=false 1, 2, 3, current ignored; preserve_version is still true
10008 update manifest_text 1, 2, 3, 4, current
10009 update manifest_text 1, 2, 3, 4, current
20000 update manifest_text 1, 2, 3, 4, 5, current