Story #3411

[API] Implement Trash behavior using collection expiration

Added by Misha Zatsman about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Tim Pierce
Category:
-
Target version:
Start date:
07/29/2014
Due date:
% Done:

100%

Estimated time:
(Total: 13.00 h)
Story points:
1.0

Description

A collection can have two different kinds of permanence: ephemeral or persisted.

When a user creates a collection, they will need to specify which kind permanence they want.

Each collection has an expiration date but persisted collections have an expiration date of "never" or null or however it's easiest to represent.

There should be a systemwide setting for "ephemeral lifetime" and it should be set to two weeks by default. Changing this value to less than 24 hours should result in an error. The api server should return this value to callers when requested, because the data manager will need to know it before deleting blocks.

When a collection is created as ephemeral, its expiration date should be set to now + "ephemeral lifetime". Likewise, if a collection's permanence is switched from to ephemeral, its expiration date should be set to now + "ephemeral lifetime".

When the api server returns collections for any request, it should only return collections whose expiration date is in the future or never. The api server should support a flag in requests that will enable it to also return expired collections.

For more information on Ephemeral Collections, see:

https://arvados.org/projects/orvos-private/wiki/Keep_Design_Doc#Persisting-Data

Implementation notes

First branch:
  • API server collections table column: delete_at (default is null) expires_at column got added in #3036
  • API server configuration setting: default_trash_lifetime
  • API server discovery document entry: defaultTrashLifetime
  • API server implicitly selects expires_at is null or expires_at > now when querying collections table (rails "default scope" can do this)
Next branch [fell off sprint, moved to #3900]:
  • Workbench "trash" button on collections that sets expires_at to now + default_trash_lifetime (default retrieved in discovery document).
  • Workbench "trash" tab in project view that shows trashed collections (with "un-trash" button)

See also #3150


Subtasks

Task #3739: Add default_trash_lifetimeResolvedTim Pierce

Task #3741: API server filters for trashed objectsResolvedTim Pierce

Task #3883: Review 3411-expire-collectionsResolvedTom Clegg


Related issues

Related to Arvados - Feature #3900: [Workbench] Trash button on collection uses "delete" API instead of setting expires_at/trash_atResolved02/10/2017

Related to Arvados - Story #3150: [API] Add "trash" behavior for all object types, instead of literal rdbms "delete"Rejected07/04/2014

Blocks Arvados - Story #3408: [Keep] Implement Production Data ManagerResolved07/29/2014

Associated revisions

Revision 517d3fca
Added by Tim Pierce almost 6 years ago

Merge branch '3411-expire-collections'

Closes #3411.

Revision 21cdd100
Added by Tim Pierce almost 6 years ago

Merge branch '3411-expire-collections'

Refs #3411.

History

#1 Updated by Ward Vandewege about 6 years ago

  • Target version set to 2014-08-27 Sprint

#2 Updated by Ward Vandewege about 6 years ago

  • Subject changed from Collection expiration to [API] Collection expiration

#3 Updated by Ward Vandewege about 6 years ago

  • Story points set to 3.0

#4 Updated by Peter Amstutz about 6 years ago

  • Assigned To set to Peter Amstutz

#5 Updated by Peter Amstutz about 6 years ago

  • Story points changed from 3.0 to 1.0

#6 Updated by Peter Amstutz about 6 years ago

  • Story points changed from 1.0 to 3.0

Needs to be exhaustively tested

#7 Updated by Tom Clegg about 6 years ago

  • Target version changed from 2014-08-27 Sprint to Arvados Future Sprints

#8 Updated by Peter Amstutz about 6 years ago

  • Assigned To deleted (Peter Amstutz)

#9 Updated by Tom Clegg about 6 years ago

  • Subject changed from [API] Collection expiration to [API] Implement Trash behavior using collection expiration
  • Description updated (diff)
  • Story points changed from 3.0 to 1.0

#10 Updated by Ward Vandewege almost 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint

#11 Updated by Tim Pierce almost 6 years ago

  • Assigned To set to Tim Pierce

#12 Updated by Tim Pierce almost 6 years ago

  • Story points changed from 1.0 to 2.0

#13 Updated by Tom Clegg almost 6 years ago

  • Description updated (diff)

#14 Updated by Tom Clegg almost 6 years ago

  • Description updated (diff)

#15 Updated by Tom Clegg almost 6 years ago

In services/api/app/models/collection.rb
  • I think the lambda style is superfluous here since the where() doesn't change over time -- i.e., it would work equally well to do this?
    • default_scope where("expires_at IS NULL or expires_at > CURRENT_TIMESTAMP")
In services/api/test/fixtures/collections.yml
  • We should get into the habit of using a user or project ID as owner_uuid of a collection (the way collections work in daily life now) instead of using owner_uuid=root and a permission link (the way they used to work in daily life before #3036).
In services/api/test/functional/arvados/v1/collections_controller_test.rb
  • Do these tests really need permit_unsigned_manifests?
  • Should also test whether not-yet-expired and expired collections can be retrieved with get :show, { uuid: blah } (should and shouldn't, respectively)
  • Ditto calling "update" on a not-yet-expired collection (should work), on an expired collection (should 404)

I don't see where we do "Changing this value to less than 24 hours should result in an error" but I'm also not sure about the best place to do that. Perhaps in services/api/lib/tasks/config_check.rake? (That gets run during deploy scripts, so config errors can be caught before even trying to start the application.)

#16 Updated by Tim Pierce almost 6 years ago

Ready for another look at 9d18764

Tom Clegg wrote:

In services/api/app/models/collection.rb
  • I think the lambda style is superfluous here since the where() doesn't change over time -- i.e., it would work equally well to do this?
    • [...]

Yes, you're right. (The lambda was left over from doing #{Time.now} until I realized we could just use CURRENT_TIMESTAMP)

In services/api/test/fixtures/collections.yml
  • We should get into the habit of using a user or project ID as owner_uuid of a collection (the way collections work in daily life now) instead of using owner_uuid=root and a permission link (the way they used to work in daily life before #3036).

Fixed, and dropped the unnecessary permission links. That's much much nicer.

In services/api/test/functional/arvados/v1/collections_controller_test.rb
  • Do these tests really need permit_unsigned_manifests?

No, turns out they don't. That was me trying to be conservative about the test conditions.

  • Should also test whether not-yet-expired and expired collections can be retrieved with get :show, { uuid: blah } (should and shouldn't, respectively)
  • Ditto calling "update" on a not-yet-expired collection (should work), on an expired collection (should 404)

Added test clauses for these.

I don't see where we do "Changing this value to less than 24 hours should result in an error" but I'm also not sure about the best place to do that. Perhaps in services/api/lib/tasks/config_check.rake? (That gets run during deploy scripts, so config errors can be caught before even trying to start the application.)

Ah, I forgot that completely. Thanks for your help on this: rake config:check now raises an exception if default_trash_lifetime is set to anything less than 86400.

#17 Updated by Tom Clegg almost 6 years ago

Tim Pierce wrote:

  • Should also test whether not-yet-expired and expired collections can be retrieved with get :show, { uuid: blah } (should and shouldn't, respectively)
  • Ditto calling "update" on a not-yet-expired collection (should work), on an expired collection (should 404)

Added test clauses for these.

I think these tests need to be written as separate test cases. (Functional tests reuse the controller object if you call multiple actions from one test case, which is not what happens in real life and is very likely to do/test the wrong thing.)

Ah, I forgot that completely. Thanks for your help on this: rake config:check now raises an exception if default_trash_lifetime is set to anything less than 86400.

I think it would be nicer to call Rails.configuration.default_trash_lifetime instead of $application_config['...'] here.

Everything else looks good. Thanks.

#18 Updated by Tim Pierce almost 6 years ago

New revision at 0a5f8b6

Tom Clegg wrote:

Tim Pierce wrote:

  • Should also test whether not-yet-expired and expired collections can be retrieved with get :show, { uuid: blah } (should and shouldn't, respectively)
  • Ditto calling "update" on a not-yet-expired collection (should work), on an expired collection (should 404)

Added test clauses for these.

I think these tests need to be written as separate test cases. (Functional tests reuse the controller object if you call multiple actions from one test case, which is not what happens in real life and is very likely to do/test the wrong thing.)

That's useful to know -- I was never sure where the appropriate boundary was for functional tests.

Ah, I forgot that completely. Thanks for your help on this: rake config:check now raises an exception if default_trash_lifetime is set to anything less than 86400.

I think it would be nicer to call Rails.configuration.default_trash_lifetime instead of $application_config['...'] here.

OK.

#19 Updated by Tom Clegg almost 6 years ago

LGTM, thanks!

#20 Updated by Tim Pierce almost 6 years ago

  • Status changed from New to Resolved
  • % Done changed from 57 to 100

Applied in changeset arvados|commit:517d3fca54225873d36f94083f3b7056ce271f46.

#21 Updated by Tom Clegg almost 6 years ago

  • Description updated (diff)
  • Story points changed from 2.0 to 1.0

Also available in: Atom PDF