Story #9277

[Crunch2] System-owned container outputs should be garbage-collected

Added by Peter Amstutz about 1 year ago. Updated 4 months ago.

Status:ResolvedStart date:02/16/2017
Priority:NormalDue date:
Assignee:Peter Amstutz% Done:

100%

Category:API
Target version:2017-03-01 sprint
Story points0.5Remaining (hours)0.00 hour
Velocity based estimate-
ReleaseCrunch v2

Description

Background

When a container finishes, crunch-run creates an output collection and records its PDH in the container record.

On the API server, the container update triggers a hook that creates one copy of the output collection for each of N container requests that have been waiting on this container.

The original output collection is owned by root and there is no cleanup process has trash_at set to now+defaultTrashLifetime.

Proposed fix

In crunch-run, create the output collection with trash_at=now.

In the API server hook that creates a collection for each relevant container request, when looking up the container output manifest, make sure to include trashed collections in the search.


Subtasks

Task #11107: Review 9277-trash-container-outputsResolvedRadhika Chippada

Task #11164: Review 9277-container-outputResolvedPeter Amstutz


Related issues

Related to Arvados - Story #9278: [Crunch2] Document/fix handling of collections with non-n... In Progress 06/01/2016
Related to Arvados - Feature #11100: [CWL] Intermediary collection handling can be specified Resolved 03/30/2017
Duplicated by Arvados - Feature #11064: [Crunch2] crunch-run expires output collection immediately Duplicate

Associated revisions

Revision 21598295
Added by Peter Amstutz 4 months ago

Merge branch '9277-trash-container-outputs' closes #9277

Revision cca78a11
Added by Peter Amstutz 4 months ago

Merge branch '9277-container-output' closes #9277

History

#1 Updated by Peter Amstutz about 1 year ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 7 months ago

  • Subject changed from [Crunch2] Behavior for output collections to [Crunch2] Intermediate output collections have expiry set
  • Description updated (diff)

#3 Updated by Tom Clegg 5 months ago

  • Description updated (diff)
  • Category set to API

#4 Updated by Tom Clegg 5 months ago

  • Subject changed from [Crunch2] Intermediate output collections have expiry set to [Crunch2] System-owned container outputs should be garbage-collected

#5 Updated by Tom Morris 4 months ago

  • Target version set to 2017-03-01 sprint
  • Story points set to 0.5

#6 Updated by Peter Amstutz 4 months ago

  • Assignee set to Peter Amstutz

#7 Updated by Tom Clegg 4 months ago

  • Description updated (diff)

#8 Updated by Peter Amstutz 4 months ago

Even better than trash_at=now, it looks like we can just set is_trashed: true

9277-trash-container-outputs @ 72873affa7f249faa16d5d21200e935d27aea911

#9 Updated by Peter Amstutz 4 months ago

  • Status changed from New to In Progress

#10 Updated by Radhika Chippada 4 months ago

LGTM

#11 Updated by Peter Amstutz 4 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:21598295f38998d8028aaa117f192de6b5758808.

#12 Updated by Peter Amstutz 4 months ago

  • Status changed from Resolved to Feedback
  def validate_output
    # Output must exist and be readable by the current user.  This is so
    # that a container cannot "claim" a collection that it doesn't otherwise
    # have access to just by setting the output field to the collection PDH.
    if output_changed?
      c = Collection.
          readable_by(current_user).
          where(portable_data_hash: self.output).
          first
      if !c
        errors.add :output, "collection must exist and be readable by current user." 
      end
    end
  end

This also needs to be unscoped.

#13 Updated by Peter Amstutz 4 months ago

Additional fixes on 9277-container-output

Tests:

https://ci.curoverse.com/job/developer-run-tests/178/

#14 Updated by Radhika Chippada 4 months ago

It would be nice (if it doesn't already exist) to add a test such as "error setting output on running container if trashed collection is not readable". Thanks.

LGTM

#15 Updated by Peter Amstutz 4 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 50 to 100

Applied in changeset arvados|commit:cca78a112d6a2aa4f76c1956cfe8ec2d43a68759.

Also available in: Atom PDF