Project

General

Profile

Actions

Bug #18690

closed

secret files get included in output collection when using tmp collection mount as output directory

Added by Peter Amstutz about 2 years ago. Updated about 2 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Story points:
-
Release relationship:
Auto

Description

I have this tool that downloads a file from S3, with credentials.

class: CommandLineTool
cwlVersion: v1.2

$namespaces:
  arv: "http://arvados.org/cwl#" 
  cwltool: "http://commonwl.org/cwltool#" 

inputs:
  s3url: string
  aws_access_key_id: string
  aws_secret_access_key: string

requirements:
  InlineJavascriptRequirement: {}
  NetworkAccess:
    networkAccess: true
  DockerRequirement:
    dockerPull: amazon/aws-cli
  InitialWorkDirRequirement:
    listing:
      - entryname: .aws/credentials
        entry: |
          [default]
          aws_access_key_id=$(inputs.aws_access_key_id)
          aws_secret_access_key=$(inputs.aws_secret_access_key)

hints:
  cwltool:Secrets:
    secrets: [aws_access_key_id, aws_secret_access_key]

arguments: ["s3", "cp", $(inputs.s3url), $(inputs.s3url.split('/').pop())]

outputs:
  file:
    type: File
    outputBinding:
      glob: $(inputs.s3url.split('/').pop())

For output to a regular directory, the secret file is correctly removed/ignored from the upload.

However, since this is a downloading process, it is especially useful to stream into Keep using a collection mount as the output directory. I can add this requirement:

arv:RuntimeConstraints:
    outputDirType: keep_output_dir

This exposes the bug, with "keep_output_dir" the ".aws/credentials" file gets included in the output.

The "keep_output_dir" is implemented by setting a mount in crunch this way:

{
  "mounts": {
    "/output": {
      "kind": "collection",
      "writable": true
    }
  }
}

In crunch-run, secret mounts are are file literal types "text" or "json" which are written to the file system during container setup.

The "copier" which uploads files knows that it should ignore secrets.

Output to keep doesn't use copier.

Need to use some other strategy.

  1. First option, file needs to be removed from the collection manifest before it gets committed. Note: this isn't perfect. The secret is still probably getting written to a keep block, which means it will hang around a while before getting garbage collected. For a lot of cases, it would be preferable to avoid using files to store credentials at all, e.g. use secret environment instead #18689
  2. Second option, store the secret files somewhere outside the container, and bind mount them into place. This is likely to end up producing empty placeholder files in the collection (which ideally still need to be removed), but has the advantage that it doesn't leak the contents in keep blocks.

Subtasks 1 (0 open1 closed)

Task #18710: Review 18690-secret-filesResolvedWard Vandewege02/11/2022Actions

Related issues

Related to Arvados - Feature #18689: support secret_environmentNewActions
Actions #1

Updated by Peter Amstutz about 2 years ago

  • Target version set to 2022-02-16 sprint
Actions #2

Updated by Peter Amstutz about 2 years ago

Actions #3

Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)
Actions #4

Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz about 2 years ago

  • Release set to 46
Actions #6

Updated by Peter Amstutz about 2 years ago

  • Assigned To set to Tom Clegg
Actions #7

Updated by Tom Clegg about 2 years ago

  • Status changed from New to In Progress

18690-secret-files @ 01877c7b5119d29384490adceb862da30b4e81f5 --

Use an additional bind mount for a secret file when it's inside a fuse-mounted output collection.

Maintain old behavior (copy secrets into staging dir) when output dir is a staging dir.

Actions #8

Updated by Ward Vandewege about 2 years ago

Tom Clegg wrote:

18690-secret-files @ 01877c7b5119d29384490adceb862da30b4e81f5 --

Use an additional bind mount for a secret file when it's inside a fuse-mounted output collection.

Maintain old behavior (copy secrets into staging dir) when output dir is a staging dir.

LGTM, thanks!

Actions #9

Updated by Tom Clegg about 2 years ago

  • Status changed from In Progress to Resolved
Actions #10

Updated by Peter Amstutz about 2 years ago

  • Release changed from 46 to 49
Actions

Also available in: Atom PDF