Bug #18690
closedsecret files get included in output collection when using tmp collection mount as output directory
Description
I have this tool that downloads a file from S3, with credentials.
class: CommandLineTool cwlVersion: v1.2 $namespaces: arv: "http://arvados.org/cwl#" cwltool: "http://commonwl.org/cwltool#" inputs: s3url: string aws_access_key_id: string aws_secret_access_key: string requirements: InlineJavascriptRequirement: {} NetworkAccess: networkAccess: true DockerRequirement: dockerPull: amazon/aws-cli InitialWorkDirRequirement: listing: - entryname: .aws/credentials entry: | [default] aws_access_key_id=$(inputs.aws_access_key_id) aws_secret_access_key=$(inputs.aws_secret_access_key) hints: cwltool:Secrets: secrets: [aws_access_key_id, aws_secret_access_key] arguments: ["s3", "cp", $(inputs.s3url), $(inputs.s3url.split('/').pop())] outputs: file: type: File outputBinding: glob: $(inputs.s3url.split('/').pop())
For output to a regular directory, the secret file is correctly removed/ignored from the upload.
However, since this is a downloading process, it is especially useful to stream into Keep using a collection mount as the output directory. I can add this requirement:
arv:RuntimeConstraints: outputDirType: keep_output_dir
This exposes the bug, with "keep_output_dir" the ".aws/credentials" file gets included in the output.
The "keep_output_dir" is implemented by setting a mount in crunch this way:
{ "mounts": { "/output": { "kind": "collection", "writable": true } } }
In crunch-run, secret mounts are are file literal types "text" or "json" which are written to the file system during container setup.
The "copier" which uploads files knows that it should ignore secrets.
Output to keep doesn't use copier.
Need to use some other strategy.
- First option, file needs to be removed from the collection manifest before it gets committed. Note: this isn't perfect. The secret is still probably getting written to a keep block, which means it will hang around a while before getting garbage collected. For a lot of cases, it would be preferable to avoid using files to store credentials at all, e.g. use secret environment instead #18689
- Second option, store the secret files somewhere outside the container, and bind mount them into place. This is likely to end up producing empty placeholder files in the collection (which ideally still need to be removed), but has the advantage that it doesn't leak the contents in keep blocks.
Related issues
Updated by Peter Amstutz almost 3 years ago
- Target version set to 2022-02-16 sprint
Updated by Peter Amstutz almost 3 years ago
- Related to Feature #18689: support secret_environment added
Updated by Tom Clegg almost 3 years ago
- Status changed from New to In Progress
18690-secret-files @ 01877c7b5119d29384490adceb862da30b4e81f5 --
Use an additional bind mount for a secret file when it's inside a fuse-mounted output collection.
Maintain old behavior (copy secrets into staging dir) when output dir is a staging dir.
Updated by Ward Vandewege almost 3 years ago
Tom Clegg wrote:
18690-secret-files @ 01877c7b5119d29384490adceb862da30b4e81f5 --
Use an additional bind mount for a secret file when it's inside a fuse-mounted output collection.
Maintain old behavior (copy secrets into staging dir) when output dir is a staging dir.
LGTM, thanks!
Updated by Tom Clegg almost 3 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|0101431170b0ce317d665822326eac0c66cd2632.