Bug #18690

Updated by Peter Amstutz about 2 years ago

I have this tool that downloads a file from S3, with credentials. 

 <pre> 
 class: CommandLineTool 
 cwlVersion: v1.2 

 $namespaces: 
   arv: "http://arvados.org/cwl#" 
   cwltool: "http://commonwl.org/cwltool#" 

 inputs: 
   s3url: string 
   aws_access_key_id: string 
   aws_secret_access_key: string 

 requirements: 
   InlineJavascriptRequirement: {} 
   NetworkAccess: 
     networkAccess: true 
   DockerRequirement: 
     dockerPull: amazon/aws-cli 
   InitialWorkDirRequirement: 
     listing: 
       - entryname: .aws/credentials 
         entry: | 
           [default] 
           aws_access_key_id=$(inputs.aws_access_key_id) 
           aws_secret_access_key=$(inputs.aws_secret_access_key) 

 hints: 
   cwltool:Secrets: 
     secrets: [aws_access_key_id, aws_secret_access_key] 

 arguments: ["s3", "cp", $(inputs.s3url), $(inputs.s3url.split('/').pop())] 

 outputs: 
   file: 
     type: File 
     outputBinding: 
       glob: $(inputs.s3url.split('/').pop()) 
 </pre> 

 For output to a regular directory, the secret file is correctly removed/ignored from the upload. 

 However, since this is a download step, it is especially useful to stream the output directly into Keep by using a collection mount as the output directory. I can add this requirement:

 <pre> 
 arv:RuntimeConstraints: 
     outputDirType: keep_output_dir 
 </pre> 

 This exposes the bug: with "keep_output_dir", the ".aws/credentials" file gets included in the output.

 The "keep_output_dir" is implemented by setting a mount in crunch this way: 

 <pre> 
 { 
   "mounts": { 
     "/output": { 
       "kind": "collection", 
       "writable": true 
     } 
   } 
 } 
 </pre> 

 In crunch-run, secret mounts are file literals of kind "text" or "json", which are written to the file system during container setup.

 The "copier" which uploads files knows that it should ignore secrets. 

 Output to Keep doesn't use the copier, so some other strategy is needed:

 # First option: remove the file from the collection manifest before it gets committed.

 Note: this isn't perfect. The secret is still probably written to a Keep block, which means it will hang around for a while before being garbage collected. In many cases it would be preferable to avoid using files to store credentials at all, e.g. by using a secret environment instead (#18689).
 # Second option: store the secret files somewhere outside the container and bind mount them into place. This is likely to produce empty placeholder files in the collection (which ideally would still need to be removed), but has the advantage that it doesn't leak the contents into Keep blocks.
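
To make the first option concrete, here is a minimal sketch of filtering secret paths out of a manifest before it is committed. It is illustration only: crunch-run is written in Go, and @strip_secret_paths@ and @normalize@ are hypothetical names, not existing functions. It assumes the usual manifest layout (one stream per line: stream name, block locators, then @position:size:filename@ tokens) and that file names contain no manifest escapes.

```python
import re

# A manifest line describes one stream:
#   <stream name> <block locator>... <position>:<size>:<file name>...
FILE_TOKEN = re.compile(r"^\d+:\d+:(.+)$")


def normalize(path):
    """Turn an output-relative path into the "./dir/file" form used below."""
    path = path.lstrip("/")
    return path if path.startswith("./") else "./" + path


def strip_secret_paths(manifest_text, secret_paths):
    """Return manifest_text with the file tokens for secret_paths removed.

    Hypothetical helper, not part of crunch-run.
    """
    # Assumption: names contain no manifest escapes such as \040.
    secrets = {normalize(p) for p in secret_paths}
    kept_lines = []
    for line in manifest_text.splitlines():
        tokens = line.split(" ")
        stream, rest = tokens[0], tokens[1:]
        blocks = [t for t in rest if not FILE_TOKEN.match(t)]
        files = [t for t in rest if FILE_TOKEN.match(t)]
        kept = [t for t in files
                if stream + "/" + FILE_TOKEN.match(t).group(1) not in secrets]
        # A stream with no remaining files is dropped entirely.
        if kept:
            kept_lines.append(" ".join([stream] + blocks + kept))
    return "".join(l + "\n" for l in kept_lines)


# Made-up manifest: one output file plus the secret credentials file.
manifest = (
    ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:data.bin\n"
    "./.aws 0cc175b9c0f1b6a831c399e269772661+40 0:40:credentials\n"
)
cleaned = strip_secret_paths(manifest, [".aws/credentials"])
```

This only drops the reference to the file. When a stream keeps other files, its block locators stay in the manifest, and in any case the block holding the secret may already have been written to Keep, which is the limitation described in the note above.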
