Idea #22680

open

Arvados manages persistent credentials to external resources and can provide them to a container

Added by Peter Amstutz 12 days ago. Updated 4 days ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version:
Start date:
Due date:
Story points: -

Description

We want containers with API access to be able to access AWS services when running on AWS. The driving use case is to enable tasks running within containers to natively access organizational S3 buckets (not the Arvados Keep buckets).

I believe what we want to do here is going to be something like this:

a) The compute node's AWS role includes permission to assume one or more other roles. Those other roles have the AWS permissions that should be available to the job inside the container.

b) crunch-run decides if the user is permitted to take on one of these AWS roles (however, an initial version of this might be 1 role for any user on the cluster)

c) crunch-run calls AssumeRole to get credentials for the container's role and gets back SessionToken, SecretAccessKey, and AccessKeyId.

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html

d) The credentials assigned to the container are passed in when launching the container -- probably using environment variables

$ export AWS_ACCESS_KEY_ID=ASIAIOSFODNN7EXAMPLE
$ export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
$ export AWS_SESSION_TOKEN=AQoDYXdzEJr...<remainder of session token>
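Steps (c) and (d) could be sketched roughly as follows. This is only an illustration, not how crunch-run does or will do it: `assume_container_role` is a hypothetical helper, and `sts_client` is assumed to be any object with a boto3-style `assume_role(RoleArn=..., RoleSessionName=...)` method whose response carries a `Credentials` mapping with `AccessKeyId`, `SecretAccessKey` and `SessionToken`.

```python
from typing import Any, Dict


def assume_container_role(sts_client: Any, role_arn: str, session_name: str) -> Dict[str, str]:
    """Call AssumeRole and map the result to container environment variables.

    Hypothetical helper: sts_client is assumed to behave like a boto3 STS
    client; role_arn and session_name are chosen by the caller.
    """
    resp = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = resp["Credentials"]
    # These are the variable names shown in the export lines above.
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
```

The returned mapping would be merged into the container's environment at launch. Credentials passed this way cannot be refreshed once the container is running, so they expire when the AssumeRole session does.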

Alternately, we could implement the container credential provider feature:

https://docs.aws.amazon.com/sdkref/latest/guide/feature-container-credentials.html

I think this is just an HTTP GET response with the same format as instance metadata (?), but since this is kind of an internal AWS API we'd have to do a little light reverse engineering to make sure we give it the correct response. The advantage is that it might allow for longer-lived credentials: each time the container calls the endpoint, crunch-run could make a new AssumeRole call to get refreshed credentials.
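A minimal sketch of what such an endpoint might look like, under assumptions that would need checking against the SDK documentation linked above: that the SDK expects a JSON body with `AccessKeyId`, `SecretAccessKey`, `Token` and `Expiration`, and that it sends the value of `AWS_CONTAINER_AUTHORIZATION_TOKEN` in the `Authorization` header. `credentials_body`, `make_handler` and `refresh` are hypothetical names, and `refresh` stands in for whatever makes the fresh AssumeRole call.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Callable, Dict


def credentials_body(creds: Dict[str, str]) -> bytes:
    """Encode AssumeRole-style credentials as the assumed endpoint response.

    Assumes creds["Expiration"] is already an RFC 3339 timestamp string.
    """
    return json.dumps({
        "AccessKeyId": creds["AccessKeyId"],
        "SecretAccessKey": creds["SecretAccessKey"],
        "Token": creds["SessionToken"],
        "Expiration": creds["Expiration"],
    }).encode("utf-8")


def make_handler(refresh: Callable[[], Dict[str, str]], auth_token: str):
    """Build a request handler that serves fresh credentials on each GET."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Reject callers that do not present the per-container token.
            if self.headers.get("Authorization") != auth_token:
                self.send_error(403)
                return
            # refresh() is where a new AssumeRole call would happen.
            body = credentials_body(refresh())
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler
```

The handler class could be passed to `http.server.HTTPServer` bound to an address reachable from the container, with the container's `AWS_CONTAINER_CREDENTIALS_FULL_URI` pointing at it. Which addresses the SDKs accept for that URI is one of the things to verify.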

Ideas for managing permissions on external roles/secrets

We can use groups with a new "group_class". This allows them to be included in the normal permission graph.

group_class: aws_iam_role

Represents an AWS IAM role.

The name is used by the system for lookup, e.g. "Container AWS IAM role"

Stores the specific role id in a property.

When crunch-run executes, it checks whether the user has "can_read" to [name="Container AWS IAM role", group_class="aws_iam_role"] and, if so, provides the role to the container.

group_class: aws_credentials

Stores AWS_ACCESS_KEY and AWS_SECRET_KEY in properties.

AWS_SECRET_KEY should be hidden from responses.

When crunch-run executes, it checks whether the user has "can_read" to [name="Container AWS credentials", group_class="aws_credentials"] and, if so, provides the credentials to the container.
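The lookup for either group class could be expressed as an ordinary list request with filters, sketched below. The assumption is that a list call made with the user's (or container's) token only returns groups that user can read, so finding the group doubles as the "can_read" check. `find_readable_group` and `list_groups` are hypothetical; `list_groups` stands in for whatever API client call performs the groups list request.

```python
from typing import Any, Callable, Dict, List, Optional

Filters = List[List[Any]]


def find_readable_group(
    list_groups: Callable[[Filters], List[Dict[str, Any]]],
    name: str,
    group_class: str,
) -> Optional[Dict[str, Any]]:
    """Return the matching group if the caller can read it, else None.

    list_groups is a hypothetical callable that runs a groups list request
    with the caller's token and returns the matching items.
    """
    # Arvados-style filters: each entry is [attribute, operator, operand].
    filters = [["name", "=", name], ["group_class", "=", group_class]]
    items = list_groups(filters)
    return items[0] if items else None
```

For example, `find_readable_group(list_groups, "Container AWS IAM role", "aws_iam_role")` returning `None` would mean the container gets no role.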

Generalizing secrets with "group_class: credential"

Wondering if we can usefully generalize this.

This would have the following behavior:

Properties have a key called "type" which describes more precisely what it represents, e.g. "aws_iam_role", "aws_credentials" etc.

Properties have other keys for the non-secret parts of the identity, e.g. the username, the access key, etc. Which keys are expected to be present depends on "type".

Properties have a key called "secret" which is removed from GET responses. This is an object with additional fields. The fields found in "secret" depend on "type".

Updating the record so that fields which are supposed to be in "secret" appear at the top level will throw an error.

There is a separate fetch_secret API call which returns the contents of "secret".
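To make the proposed behavior concrete, here is a rough sketch of the three rules: redaction on GET, rejection of misplaced secret fields on update, and a separate fetch_secret. Everything in it is illustrative. The `SECRET_FIELDS` table and the field name `secret_access_key` are made up for the example, and no permission checks are shown.

```python
import copy
from typing import Any, Dict, Set

# Hypothetical: which field names count as secret for each "type".
SECRET_FIELDS: Dict[str, Set[str]] = {
    "aws_credentials": {"secret_access_key"},
    "aws_iam_role": set(),
}


def redact(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of properties without "secret", as a GET would see it."""
    return {k: copy.deepcopy(v) for k, v in properties.items() if k != "secret"}


def validate_update(properties: Dict[str, Any]) -> None:
    """Raise ValueError if a secret field appears at the top level."""
    secret_fields = SECRET_FIELDS.get(properties.get("type"), set())
    misplaced = sorted(secret_fields & set(properties))
    if misplaced:
        raise ValueError('fields must be inside "secret": ' + ", ".join(misplaced))


def fetch_secret(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the contents of "secret" (empty if there is none)."""
    return copy.deepcopy(properties.get("secret", {}))
```

Keeping the per-type list of secret fields in one table means the update check and any future types share a single definition of what counts as secret.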

Questions: can secrets be owned by projects or do they have to be owned by the system user? Maybe they can only be owned by users?

Can fetch_secret be called any time, or only with a valid container token?

How does something like crunch-run resolve a secret? How does it decide what secret to look for, for a given container?

Alternately, do we want arvados-cwl-runner looking up the secret, and we don't touch crunch-run at all?

CWL support

I proposed this feature last year. It hasn't been implemented yet, but obviously supporting this in CWL requires appropriate Arvados support. The model is that a secret has a non-sensitive id part and a sensitive secret part.

https://github.com/common-workflow-language/cwl-v1.3/pull/26


Related issues 1 (1 open, 0 closed)

Related to Arvados - Feature #20650: a-c-r natively supports S3 inputs just like HTTP/S | In Progress | Peter Amstutz | 03/25/2025
Actions #6

Updated by Peter Amstutz 6 days ago

  • Related to Feature #20650: a-c-r natively supports S3 inputs just like HTTP/S added
Actions #11

Updated by Peter Amstutz 4 days ago

  • Subject changed from crunch-run on AWS can call AssumeRole and pass delegated credentials to a container to Arvados manages persistent credentials to external resources and can provide them to a container