Project

General

Profile

Cluster configuration » History » Revision 27

Revision 26 (Peter Amstutz, 02/06/2019 02:37 PM) → Revision 27/33 (Tom Clegg, 04/24/2019 01:00 PM)

h1. Cluster configuration 

 We are (2019) consolidating configuration from per-microservice yaml/json/ini files into a single cluster configuration document that is used by all components. 
 * Long term: system nodes automatically keep their configs synchronized (using something like consul). 
 * Short term: sysadmin uses tools like puppet and terraform to ensure /etc/arvados/config.yml is identical on all system nodes. 
 * Hosts without config files (e.g., hosts outside the cluster) can retrieve the config document from the API server. 

 h2. Discovery document 

 Previously, we copied selected config values from the API server config into the API discovery document so clients could see them. When clients can get the configuration document itself, this won't be needed. The discovery document should advertise APIs provided by the server, not cluster configuration. 

 h2. Secrets 

 Secrets like BlobSigningKey can be given literally in the config file (convenient for dev/test, consul-template, etc) or indirectly using a secret backend. Anticipated backends: 
 * <code class="yaml">BlobSigningKey: foobar</code> &rArr; the secret is literally <code>foobar</code> 
 * <code class="yaml">BlobSigningKey: "vault:foobar"</code> &rArr; the secret can be obtained from vault using the vault key "foobar" 
 * <code class="yaml">BlobSigningKey: "file:/foobar"</code> &rArr; the secret can be read from the local file @/foobar@ 
 * <code class="yaml">BlobSigningKey: "env:FOOBAR"</code> &rArr; the secret can be read from the environment variable @FOOBAR@ 

 h2. Instructions for ops 

 Tentative instructions for switching config file format/location: 
 # Upgrade Arvados to a version that supports loading all configs from the new cluster-wide config file (maybe 1.4). When services come back up, they will still use your old configuration files, but they will log some deprecation warnings. 
 # Migrate your configuration to the new config file, one component at a time. For each component: 
 ## Restart the component. 
 ## Inspect the deprecation warning that is logged at startup. It will tell you either "old config file is superfluous" or "new config file is incomplete". 
 ## If your old config file is superfluous, delete it. You're done. 
 ## Run the component with the "--config-diff" flag. This suggests changes to your new config file which will make your old config file obsolete. (Alternatively, run the component with the "--config-dump" flag. This outputs a new config file that would make your old config file obsolete. Saving this might be easier than applying a diff, but it will reorder keys and lose comments.) 
 ## Make the suggested changes. 
 ## Repeat until finished. 
 # Upgrade to a version that doesn't support old config files at all (maybe 1.5). 


 h2. Implementation 

 Development strategy for facilitating the above ops instructions: 
 # Read the new config file into an internal struct, if the new config file exists. 
 # Copy old config file values into the new config struct. 
 # Use the new config struct internally (the old config is no longer referenced except in the load-and-copy-to-new-struct step). 
 # Add a mechanism for showing the effect of the old config file on the resulting config struct (see "--config-diff" above). 
 # At startup, if the old config has any effect (i.e., some parts haven't been migrated to the new config file by the operator), log a deprecation warning recommending "--config-diff" and RTFM. 
 # Wait one minor version release cycle. 
 # Error out if the new config file does not exist. 
 # Error out if the old config file exists (...and some parts of the old config are not redundant [optional?]). 

 


 h2. Example/template Example config file 

 See also [[Config migration key mapping]] 

 (Format not yet frozen!) 

 Notes: 
 * Keys are CamelCase &mdash; except in special cases like PostgreSQL connection settings, which are passed through to another system without being interpreted by Arvados. 
 * Arrays and lists are not permitted. These cannot be expressed natively in consul, and tend to be troublesome anyway: "what changed?" is harder to answer usefully, significance of duplicate elements is unclear, etc. 

 <pre><code class="yaml"> 
 Clusters: 
   xyzzy:                       # api-server/uuid_prefix, sso/uuid_prefix 
     SystemRootToken:           # arvados-git-sync.rb/arvados_api_token, keepstore/SystemAuthTokenFile, c-d-s/AuthToken 
     ManagementToken:           # {arvados-ws,keepstore,keepproxy,keep-balance}/ManagementToken (& others) eec1999ccb6d75840a2c09bc70b6d3cbc990744e 
     Services: BlobSigningKey: ungu355able 
     BlobSignatureTTL: 172800 
     SessionKey: 186005aa54cab1ca95a3738e6e954e0a35a96d3d13a8ea541f4156e8d067b4f3 
     PostgreSQL: 
       RailsAPI: 
         InternalURLs: 
           "http://zzzzz:8000/": {}              ConnectionPool: 32 # api-server/(protocol,host,port) 
         ExternalURL: “https://zzzzz.arvadosapi.com/" 
         Insecure: false max concurrent connections per arvados server daemon 
       GitHTTP: 
         InternalURLs: 
           "http://git:9001/": {} 
         ExternalURL: "https://git.zzzzz.arvadosapi.com/" # api-server/git_repo_https_base 
       Keepstore: 
         InternalURLs: 
           "http://keep0:25107/": {Unlisted: true} 
           "http://keep1:25107/": {Debug: true} 
       Controller: 
         InternalURLs: 
           "http://zzzzz:9004/": {}                         # controller/NodeProfiles.$cluster.Controller.Listen 
         ExternalURL: "https://zzzzz.arvadosapi.com/"       # composer/apiEndPoint, workbench2/API_HOST, workbench/arvados_{login,v1}_base, arvados-ws/Client, keepproxy/Client 
       Websocket: 
         InternalURLs: 
           "http://ws:9003/": {}                            # arvados-ws/Listen 
         ExternalURL: "https://ws.zzzzz.arvadosapi.com/"    # api-server/websocket_address 
       Keepbalance: 
         InternalURLs: 
           "http://zzzzz:9005": {}                          # keepbalance/Listen 
       GitHTTP: 
         InternalURLs: 
           "http://zzzzz:9001": {}                          # arvados-git-httpd/Listen 
         ExternalURL: "https://git.zzzzz.arvadosapi.com/" # api-server/git_repo_https_base 
       GitSSH: 
         ExternalURL: "git@git.zzzzz.arvadosapi.com"        # api-server/git_repo_ssh_base 
       DispatchCloud: 
         InternalURLs: 
           "http://zzzzz:9006": {}                          # a-d-c/NodeProfiles 
       SSO: 
         ExternalURL: "https://auth.zzzzz.arvadosapi.com/"     # api-server/sso_provider_url 
       Keepproxy: 
         InternalURLs: 
           "http://keep:25107/": {}                 # keepproxy/Listen 
         ExternalURL: "https://keep.zzzzz.arvadosapi.com/" 
       WebDAV: 
         InternalURLs: 
           "http://keep:9002/": {}     # keep-web/Listen 
         ExternalURL: "https://*.collections.zzzzz.arvadosapi.com/" # api-server/keep_web_service_url, workbench/keep_web_url 
       WebDAVDownload: 
         InternalURLs: 
           "http://keep:9002/": {}     # keep-web/Listen 
           ExternalURL: "https://download.zzzzz.arvadosapi.com/" # keep-web/AttachmentOnlyHost, workbench/keep_web_download_url 
       Keepstore: 
         InternalURLs: 
           "https://keep0:25107/": {}                              # keepstore/Listen 
           "https://keep1:25107/": {}                              # keepstore/Listen 
       Composer: 
         ExternalURL: "http://composer.zzzzz.arvadosapi.com/"    # workbench/composer_url 
       WebShell: 
         ExternalURL: "http://webshell.zzzzz.arvadosapi.com/"    # workbench/shell_in_a_box_url 
       Workbench1: 
         InternalURLs: 
           "http://workbench:9000": {}                                 # workbench/Nginx.server.listen 
         ExternalURL: "http://workbench.zzzzz.arvadosapi.com/" # workbench/Nginx.server.listen, api-server/workbench_address 
       Workbench2: 
         ExternalURL: "http://workbench2.zzzzz.arvadosapi.com/" # workbench/workbench2_url 
     PostgreSQL: 
       Connection:                          # arvados-ws/Postgres, controller/PostgreSQL.Connection 
         # All parameters here are passed to the PG client library in a connection string; 
         # see https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS 
         Host: localhost 
         Port: 5432 
         User: arvados 
         Password: s3cr3t 
         DBName: arvados_production 
         client_encoding: utf8 
         fallback_application_name: arvados 
       ConnectionPool:                      # arvados-ws/PostgresPool 
     TLS: HTTPRequestTimeout: 5m 
     Defaults: 
       Certificate:                         # (literal, file, or acme dir) keepstore/TLSCertificateFile CollectionReplication: 2 
       Key:                                 # (literal, file, or acme dir) keepstore/TLSKeyFile TrashLifetime: 2w 
     UserActivation: 
       Insecure: ActivateNewUsers: true                       # workbench/arvados_insecure_https, api-server/sso_insecure 
     Git: 
       GitoliteAdminRepo:         # arvados-git-sync.rb/gitolite_url AutoAdminUser: root@example.com 
       GitoliteAdminPublicKey:    # arvados-git-sync.rb/gitolite_arvados_git_user_key UserProfileNotificationAddress: notify@example.com 
       GitoliteSyncWorkDir:       # arvados-git-sync.rb/gitolite_tmp NewUserNotificationRecipients: {} 
       GitCommand:                # arv-git-httpd/GitCommand 
       GitoliteHome:              # arv-git-httpd/GitoliteHome 
       Repositories:              # api-server/git_repositories_dir (crunch1 only; just assume {GitoliteHome}/repositories?) NewInactiveUserNotificationRecipients: {} 
     API: RequestLimits: 
       DisabledAPIs:                       # api-server/disable_api_methods MaxRequestLogParamsSize: 2KB 
       WebsocketKeepaliveTimeout:          # arvados-ws/PingTimeout 
       WebsocketClientEventQueue:          # arvados-ws/ClientEventQueue 
       WebsocketServerEventQueue:          # arvados-ws/ServerEventQueue 
       KeepServiceRequestTimeout:          # keepproxy/Timeout 
       MaxMemoryBuffers:                   # keepstore/MaxBuffers 
       MaxConcurrentRequests:              # keepstore/MaxRequests 
       MaxRequestSize:                     # api-server/max_request_size 128MiB 
       MaxIndexDatabaseRead:               # api-server/max_index_database_read 128MiB 
       MaxItemsPerResponse:                # api-server/max_items_per_response, keep-balance/CollectionBatchSize, keep-balance/CollectionBuffers 1000 
       MaxRequestAmplification:            # controller/RequestLimits.MultiClusterRequestConcurrency MultiClusterRequestConcurrency: 4 
     LogLevel: info 
     CloudVMs: 
       AsyncPermissionsUpdateInterval:     # api-server/async_permissions_update_interval  
     Users: BootProbeCommand: "docker ps -q" 
       AutoSetupNewUsers:                  # api-server/auto_setup_new_users SSHPort: 22 
       AutoSetupNewUsersWithVmUUID:        SyncInterval: 1m      # api-server/auto_setup_new_users_with_vm_uuid how often to get list of active instances from cloud provider 
       AutoSetupNewUsersWithRepository:    TimeoutIdle: 1m       # api-server/auto_setup_new_users_with_repository shutdown if idle longer than this 
       AutoSetupUsernameBlacklist:         TimeoutBooting: 10m # api-server/auto_setup_name_blacklist shutdown if exists longer than this without running BootProbeCommand successfully 
       NewUsersAreActive:                  # api-server/new_users_are_active 
       AutoAdminUserWithEmail:             # api-server/auto_admin_user 
       AutoAdminFirstUser:                 # api-server/auto_admin_first_user 
       UserProfileNotificationAddress:     # api-server/user_profile_notification_address 
       AdminNotifierEmailFrom:             # api-server/admin_notifier_email_from 
       EmailSubjectPrefix:                 # api-server/email_subject_prefix 
       UserNotifierEmailFrom:              # api-server/user_notifier_email_from 
       NewUserNotificationRecipients: TimeoutProbe: 2m      # api-server/new_user_notification_recipients shutdown if (after booting) communication fails longer than this, even if ctrs are running 
       NewInactiveUserNotificationRecipients:    TimeoutShutdown: 1m # api-server/new_inactive_user_notification_recipients shutdown again if node still exists this long after shutdown 
       AnonymousUserToken:                 # workbench/anonymous_user_token, keep-web/AnonymousTokens 
     Login: Driver: Amazon 
       SiteTitle:                   # sso/site_title 
       DefaultLinkTitle:            # sso/default_link_title 
       DefaultLinkURL:              # sso/default_link_url 
       AllowAccountRegistration:    # sso/allow_account_registration 
       RequireEmailConfirmation:    # sso/require_email_confirmation 
       Google: DriverParameters: 
         ClientID:                  # sso/google_oauth2_client_id Region: us-east-1 
         ClientSecret:              # sso/google_oauth2_client_secret 
       LDAP:                        # sso/use_ldap APITimeout: 20s 
         Title:                     # sso/use_ldap.title AWSAccessKeyID: abcdef 
         Host:                      # sso/use_ldap.host AWSSecretAccessKey: abcdefghijklmnopqrstuvwxyz 
         Port:                      # sso/use_ldap.port ImageID: ami-0a01b48b88d14541e 
         Method:                    # sso/use_ldap.method SubnetID: subnet-24f5ae62 
         Base:                      # sso/use_ldap.base 
         Uid:                       # sso/use_ldap.uid 
         EmailDomain:               # sso/use_ldap.email_domain 
         BindDN:                    # sso/use_ldap.BindDN 
         Password:                  # sso/user_ldap.password 
       SecretToken:                 # sso/secret_token 
       ProviderAppSecret:           # api-server/sso_app_secret 
       ProviderAppID:               # api-server/sso_app_id SecurityGroups: sg-3ec53e2a 
     AuditLogs: 
       Enable: 
       MaxAge:                           # api-server/max_audit_log_age 2w 
       MaxDeleteBatch:                   # api-server/max_audit_log_delete_batch DeleteBatchSize: 100000 
       UnloggedAttributes:               {} # api-server/unlogged_attributes (applies to logs table) example: {"manifest_text": true} 
     SystemLogs: ContainerLogStream: 
       LogLevel:                      # keepstore/Debug, keepproxy/Debug, arvados-ws/LogLevel BatchSize: 4KiB 
       Format:                        # keepstore/LogFormat, arvados-ws/LogFormat BatchTime: 1s 
       MaxRequestLogParamsSize:       # api-server/max_request_log_params_size 
     Collections: ThrottlePeriod: 1m 
       DefaultReplication:                   # api-server/default_collection_replication, keepproxy/DefaultReplicas ThrottleThresholdSize: 64KiB 
       DefaultTrashLifetime:                 # api-server/default_trash_lifetime ThrottleThresholdLines: 1024 
       CollectionVersioning:                 # api-server/collection_versioning TruncateSize: 64MiB 
       PreserveVersionIfIdle:                # api-server/preserve_version_if_idle PartialLineThrottlePeriod: 5s 
     Timers: 
       TrustAllContent:                      # keep-web/TrustAllContent, workbench/trust_all_content 
       TrashSweepInterval:                       # api-server/trash_sweep_interval 60s 
       BlobSigningKey:                           # api-server/blob_signing_key, keepstore/BlobSigningKeyFile ContainerDispatchPollInterval: 10s 
       BlobSigningTTL:                           # api-server/blob_signature_ttl, keepstore/BlobSignatureTTL APIRequestTimeout: 20s 
     Scaling: 
       BlobSigning:                              # keepstore/RequireSignatures, api-server/permit_create_collection_with_unsigned_manifest MaxComputeNodes: 64 
       BlobTrash:                                EnablePreemptibleInstances: false 
     DisableAPIMethods: {} # keepstore/EnableDelete example: {"jobs.create": true} 
     DockerImageFormats: {"v2": true} 
     Crunch1: 
       BlobTrashLifetime:                        # keepstore/TrashLifetime Enable: true 
       BlobTrashCheckInterval:                   # keepstore/TrashCheckInterval CrunchJobWrapper: none 
       BlobTrashConcurrency:                     # keepstore/TrashWorkers, keep-balance/-commit-trash CrunchJobUser: crunch 
       BlobDeleteConcurrency:                    # keepstore/EmptyTrashWorkers CrunchRefreshTrigger: /tmp/crunch_refresh_trigger 
       BlobReplicateConcurrency:                 # keepstore/PullWorkers, keep-balance/-commit-pulls 
       KeepBalanceRunPeriod: 10m                 # keepbalance/RunPeriod 
       WebDAVCache: 
         TTL:                     # keep-web/Cache.TTL 
         UUIDTTL:                 # keep-web/Cache.UUIDTTL 
         MaxCollectionEntries:    # keep-web/Cache.MaxCollectionEntries 
         MaxCollectionBytes:      # keep-web/Cache.MaxCollectionBytes 
         MaxPermissionEntries:    # keep-web/Cache.MaxPermissionEntries 
         MaxUUIDEntries:          # keep-web/Cache.MaxUUIDEntries DefaultDockerImage: false 
     Containers: # control how Arvados runs user containers NodeProfiles: 
       SupportedDockerImageFormats:                    # api-server/docker_image_formats 
       LogReuseDecisions:                              # api-server/log_reuse_decisions 
       DefaultKeepCacheRAM:                            # api-server/container_default_keep_cache_ram 
       MaxDispatchAttempts:                            # api-server/max_container_dispatch_attempts 
       MaxRetryAttempts:                               # api-server/container_count_max 
       PollInterval: 10s                               # c-d-s/PollPeriod, a-d-c/Dispatch/PollInterval 
       MinRetryPeriod: 30s                             # c-d-s/MinRetryPeriod (optional? in case ContainerDispatchPollInterval Key is too short) 
       CrunchRunCommand: "crunch-run"                  # c-d-s/CrunchRunCommand 
       CrunchRunArguments: ‘[“-cgroup-parent-subsystem=memory”, “-foo=bar”]’       # c-d-s/CrunchRunCommand (should this a profile name; can be named CrunchRunArgumentsJSON?) specified on service prog command line, defaults to $(hostname) 
       ReserveExtraRAM: 256MiB                         # c-d-s/ReserveExtraRAM 
       UsePreemptibleInstances:                        # api-server/preemptible_instances 
       MaxComputeVMs:                                  # api-server/max_compute_nodes 
       DispatchPrivateKey:                             # a-d-c/Dispatch/PrivateKey 
       StaleLockTimeout:                               # a-d-c/Dispatch/StaleLockTimeout 
       Logging: keep: 
         LogBytesPerEvent:                # api-server/crunch_log_bytes_per_event Don’t run other services automatically -- only specified ones 
         LogSecondsBetweenEvents:         # api-server/crunch_log_seconds_between_events Default: {Disable: true} 
         LogThrottlePeriod:               # api-server/crunch_log_throttle_period Keepstore: {Listen: ":25107"} 
       apiserver: 
         LogThrottleBytes:                # api-server/crunch_log_throttle_bytes Default: {Disable: true} 
         LogThrottleLines:                # api-server/crunch_log_throttle_lines RailsAPI: {Listen: ":9000", TLS: true} 
         LimitLogBytesPerJob:             # api-server/crunch_limit_log_bytes_per_job Controller: {Listen: ":9100"} 
         LogPartialLineThrottlePeriod:    # api-server/crunch_log_partial_line_throttle_period Websocket: {Listen: ":9101"} 
         LogUpdatePeriod:                 # api-server/crunch_log_update_period 
         LogUpdateSize:                   # api-server/crunch_log_update_size 
         MaxAge:                          # api-server/clean_container_log_rows_after, api-server/clean_job_log_rows_after Health: {Listen: ":9199"} 
       CloudVMs: keep: 
         Enable:                                       # arvados-dispatch-cloud is in use Default: {Disable: true} 
         BootProbeCommand:                             # a-d-c/CloudVMs/BootProbeCommand KeepProxy: {Listen: ":9102"} 
         ProbeInterval:                                # a-d-c/Dispatch/ProbeInterval 
         MaxProbesPerSecond:                           # a-d-c/Dispatch/MaxProbesPerSecond 
         TimeoutSignal:                                # a-d-c/Dispatch/TimeoutSignal 
         TimeoutTERM:                                  # a-d-c/Dispatch/TimeoutTERM 
         MaxCloudOpsPerSecond:                         # a-d-c/CloudVMs/MaxCloudOpsPerSecond 
         SSHPort:                                      # a-d-c/CloudVMs/SSHPort 
         SyncInterval:                                 # a-d-c/CloudVMs/SyncInterval 
         TimeoutIdle:                                  # a-d-c/CloudVMs/TimeoutIdle 
         TimeoutBooting:                               # a-d-c/CloudVMs/TimeoutBooting 
         TimeoutProbe:                                 # a-d-c/CloudVMs/TimeoutProbe 
         TimeoutShutdown:                              # a-d-c/CloudVMs/TimeoutShutdown 
         ImageID:                                      # a-d-c/CloudVMs/ImageID 
         Driver: Amazon                                # a-d-c/CloudVMs/Driver 
         DriverParameters:                             # a-d-c/CloudVMs/DriverParameters 
           Region: us-east-1 
           APITimeout: 20s 
           AWSAccessKeyID: abcdef 
           AWSSecretAccessKey: abcdefghijklmnopqrstuvwxyz 
           ImageID: ami-0a01b48b88d14541e 
           SubnetID: subnet-24f5ae62 
           SecurityGroups: sg-3ec53e2a KeepWeb: {Listen: ":9103"} 
       SLURM: *: 
         Enable:                                       # crunch-dispatch-slurm This section used for a node whose profile name is in use not listed above 
         PrioritySpread: 1000                          Default: {Disable: false} # c-d-s/PrioritySpread 
         SbatchArguments: ‘[“-partition=PartitionName”]’                           # c-d-s/SbatchArguments 
         KeepServices: 
           00000-bi6l4-000000000000000: 
             “http://127.0.0.1:25107”                  # c-d-s/KeepServiceURIs 
         Managed: 
           Enable:                          # arvados-node-manager (this is in use 
           DNSServerConfDir:                # api-server/dns_server_conf_dir 
           DNSServerConfTemplate:           # api-server/dns_server_conf_template 
           DNSServerReloadCommand:          # api-server/dns_server_reload_command 
           DNSServerUpdateCommand:          # api-server/dns_server_update_command 
           ComputeNodeDomain:               # api-server/compute_node_domain 
           ComputeNodeNameservers:          # api-server/compute_node_nameservers 
           AssignNodeHostname:              # api-server/assign_node_hostname the default behavior) 
     Volumes: 
       JobsAPI: xyzzy-keep-0: 
         Enable:                          # api-server/enable_legacy_jobs_api (crunch1) Type: s3 
         CrunchJobWrapper:                # api-server/crunch_job_wrapper (crunch1) Region: us-east 
         CrunchJobUser:                   # api-server/crunch_job_user (crunch1) Bucket: xyzzy-keep-0 
         CrunchRefreshTrigger:            # api-server/crunch_refresh_trigger (crunch1) 
         GitInternalDir:                  # api-server/git_internal_dir (crunch1) 
         ReuseJobIfOutputsDiffer:         # api-server/reuse_job_if_outputs_differ 
         DefaultDockerImage:              # api-server/default_docker_image_for_jobs [rest of keepstore volume config goes here] 
     Volumes:                                WebRoutes: 
       # keepstore/Volumes, keep-balance/KeepServiceTypes “default” means route according to method/host/path (e.g., if host is a login shell, route there) 
       xyzzy.arvadosapi.com: default 
       # TODO: some keepstores are closer “collections” means always route to specific volumes keep-web 
       zzzzz-ivpuk-voihjznerfweefq: 
         AccessViaHosts:                       collections.xyzzy.arvadosapi.com: collections 
       # replaces differing configs on keepstore hosts 
           “http://keep0:25107”: {ReadOnly: true} 
           “http://keep1:25107”: {} 
           “http://keep2:25107”: {ReadOnly: true} 
           “http://keep3:25107”: {ReadOnly: true} leading * is a wildcard (longest match wins) 
       "*--collections.xyzzy.arvadosapi.com": collections 
       cloud.curoverse.com: workbench 
       workbench.xyzzy.arvadosapi.com: workbench 
       "*.xyzzy.arvadosapi.com": default 
     InstanceTypes: 
       m4.large: 
         StorageClasses:                       # keepstore/S3Volume.StorageClasses, keepstore/AzureBlobVolume.StorageClasses, keepstore/UnixVolume.StorageClasses 
           default: true 
           cold: true 
         Replication: VCPUs: 2                        # keepstore/S3Volume.S3Replication, keepstore/AzureBlobVolume.AzureReplication, keepstore/UnixVolume.DirectoryReplication 
         ReadOnly: false                       # keepstore/S3Volume.ReadOnly, keepstore/AzureBlobVolume.ReadOnly, keepstore/UnixVolume.ReadOnly RAM: 8000000000 
         Driver: S3                            # keepstore/Volumes[].Type Scratch: 31000000000 
         DriverParameters: 
           AccessKey:                          # keepstore/S3Volume.AccessKey 
           SecretKey:                          # keepstore/S3Volume.SecretKey 
           Endpoint:                           # keepstore/S3Volume.Endpoint 
           Region:                             # keepstore/S3Volume.Region 
           Bucket:                             # keepstore/S3Volume.Bucket 
           LocationConstraint:                 # keepstore/S3Volume.LocationConstraint 
           IndexPageSize:                      # keepstore/S3Volume.IndexPageSize 
           S3Replication: 
           ConnectTimeout:                     # keepstore/S3Volume.ConnectTimeout 
           ReadTimeout:                        # keepstore/S3Volume.ReadTimeout 
           RaceWindow:                         # keepstore/S3Volume.RaceWindow 
           ReadOnly:                           #  
           UnsafeDelete:                       # keepstore/S3Volume.UnsafeDelete Price: 0.1 
       zzzzz-ivpuk-adbtuyuiivjhbnmb: m4.large-1t: 
         AccessViaHosts:                       # replaces differing configs on keepstore hosts (TBD: do we need “readonly from these hosts”?) 
           “http://keep1:25107”: {ReadOnly: false} same instance type as m4.large but our scripts attach more scratch 
         StorageClasses:                       # keepstore/S3Volume.StorageClasses, keepstore/AzureBlobVolume.StorageClasses, keepstore/UnixVolume.StorageClasses 
           default: true 
           cold: false ProviderType: m4.large 
         Replication: VCPUs: 2                        # keepstore/S3Volume.S3Replication, keepstore/AzureBlobVolume.AzureReplication, keepstore/UnixVolume.DirectoryReplication 
         ReadOnly: false                       # keepstore/S3Volume.ReadOnly, keepstore/AzureBlobVolume.ReadOnly, keepstore/UnixVolume.ReadOnly RAM: 8000000000 
         Driver: Azure                         # keepstore/Volumes[].Type Scratch: 999000000000 
         DriverParameters: 
           StorageAccountName:                 # keepstore/AzureBlobVolume.StorageAccountName 
           StorageAccountKey:                  # keepstore/AzureBlobVolume.StorageAccountKeyFile 
           StorageBaseURL:                     # keepstore/AzureBlobVolume.StorageBaseURL 
           ContainerName:                      # keepstore/AzureBlobVolume.ContainerName 
           RequestTimeout:                     # keepstore/AzureBlobVolume.RequestTimeout Price: 0.12 
       zzzzz-ivpuk-2344guvaiubbae4wa: m4.xlarge: 
         Driver: Filesystem                    # keepstore/Volumes[].Type VCPUs: 4 
         DriverParameters: 
           Root:                               # keepstore/UnixVolume.Root 
           Serialize:                          # keepstore/UnixVolume.Serialize 
           BlockDeviceUUID:                    # (disable if this is non-empty and does not match the local filesystem device) 
     Mail: 
       MailchimpAPIKey:              # api-server/mailchimp_api_key 
       MailchimpListID:              # api-server/mailchimp_list_id 
       SendUserSetupNotificationEmail:    # workbench/send_user_setup_notification_email 
       IssueReporterEmailFrom:       # workbench/issue_reporter_email_from 
       IssueReporterEmailTo:         # workbench/issue_reporter_email_to 
       SupportEmailAddress:          # workbench/support_email_address 
       EmailFrom:                    # workbench/email_from 
     RemoteClusters:                 # api-server/remote_hosts 
       xyzzx: RAM: 16000000000 
         Host: Scratch: 78000000000 
         Proxy: false Price: 0.2 
       m4.8xlarge: 
         Scheme: https VCPUs: 40 
         Insecure: false RAM: 160000000000 
         ActivateUsers: false 
       “*”:                          # api-server/remote_hosts_via_dns Scratch: 156000000000 
         ActivateUsers: false 
     Workbench: Price: 2 
       Theme: default                # workbench/arvados_theme 
       ActivationContactLink:        # workbench/activation_contact_link 
       ArvadosDocsite:               # workbench/arvados_docsite 
       ArvadosPublicDataDocURL:      # workbench/arvados_public_data_doc_url 
       ShowUserAgreementInline:      # workbench/show_user_agreement_inline 
       SecretToken:                  # workbench/secret_token 
       SecretKeyBase:                # workbench/secret_key_base 
       RepositoryCache:              # workbench/repository_cache 
       UserProfileFormFields:        # workbench/user_profile_form_fields 
       UserProfileFormMessage        # workbench/user_profile_form_message 
       ApplicationMimetypesWithViewIcon:     # workbench/application_mimetypes_with_view_icon 
       LogViewerMaxBytes:            # workbench/log_viewer_max_bytes 
       EnablePublicProjectsPage:     # workbench/enable_public_projects_page 
       EnableGettingStartedPopup:    # workbench/enable_getting_started_popup 
       ApiResponseCompression:       # workbench/api_response_compression 
       APIClientConnectTimeout:      # workbench/api_client_connect_timeout 
       APIClientReceiveTimeout:      # workbench/api_client_receive_timeout 
       RunningJobLogRecordsToFetch:         # workbench/running_job_log_records_to_fetch 
       ShowRecentCollectionsOnDashboard:    # workbench/show_recent_collections_on_dashboard 
       ShowUserNotifications:        # workbench/show_user_notifications 
       MultiSiteSearch:              # workbench/multi_site_search 
       Repositories:                 # workbench/repositories 
       SiteName:                     # workbench/site_name 
       VocabularyURL:                # workbench2/VOCABULARY_URL 
       FileViewersConfigURL:         # workbench2/FILE_VIEWERS_CONFIG_URL 
     InstanceTypes: 
       x1l: m4.16xlarge: 
         ProviderType: x1.large 
         VCPUs: 16 64 
         RAM: 128GiB 256000000000 
         Scratch: 128GB 310000000000 
         IncludedScratch: 128GB Price: 3.2 
       c4.large: 
         AddedScratch: 0 VCPUs: 2 
         RAM: 3750000000 
         Price: 1.23 0.1 
       c4.8xlarge: 
         Preemptible: false VCPUs: 36 
         RAM: 60000000000 
         Price: 1.591 
     TODO: RemoteClusters: 
       KeepproxyDisableGet                 xrrrr: 
         Host: xrrrr.arvadosapi.com 
         Proxy: true          # keepproxy/DisableGet (retire this feature / use Nginx instead / use a per-token permission instead) 
       KeepproxyDisablePut                 proxy requests to xrrrr on behalf of our clients 
         AuthProvider: true # keepproxy/DisablePut (retire this feature / users authenticated by xrrrr can use Nginx instead / use a per-token permission instead) 
       RailsSessionSecretToken:            # api-server/secret_token (should this be generated at runtime from superusertoken?) 
       InternalIPNetworks:                 # Nginx $external_client our cluster 
 </code></pre> 

 

 h2. Go Configuration Framework Options 

 Viper and go-config seem to be the leading go config framework contenders considering some of our long term goals (config synchronization); but viper seems to be the more widely adopted of the two.  

 *spf13/viper:* https://github.com/spf13/viper 

 *micro/go-config* https://github.com/micro/go-config - more useful - https://micro.mu/docs/go-config.html 

 Both solutions are very similar in terms of reported functionality. Both have watch support, and would allow for merging flags, environment variables, remote key stores (Consul), and our master YAML config. Viper also supports encrypted remote key/value access.