Cluster configuration » History » Version 10

Tom Clegg, 07/09/2018 01:59 PM

h1. Cluster configuration

We are (2018) consolidating configuration from per-microservice YAML/JSON/INI files into a single cluster configuration document that is used by all components.

* Long term: system nodes automatically keep their configs synchronized (using something like Consul).
* Short term: the sysadmin uses tools like Puppet and Terraform to ensure @/etc/arvados/config.yml@ is identical on all system nodes.
* Hosts without config files (e.g., hosts outside the cluster) can retrieve the config document from the API server.

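The short-term requirement (identical @/etc/arvados/config.yml@ on every system node) can be checked by comparing content digests. The sketch below is illustrative only: @config_digest@ and @out_of_sync_hosts@ are hypothetical helper names, and collecting the file contents from each node (ssh, a Puppet report, etc.) is assumed to happen elsewhere.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Mapping


def config_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a config file's raw bytes."""
    return hashlib.sha256(content).hexdigest()


def out_of_sync_hosts(configs: Mapping[str, bytes]) -> List[str]:
    """List hosts whose config differs from the most common version.

    `configs` maps a host name to the raw bytes of that host's config file.
    Assumption: the version held by the most hosts is the reference copy.
    """
    digests: Dict[str, str] = {
        host: config_digest(data) for host, data in configs.items()
    }
    if not digests:
        return []
    # most_common(1) yields the digest shared by the largest number of hosts.
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(host for host, d in digests.items() if d != majority)
```

Comparing digests rather than full file contents keeps the check cheap and avoids copying secrets between nodes.
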
h2. Discovery document

Previously, we copied selected config values from the API server config into the API discovery document so clients could see them. Once clients can get the configuration document itself, this will no longer be needed. The discovery document should advertise the APIs provided by the server, not cluster configuration.

h2. Secrets

Secrets like BlobSigningKey can be given literally in the config file (convenient for dev/test, consul-template, etc.) or indirectly using a secret backend. Anticipated backends:

* <code class="yaml">BlobSigningKey: foobar</code> &rArr; the secret is literally @foobar@
* <code class="yaml">BlobSigningKey: "vault:foobar"</code> &rArr; the secret can be obtained from Vault using the Vault key @foobar@
* <code class="yaml">BlobSigningKey: "file:/foobar"</code> &rArr; the secret can be read from the local file @/foobar@
* <code class="yaml">BlobSigningKey: "env:FOOBAR"</code> &rArr; the secret can be read from the environment variable @FOOBAR@

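The prefix rules above could be resolved along the following lines. This is a minimal sketch, not the planned implementation: @resolve_secret@ and the @vault_lookup@ callback are hypothetical names, no real Vault client API is implied, and stripping a trailing newline from file contents is an assumption.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_secret(
    value: str,
    environ: Optional[Mapping[str, str]] = None,
    vault_lookup: Optional[Callable[[str], str]] = None,
) -> str:
    """Resolve a config value such as "env:FOOBAR" to the secret it names.

    Hypothetical helper: the prefixes mirror the anticipated backends listed
    above. A value without a known prefix is treated as a literal secret.
    """
    env = os.environ if environ is None else environ
    if value.startswith("vault:"):
        if vault_lookup is None:
            raise ValueError("no vault backend configured")
        # The caller supplies the actual Vault access; we only pass the key.
        return vault_lookup(value[len("vault:"):])
    if value.startswith("file:"):
        with open(value[len("file:"):], encoding="utf-8") as f:
            # Assumption: a trailing newline is not part of the secret.
            return f.read().rstrip("\n")
    if value.startswith("env:"):
        return env[value[len("env:"):]]
    return value
```

One consequence of this scheme is that a literal secret cannot itself begin with @vault:@, @file:@ or @env:@; whether that needs an escape mechanism is left open here.
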
h2. Example config file

(Format not yet frozen!)

<pre><code class="yaml">
Clusters:
  xyzzy:
    BlobSigningKey: ungu355able
    BlobSignatureTTL: 172800
    SessionKey: 186005aa54cab1ca95a3738e6e954e0a35a96d3d13a8ea541f4156e8d067b4f3
    PostgreSQL:
      Connection:
        # All parameters here are passed to the PG client library in a connection string;
        # see https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS
        Host: localhost
        Port: 5432
        User: arvados
        Password: s3cr3t
        DBName: arvados_production
        client_encoding: utf8
        fallback_application_name: arvados
    HTTPRequestTimeout: 5m
    Defaults:
      CollectionReplication: 2
      TrashLifetime: 2w
    UserActivation:
      ActivateNewUsers: true
      AutoAdminUser: root@example.com
      UserProfileNotificationAddress: notify@example.com
      NewUserNotificationRecipients: {}
      NewInactiveUserNotificationRecipients: {}
    Limits:
      MaxRequestLogParamsSize: 2KB
      MaxRequestSize: 128MiB
      MaxIndexDatabaseRead: 128MiB
      MaxItemsPerResponse: 1000
    AuditLogs:
      MaxAge: 2w
      DeleteBatchSize: 100000
      UnloggedAttributes: {} # example: {"manifest_text": true}
    ContainerLogStream:
      BatchSize: 4KiB
      BatchTime: 1s
      ThrottlePeriod: 1m
      ThrottleThresholdSize: 64KiB
      ThrottleThresholdLines: 1024
      TruncateSize: 64MiB
      PartialLineThrottlePeriod: 5s
    Timers:
      TrashSweepInterval: 60s
    Scaling:
      MaxComputeNodes: 64
      EnablePreemptibleInstances: false
    DisableAPIMethods: {} # example: {"jobs.create": true}
    DockerImageFormats: {"v2": true}
    Crunch1:
      Enable: true
      CrunchJobWrapper: none
      CrunchJobUser: crunch
      CrunchRefreshTrigger: /tmp/crunch_refresh_trigger
      DefaultDockerImage: false
    NodeProfiles:
      # Key is a profile name; can be specified on service prog command line, defaults to $(hostname)
      keep:
        # Don’t run other services automatically -- only specified ones
        Default: {Disable: true}
        Keepstore: {Listen: ":25107"}
      apiserver:
        Default: {Disable: true}
        RailsAPI: {Listen: ":9000", TLS: true}
        Controller: {Listen: ":9100"}
        Websocket: {Listen: ":9101"}
        Health: {Listen: ":9199"}
      # (profile name chosen for illustration; keys in a YAML mapping must be unique)
      keepproxy:
        Default: {Disable: true}
        KeepProxy: {Listen: ":9102"}
        KeepWeb: {Listen: ":9103"}
      "*":
        # This section is used for a node whose profile name is not listed above
        Default: {Disable: false} # (this is the default behavior)
    Volumes:
      xyzzy-keep-0:
        Type: s3
        Region: us-east
        Bucket: xyzzy-keep-0
        # [rest of keepstore volume config goes here]
    Providers:
      AWS:
        # [credentials and stuff go here]
    WebRoutes:
      # “default” means route according to method/host/path (e.g., if host is a login shell, route there)
      xyzzy.arvadosapi.com: default
      # “collections” means always route to keep-web
      collections.xyzzy.arvadosapi.com: collections
      # leading * is a wildcard (longest match wins)
      "*--collections.xyzzy.arvadosapi.com": collections
      cloud.curoverse.com: workbench
      workbench.xyzzy.arvadosapi.com: workbench
      "*.xyzzy.arvadosapi.com": default
    InstanceTypes:
      m4.large:
        VCPUs: 2
        RAM: 8000000000
        Scratch: 31000000000
        Price: 0.1
      m4.large-1t:
        # same instance type as m4.large but our scripts attach more scratch
        ProviderType: m4.large
        VCPUs: 2
        RAM: 8000000000
        Scratch: 999000000000
        Price: 0.12
      m4.xlarge:
        VCPUs: 4
        RAM: 16000000000
        Scratch: 78000000000
        Price: 0.2
      m4.8xlarge:
        VCPUs: 40
        RAM: 160000000000
        Scratch: 156000000000
        Price: 2
      m4.16xlarge:
        VCPUs: 64
        RAM: 256000000000
        Scratch: 310000000000
        Price: 3.2
      c4.large:
        VCPUs: 2
        RAM: 3750000000
        Price: 0.1
      c4.8xlarge:
        VCPUs: 36
        RAM: 60000000000
        Price: 1.591
    RemoteClusters:
      xrrrr:
        Host: xrrrr.arvadosapi.com
        Proxy: true        # proxy requests to xrrrr on behalf of our clients
        AuthProvider: true # users authenticated by xrrrr can use our cluster
</code></pre>
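
The @WebRoutes@ comments in the example say only that a leading @*@ is a wildcard and that the longest match wins. One possible reading is sketched below. @match_route@ is a hypothetical name, and two details are assumptions rather than anything stated in the example: an exact host entry takes precedence over any wildcard, and a host that matches nothing yields @None@.

```python
from typing import Mapping, Optional


def match_route(host: str, routes: Mapping[str, str]) -> Optional[str]:
    """Return the route target for `host`, or None if no entry matches.

    Assumed semantics: an exact entry wins outright; otherwise a pattern
    "*suffix" matches any host ending in `suffix`, and among matching
    patterns the one with the longest suffix wins.
    """
    if host in routes:
        return routes[host]
    best_suffix: Optional[str] = None
    for pattern in routes:
        if not pattern.startswith("*"):
            continue
        suffix = pattern[1:]
        if host.endswith(suffix) and (
            best_suffix is None or len(suffix) > len(best_suffix)
        ):
            best_suffix = suffix
    return None if best_suffix is None else routes["*" + best_suffix]


# The WebRoutes entries from the example config above.
WEB_ROUTES = {
    "xyzzy.arvadosapi.com": "default",
    "collections.xyzzy.arvadosapi.com": "collections",
    "*--collections.xyzzy.arvadosapi.com": "collections",
    "cloud.curoverse.com": "workbench",
    "workbench.xyzzy.arvadosapi.com": "workbench",
    "*.xyzzy.arvadosapi.com": "default",
}
```

Under these assumptions a host such as @abc--collections.xyzzy.arvadosapi.com@ matches both wildcard entries, and the longer @*--collections.xyzzy.arvadosapi.com@ pattern decides the route.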