Idea #16552
Closed
"arvados-server init" can get TLS certificates from Let's Encrypt
Added by Peter Amstutz over 4 years ago. Updated almost 2 years ago.
Description
- publicly routable DNS
- port 80 and 443 are reachable for LE validation
Provide a "get TLS certificates from Let's Encrypt" option on the arvados-server init command line or wizard.
Generate keys, obtain certificates, and arrange to renew them automatically as needed.
Updated by Peter Amstutz over 4 years ago
- Target version changed from 2020-07-15 to 2020-08-12 Sprint
Updated by Peter Amstutz over 4 years ago
- Target version changed from 2020-08-12 Sprint to 2020-08-26 Sprint
Updated by Tom Clegg about 4 years ago
- Target version changed from 2020-08-26 Sprint to 2020-09-09 Sprint
Updated by Tom Clegg about 4 years ago
- Target version changed from 2020-09-09 Sprint to 2020-09-23 Sprint
Updated by Tom Clegg about 4 years ago
- Target version changed from 2020-09-23 Sprint to 2020-10-07 Sprint
Updated by Peter Amstutz about 4 years ago
- Target version changed from 2020-10-07 Sprint to 2020-10-21 Sprint
Updated by Peter Amstutz about 4 years ago
- Related to Idea #15941: arvados-boot added
Updated by Peter Amstutz about 4 years ago
- Target version deleted (2020-10-21 Sprint)
Updated by Tom Clegg about 3 years ago
- Subject changed from arvados-server config wizard for single node install to "arvados-server init" can get TLS certificates from Let's Encrypt
Updated by Tom Clegg about 3 years ago
- Category set to Deployment
- Description updated (diff)
Updated by Tom Clegg about 3 years ago
- Related to deleted (Idea #15941: arvados-boot)
Updated by Tom Clegg about 3 years ago
- Related to Idea #18337: Easy entry into Arvados ecosystem added
Updated by Tom Clegg over 2 years ago
- Related to Idea #15941: arvados-boot added
Updated by Tom Clegg over 2 years ago
- Status changed from New to In Progress
16552-autocert @ 5722e7f91d3ab4df898dec0d301c0653ac7995b3 -- developer-run-tests: #3207
Updated by Tom Clegg over 2 years ago
16552-autocert @ 1830291389ed69b950f2f94fbb9155c63e6b4679 -- developer-run-tests: #3220
Updated by Tom Clegg over 2 years ago
16552-autocert @ 01b48f7ba1ed76df4277145548fac313a3aca7cd -- developer-run-tests: #3224
16552-autocert @ 039d253a76771d50cee07503cb08494b6b7e2461 (install doc update)
Updated by Tom Clegg over 2 years ago
16552-autocert @ 20d7bfd30b4c890246c7ad72d6c96f93417f12ee -- developer-run-tests: #3228
retry remainder developer-run-tests-remainder: #3390
Updated by Tom Clegg over 2 years ago
16552-autocert @ e12c1fed6336048d6ab854bbfab95eccf7c1b372 -- developer-run-tests: #3230
(wb1 failed) ...with a fix for the race-induced port collision that was causing lib/controller multi-cluster integration tests to be unreliable.
Updated by Lucas Di Pentima over 2 years ago
Some comments & questions:
- I found https://github.com/letsencrypt/pebble, which might help us get the LE autocert code tested, wdyt?
- What's the "acme" option for? Is it to provide LE certs managed by something else?
- Do you think it would be a bit more clear to move the config knobs TLS.Automatic & TLS.Staging to a parent key, like TLS.LE.Enable & TLS.LE.Staging?
- Testing the arvados-easy-package in a DigitalOcean droplet:
  - Postgresql wasn't installed, so it crashed right away
  - After installing postgresql, removing the config file created in the first attempt and re-running "init" I got a postgresql database and an empty nginx service.
  - Checking the syslogs I'm seeing:
    arvados-server[25430]: {"PID":25430,"error":"autocert http-01 challenge handler stopped: listen tcp :80: bind: address already in use","level":"error","msg":"task failed","task":"certificates","time":"2022-07-15T20:17:57.652253479Z"}
  - Stopping nginx & restarting arvados allowed me to get it to work with SSL
  - Keep services seem to not be working because the client tries to connect to 0.0.0.0:9010
Updated by Tom Clegg over 2 years ago
Re testing/pebble: It would be good to test the autocert stuff (start pebble server, boot a test cluster, connect to controller, and check that the cert was issued by pebble?) ... but we can't even listen on port 80 with our non-root testing setup. I'm wary that a test case could end up uselessly testing some test-only code paths and autocert itself. Maybe enabling this on a future dev cluster, and using it to get real certs, is the best way to protect it from regressions?
The -tls=acme option is for when you want something like acmetool to be responsible for obtaining certificates and putting them in /var/lib/acme/. The idea is to let you get real certs, inject them with a bind mount, and test the arvados package/boot stuff in a container without worrying about making the container reachable by LE validation, hitting LE rate limits, etc.
I'm thinking "-tls=auto" could be renamed to "-tls=acme", and the old "-tls=acme" could be replaced with "-tls=/path" meaning load key and cert from "/path/privkey" and "/path/cert". WDYT? (see updated branch)
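As a rough sketch of the proposed flag semantics (not the code in the branch), the mapping from a -tls value to a certificate source could look like this. The `tlsSource` type and `parseTLSFlag` function are hypothetical, and only the two values discussed above are handled.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// tlsSource describes where certificates come from. This type and
// parseTLSFlag are hypothetical, for illustration only.
type tlsSource struct {
	ACME     bool   // obtain certificates from an ACME server
	CertFile string // certificate file, when loading from a directory
	KeyFile  string // private key file, when loading from a directory
}

// parseTLSFlag interprets "acme" as automatic issuance and an absolute
// path as a directory containing "cert" and "privkey".
func parseTLSFlag(v string) (tlsSource, error) {
	switch {
	case v == "acme":
		return tlsSource{ACME: true}, nil
	case filepath.IsAbs(v):
		return tlsSource{
			CertFile: filepath.Join(v, "cert"),
			KeyFile:  filepath.Join(v, "privkey"),
		}, nil
	default:
		return tlsSource{}, fmt.Errorf("unsupported -tls value %q", v)
	}
}

func main() {
	for _, v := range []string{"acme", "/path", "bogus"} {
		src, err := parseTLSFlag(v)
		fmt.Printf("-tls=%s => %+v, err=%v\n", v, src, err)
	}
}
```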
Good point about the config keys, too. Since ACME is the real feature (in principle the only Let's Encrypt-specific thing is the built-in Directory URL) I thought this might work:
    TLS:
      Certificate: ""
      Key: ""
      Insecure: false
      ACME:
        # Obtain certificates automatically for ExternalURL domains
        # using an ACME server and http-01 validation.
        #
        # To use Let's Encrypt, specify "LE". To use the Let's
        # Encrypt staging environment, specify "LE-staging". To use a
        # different ACME server, specify the full directory URL
        # ("https://...").
        #
        # Note: this feature is not yet implemented in released
        # versions, only in the alpha/prerelease arvados-server-easy
        # package.
        #
        # Implies agreement with the server's terms of service.
        Server: ""
So a production example would be just
    TLS:
      ACME:
        Server: LE
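For illustration, resolving the Server value to an ACME directory URL might look roughly like the sketch below. `acmeDirectoryURL` is a hypothetical helper, and treating an empty value as "ACME disabled" is an assumption; the two constants are the publicly documented Let's Encrypt v2 production and staging directory URLs.

```go
package main

import (
	"fmt"
	"net/url"
)

const (
	leDirectoryURL        = "https://acme-v02.api.letsencrypt.org/directory"
	leStagingDirectoryURL = "https://acme-staging-v02.api.letsencrypt.org/directory"
)

// acmeDirectoryURL (hypothetical helper) maps a TLS.ACME.Server value to a
// directory URL. An empty value returns "" and is assumed to mean that ACME
// is disabled.
func acmeDirectoryURL(server string) (string, error) {
	switch server {
	case "":
		return "", nil
	case "LE":
		return leDirectoryURL, nil
	case "LE-staging":
		return leStagingDirectoryURL, nil
	}
	u, err := url.Parse(server)
	if err != nil {
		return "", fmt.Errorf("invalid ACME server %q: %w", server, err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return "", fmt.Errorf("ACME server %q is not an https:// directory URL", server)
	}
	return server, nil
}

func main() {
	// The example.com URLs are placeholders, not real ACME servers.
	for _, s := range []string{"LE", "LE-staging", "https://acme.example.com/directory", "http://insecure.example.com/"} {
		u, err := acmeDirectoryURL(s)
		fmt.Printf("%q => %q, err=%v\n", s, u, err)
	}
}
```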
Some of your potholes are addressed in #17344:
- postgres not installed => early check for this, and the doc page says to install it first
- nginx taking port 80 => disable the default debian placeholder site if that appears to be the only nginx config
Keep clients connecting to 0.0.0.0:9010 is one I hadn't noticed, but makes sense. I suppose we need to either use the "external client" flag, or put keepstore on the external interface instead. I've updated #17344 to ask for this fix.
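One way to catch this kind of problem early would be to check whether an advertised service URL uses an unspecified address, which is fine for listening but not something a client can connect to. A small sketch, assuming the service is advertised as a URL (the http:// scheme here is an assumption) and using a hypothetical helper name:

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// advertisesUnspecifiedAddr (hypothetical helper) reports whether rawURL's
// host is an unspecified IP address such as 0.0.0.0 or [::]. Hostnames and
// ordinary IP addresses return false.
func advertisesUnspecifiedAddr(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	ip := net.ParseIP(u.Hostname())
	return ip != nil && ip.IsUnspecified(), nil
}

func main() {
	// keep.example.com is a placeholder hostname.
	for _, s := range []string{"http://0.0.0.0:9010/", "http://keep.example.com:9010/"} {
		bad, err := advertisesUnspecifiedAddr(s)
		fmt.Printf("%s unspecified=%v err=%v\n", s, bad, err)
	}
}
```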
16552-autocert @ 2f0c775a9e1ab8c3abdd94c854326fab771c4b5e -- developer-run-tests: #3235 (wb1 failed)
Updated by Lucas Di Pentima over 2 years ago
Tom Clegg wrote in #note-24:
The -tls=acme option is for when you want something like acmetool to be responsible for obtaining certificates and putting them in /var/lib/acme/. The idea is to let you get real certs, inject them with a bind mount, and test the arvados package/boot stuff in a container without worrying about making the container reachable by LE validation, hitting LE rate limits, etc.
I'm thinking "-tls=auto" could be renamed to "-tls=acme", and the old "-tls=acme" could be replaced with "-tls=/path" meaning load key and cert from "/path/privkey" and "/path/cert". WDYT? (see updated branch)
Looks better, thanks!
Good point about the config keys, too. Since ACME is the real feature (in principle the only Let's Encrypt-specific thing is the built-in Directory URL) I thought this might work:
16552-autocert @ 2f0c775a9e1ab8c3abdd94c854326fab771c4b5e -- developer-run-tests: #3235 (wb1 failed)
This LGTM, thanks!
Updated by Tom Clegg over 2 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados-private:commit:arvados|3c87fb14f48b78d30142f12c8cb855dba92c926d.