Idea #15954
closed
[boot] Bring up test cluster using provided config file and source tree
Description
The following commands should bring up a functioning test cluster:
git clone https://github.com/arvados/arvados
go run arvados/cmd/arvados-server boot -source-tree ./arvados -config ./arvados/doc/examples/config/zzzzz.yml -temp-dir {...} -test-fixtures=true
Assuming the user has taken care of these prerequisites:
- PostgreSQL, Ruby, Ruby gems/bundle, Python, nginx, etc. are installed
The boot command should:
- Use a new temporary directory, and delete it when exiting
- Use any local/uncommitted changes in the ./arvados work tree
- Use {temp dir}/keep/ as a keep volume backend
- Use {temp dir}/ for any pid/lock/temp files
- Have no sso, workbench2, or composer
- Stay in the foreground
- Log to stderr (OK for the time being if some logs go to {provided temp dir}/ instead)
- Exit (and shut down any child processes) when SIGINT or SIGTERM is received or a child service/component fails
Updated by Tom Clegg almost 5 years ago
- Related to Idea #15941: arvados-boot added
Updated by Tom Clegg almost 5 years ago
- Description updated (diff)
- Subject changed from [boot] Bring up dev cluster using provided config file and source tree to [boot] Bring up test cluster using provided config file and source tree
Updated by Tom Clegg almost 5 years ago
- Target version set to 2020-01-29 Sprint
- Assigned To set to Tom Clegg
- Status changed from New to In Progress
Updated by Peter Amstutz almost 5 years ago
- Target version changed from 2020-01-29 Sprint to 2020-02-12 Sprint
Updated by Tom Clegg almost 5 years ago
- Target version changed from 2020-02-12 Sprint to 2020-02-26 Sprint
Updated by Peter Amstutz almost 5 years ago
- Target version changed from 2020-02-26 Sprint to 2020-03-11 Sprint
Updated by Tom Clegg almost 5 years ago
15954-boot-test-cluster @ 8a719dbcdfd5da64172855ace2395ce682941214 -- developer-run-tests: #1756
(fuse tests fail because #16151)
On a system with all the dependencies needed by run-tests.sh, this brings up a test cluster on port 12345 using the code in CWD:
~/arvados$ go run ./cmd/arvados-server boot -config ./doc/examples/config/zzzzz.yml -type test -own-temporary-database -controller-address :12345 -listen-host 0.0.0.0
- "https://0.0.0.0:12345/" (controller endpoint) appears on stdout when the cluster is ready to use
- everything else (i.e., logging) is sent to stderr
- ^C or SIGTERM shuts down all child processes before exiting
There is a new test suite in source:lib/controller/integration_test.go that boots 3 clusters, with one test that saves a collection on cluster A and retrieves it by PDH from cluster B.
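Retrieval by PDH (portable data hash) can work across clusters because the hash depends only on the collection's content, not on which cluster stored it. A minimal sketch, assuming the PDH is the MD5 of the manifest text (with permission hints already stripped) followed by "+" and the manifest's length in bytes; `portableDataHash` is a hypothetical name, not an Arvados SDK function.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// portableDataHash (hypothetical helper) computes "<md5 hex>+<size>" for a
// manifest text that has already been stripped of permission hints.
func portableDataHash(manifest string) string {
	return fmt.Sprintf("%x+%d", md5.Sum([]byte(manifest)), len(manifest))
}

func main() {
	// An empty manifest hashes to the MD5 of the empty string, plus "+0".
	fmt.Println(portableDataHash(""))
}
```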
Updated by Tom Clegg almost 5 years ago
15954-boot-test-cluster @ a15c20803fb7a1e400a028c00d1c2dd924765a3e -- developer-run-tests: #1757
(merged master to get #16151 fix)
Updated by Lucas Di Pentima almost 5 years ago
Some comments & questions:
- File lib/boot/cmd.go
  - Line 25: “…should call cancel.” comment, is referring to the super.cancel() or fail() func?
  - Line 81: Shouldn’t be logged using super.logger()?
  - Line 85: I think the else clause could be avoided.
- File lib/boot/supervisor.go
  - Line 75: Shouldn’t be logged using super.logger()?
  - Lines 99, 109: Can we use a more strict permission scheme for dirs/files creation?
  - Line 493: Why does autofillConfig() need the logger to be passed as an argument if it’s already on the Supervisor struct? It also seems that it’s not being used.
  - Lines 521-525: Can this be replaced with a nextPort() call?
- File lib/boot/cert.go
  - Line 49: Can we use a more strict permission scheme?
- Other Qs (probably out of scope of this particular story):
  - Do you think adding a -only-install-deps flag would be useful to do some cache population?
  - What happens to an owned temporary database after quitting? Can we have a not-so-temporary database too?
Updated by Tom Clegg almost 5 years ago
- File lib/boot/cmd.go
  - Line 25: “…should call cancel.” comment, is referring to the super.cancel() or fail() func?
Oops, changed to "fail".
  - Line 81: Shouldn’t be logged using super.logger()?
This goes to stdout so a script can easily find the controller URL when it's ready. (Added a comment.)
  - Line 85: I think the else clause could be avoided.
Yes, fixed to use handle-errors-first style.
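For readers unfamiliar with the term, "handle-errors-first" means returning early on each failure so the success path needs no `else` and stays unindented. A small illustrative sketch; `parsePort` is a hypothetical function, not code from lib/boot/cmd.go.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort (hypothetical example) converts s to a TCP port number.
// Each failure returns immediately, so no else clause is needed.
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d out of range", port)
	}
	// Success path: reached only when every check above has passed.
	return port, nil
}

func main() {
	port, err := parsePort("12345")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("port:", port)
}
```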
- File lib/boot/supervisor.go
  - Line 75: Shouldn’t be logged using super.logger()?
Yes, fixed.
  - Lines 99, 109: Can we use a more strict permission scheme for dirs/files creation?
Yes, fixed. (Typically umask is 022, and this is all in a temp dir with 0700, but turning off group/other-write seems sensible anyway... and config.yml sure doesn't need to be executable.)
  - Line 493: Why does autofillConfig() need the logger to be passed as an argument if it’s already on the Supervisor struct? It also seems that it’s not being used.
Indeed, removed.
  - Lines 521-525: Can this be replaced with a nextPort() call?
Yes, done.
- File lib/boot/cert.go
  - Line 49: Can we use a more strict permission scheme?
Done
- Other Qs (probably out of scope of this particular story):
  - Do you think adding a -only-install-deps flag would be useful to do some cache population?
Yes, either that or "after starting, shutdown and exit 0" which would give more assurance that setup/deps actually worked.
  - What happens to an owned temporary database after quitting? Can we have a not-so-temporary database too?
Yes, for a more convenient dev/trial experience we could put a persistent data dir in /var/lib/arvados and run a dedicated postgresql server on demand -- but for production I imagine we'll still recommend providing connection info for a regular postgresql installation so we don't need to handle tuning, backups, migrating data after upgrading postgresql, etc.
15954-boot-test-cluster @ a9988d4cde254df59d1790ef1e3768d14e2a812e -- developer-run-tests: #1769
Updated by Tom Clegg almost 5 years ago
- Status changed from In Progress to Resolved