Idea #18947
closedMove remaining services into arvados-server subcommands (keepproxy, keep-web, arv-git-httpd)
Updated by Tom Clegg almost 3 years ago
- Status changed from New to In Progress
keepproxy¶
The generic service.Command logging code gives us request/response logs, and we now have a SetResponseLogFields func, so we now get the standard http response log with the extra keepproxy-specific fields (user_full_name, locatorOut)
Sample keepproxy logs from test on 9tee4 (wb1 upload)
{"ClusterID":"9tee4","Listen":"[::]:25107","PID":30054,"Service":"keepproxy","URL":"http://0.0.0.0:25107/","Version":"e7187c72bd8b2ed54d9af129401294e13a0859f9-dev (go1.17.7)","level":"info","msg":"listening","time":"2022-04-05T19:22:22.087553652Z"}
{"ClusterID":"9tee4","PID":30054,"RequestID":"req-1ryjp6tc16t8c1eneahx","level":"info","msg":"request","remoteAddr":"10.100.35.254:35952","reqBytes":55838099,"reqForwardedFor":"23.16.189.12","reqHost":"keep.9tee4.arvadosapi.com","reqMethod":"POST","reqPath":"","reqQuery":"","time":"2022-04-05T19:23:18.209450201Z"}
{"ClusterID":"9tee4","PID":30054,"RequestID":"req-1ryjp6tc16t8c1eneahx","err":null,"expectLength":55838099,"level":"info","locator":"ba1c2af7f9ef67ee873f2e3fbf733fb7+55838099","msg":"response","remoteAddr":"10.100.35.254:35952","reqBytes":55838099,"reqForwardedFor":"23.16.189.12","reqHost":"keep.9tee4.arvadosapi.com","reqMethod":"POST","reqPath":"","reqQuery":"","respBytes":92,"respStatus":"OK","respStatusCode":200,"time":"2022-04-05T19:23:22.605628387Z","timeToStatus":4.396183,"timeTotal":4.396200,"timeWriteBody":0.000017,"userFullName":"Tom Clegg","userUUID":"9tee4-tpzed-rneoswozqjk5tcv","wantReplicas":2,"wroteReplicas":2}
Notes
- The user fields used to be "user_uuid" and "user_full_name" -- now changed to "userUUID" and "userFullName" to match the style of the other fields.
- The user/locator fields used to be in a separate log entry from the HTTP response details, with msg="Block upload" or "Block download" -- now they are all in one response entry, so msg=response in both cases, and one can tell whether it's read or write by checking presence/absence of wroteReplicas.
Old log entries for startup + a single request:
{"ClusterID":"9tee4","PID":8400,"level":"info","msg":"keepproxy 2.4.0~dev20220321141729 started","time":"2022-03-21T16:11:53.788787289Z"}
{"ClusterID":"9tee4","PID":8400,"level":"info","msg":"listening at [::]:25107","time":"2022-03-21T16:11:53.837066749Z"}
{"ClusterID":"9tee4","PID":8400,"RequestID":"req-nxkc81wwljg2g3gi770a","level":"info","msg":"request","remoteAddr":"10.100.35.254:57342","reqBytes":179,"reqForwardedFor":"35.193.239.132","reqHost":"keep.9tee4.arvadosapi.com","reqMethod":"PUT","reqPath":"ca398ed2b6d31ed830ed77eccbd3700a","reqQuery":"","time":"2022-03-21T16:16:20.713091292Z"}
{"ClusterID":"9tee4","PID":8400,"level":"info","locator":"ca398ed2b6d31ed830ed77eccbd3700a+179","msg":"Block upload","time":"2022-03-21T16:16:20.718637950Z","user_full_name":"Automated Diagnostics","user_uuid":"9tee4-tpzed-oyphd3jt7fhlhsy"}
{"ClusterID":"9tee4","PID":8400,"level":"info","msg":"35.193.239.132,10.100.35.254:57342 PUT /ca398ed2b6d31ed830ed77eccbd3700a 200 179 2 2 ca398ed2b6d31ed830ed77eccbd3700a+179+A15878239ce217acaf9c29bb5457701a226d8f5f1@624b19d4 \u003cnil\u003e","time":"2022-03-21T16:16:20.718724709Z"}
{"ClusterID":"9tee4","PID":8400,"RequestID":"req-nxkc81wwljg2g3gi770a","level":"info","msg":"response","remoteAddr":"10.100.35.254:57342","reqBytes":179,"reqForwardedFor":"35.193.239.132","reqHost":"keep.9tee4.arvadosapi.com","reqMethod":"PUT","reqPath":"ca398ed2b6d31ed830ed77eccbd3700a","reqQuery":"","respBytes":87,"respStatus":"OK","respStatusCode":200,"time":"2022-03-21T16:16:20.718760274Z","timeToStatus":0.005627,"timeTotal":0.005667,"timeWriteBody":0.000040}
18947-keepproxy @ e7187c72bd8b2ed54d9af129401294e13a0859f9 -- developer-run-tests: #3023
Updated by Lucas Di Pentima over 2 years ago
18947-keepproxy
LGTM. I got worried about the systemd file removal, but I think it's not needed by the salt installer, or at least a quick grep
search didn't turned any results.
Updated by Tom Clegg over 2 years ago
- Target version changed from 2022-04-13 Sprint to 2022-04-27 Sprint
Updated by Tom Clegg over 2 years ago
- test suite runs in 30s instead of ~6 minutes
- integration test no longer hangs if you run it twice without manually resetting the test DB fixtures
- tests no longer panic if you use -check.f= to select tests and TestSuite.TestIntegration doesn't run first
The main remaining problem is that the whole thing is a terrible mess with globals and everything. The 27-second test case passes all its checks in 3s but then sits around waiting for goroutines to stop retrying, and there are a lot of them for some reason, and they take a long time for some reason. I ran out of patience.
Not going to bother moving this into arvados-server (see #15370) but might as well merge this so we don't waste as much CI time.
18947-dispatch-local @ 79ed8ab99bb2ad779c873620704519cf02177962 -- developer-run-tests: #3038
Updated by Lucas Di Pentima over 2 years ago
18947-dispatch-local LGTM, provided tests pass.
Updated by Tom Clegg over 2 years ago
18947-githttpd @ ca1c68a7718a44c950f7d61e13a4ecf96da899b6 -- developer-run-tests: #3062
Updated by Lucas Di Pentima over 2 years ago
- There're still some "
arv-git-httpd
" references on the documentation. - I've re-built the arvbox docker image and tried to start it, and I'm getting the following errors:
... 2022-04-25_13:15:24.15538 ++ GOPATH=/var/lib/gopath 2022-04-25_13:15:24.15538 ++ mkdir -p /var/lib/gopath 2022-04-25_13:15:24.16073 ++ cd /usr/src/arvados 2022-04-25_13:15:24.16075 ++ [[ 1000 = 0 ]] 2022-04-25_13:15:24.16076 ++ [[ ! -f /usr/local/bin/arvados-server ]] 2022-04-25_13:15:24.16694 + cd /usr/local/bin 2022-04-25_13:15:24.16703 + ln -sf arvados-server arvados-git-httpd 2022-04-25_13:15:24.17061 + test '' = --only-deps 2022-04-25_13:15:24.17080 + /usr/local/lib/arvbox/runsu.sh flock /var/lib/arvados-arvbox/cluster_config.yml.lock /usr/local/lib/arvbox/cluster-config.sh 2022-04-25_13:15:24.19017 chown: changing ownership of '/dev/stderr': Operation not permitted 2022-04-25_13:15:24.19307 chpst: fatal: unable to setgroups: permission denied
Updated by Tom Clegg over 2 years ago
- update doc refs to arv-git-httpd (but left the filenames/anchors alone to avoid breaking links/bookmarks)
- fix extra runsu.sh usage causing chpst error (I'm not sure why keepproxy didn't fail the same way with the same double-runsu.sh setup, but I fixed it in both places)
Updated by Ward Vandewege over 2 years ago
Tom Clegg wrote:
18947-githttpd @ 01a3368db1de44656e82fbc066e85ae4feb5eb75 -- developer-run-tests: #3074
- update doc refs to arv-git-httpd (but left the filenames/anchors alone to avoid breaking links/bookmarks)
- fix extra runsu.sh usage causing chpst error (I'm not sure why keepproxy didn't fail the same way with the same double-runsu.sh setup, but I fixed it in both places)
LGTM, thanks!
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-04-27 Sprint to 2022-05-11 sprint
Updated by Tom Clegg over 2 years ago
18947-keep-web @ 9acc8cb9cd9ca4429712b0d31b647e9a6ecf2d96 -- developer-run-tests: #3082
Updated by Ward Vandewege over 2 years ago
Tom Clegg wrote:
18947-keep-web @ 9acc8cb9cd9ca4429712b0d31b647e9a6ecf2d96 -- developer-run-tests: #3082
In
+++ b/cmd/arvados-server/keep-web.service @@ -3,15 +3,17 @@ # SPDX-License-Identifier: AGPL-3.0 [Unit] -Description=Arvados Keep web gateway +Description=Arvados Keep WebDAV gateway Documentation=https://doc.arvados.org/
Shouldn't the description mention the S3 interface too, if we're going to name WebDAV?
Nice cleanup in lib/service/cmd.go
removing the duplicative interceptMetricsReqs(cluster.ManagementToken, reg, log, ...)
wrapper!
LGTM thanks.
Updated by Tom Clegg over 2 years ago
18947-keep-balance @ d70eee2298afe8d082c72429a919bd94523c6bde -- developer-run-tests: #3089
Updated by Ward Vandewege over 2 years ago
Tom Clegg wrote:
18947-keep-balance @ d70eee2298afe8d082c72429a919bd94523c6bde -- developer-run-tests: #3089
LGTM, thanks! This one was very straighforward.
Updated by Tom Clegg over 2 years ago
18947-health @ 6b1a602cf1de1b6488c7d6b55e8d99ca8f707160 -- developer-run-tests: #3103
Another one where nearly all of the changes were already done.
Updated by Tom Clegg over 2 years ago
18947-dispatch-slurm @ 217aba73f6366cf1af30683baa6d0d5d1e3407a9 -- developer-run-tests: #3106
Updated by Tom Clegg over 2 years ago
- Target version changed from 2022-05-11 sprint to 2022-05-25 sprint
Updated by Ward Vandewege over 2 years ago
Tom Clegg wrote:
18947-dispatch-slurm @ 217aba73f6366cf1af30683baa6d0d5d1e3407a9 -- developer-run-tests: #3106
LGTM. Just one question - is this a good time to also rename crunch-dispatch-slurm to arvados-dispatch-slurm or should that be done in a future ticket?
Updated by Ward Vandewege over 2 years ago
Tom Clegg wrote:
18947-health @ 6b1a602cf1de1b6488c7d6b55e8d99ca8f707160 -- developer-run-tests: #3103
Another one where nearly all of the changes were already done.
LGTM thanks!
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-05-25 sprint to 2022-06-08 sprint
Updated by Tom Clegg over 2 years ago
- Status changed from In Progress to Resolved
Updated by Brett Smith almost 2 years ago
- Has duplicate Idea #13874: [CLI] arvados-server "health" subcommand added