Feature #20612

closed

diagnostic container should check for connectivity to API server

Added by Peter Amstutz 11 months ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Docker
Story points: -
Release relationship: Auto

Description

The arvados-client diagnostics command submits a container and runs it to confirm that crunch is working. A common misconfiguration that the check does not currently detect is that tools inside the container cannot contact the API server. The diagnostic should enable API access for the container, and the container should run a pass/fail test that attempts to contact the API server from inside the container.
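
To make the requested pass/fail test concrete, here is a minimal Python sketch of what such an in-container check could look like. It is not the code that was merged for this ticket. It assumes the container receives ARVADOS_API_HOST and ARVADOS_API_TOKEN in its environment when API access is enabled, and it probes the "current user" endpoint; the function names (current_user_url, fetch_json, check_api_access) are hypothetical.

```python
import json
import urllib.request


def current_user_url(api_host):
    """Build the URL of the 'current user' endpoint for the given API host."""
    return f"https://{api_host}/arvados/v1/users/current"


def fetch_json(url, token, timeout=10):
    """GET url with the token as a bearer credential and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def check_api_access(env, fetch=fetch_json):
    """Return (ok, message) for a pass/fail API connectivity test.

    env is a mapping such as os.environ. fetch is injectable so the
    logic can be exercised without a network connection.
    """
    # Assumption: these variables are set inside the container when
    # API access is enabled for it.
    host = env.get("ARVADOS_API_HOST")
    token = env.get("ARVADOS_API_TOKEN")
    if not host or not token:
        return False, "ARVADOS_API_HOST or ARVADOS_API_TOKEN is not set"
    try:
        fetch(current_user_url(host), token)
    except (OSError, ValueError) as err:
        # URLError/HTTPError are OSError subclasses; bad JSON is a ValueError.
        return False, f"cannot reach API server at {host}: {err}"
    return True, f"API server at {host} is reachable"
```

Inside a container this could be called as check_api_access(os.environ), with the process exiting nonzero when the first element of the result is False.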


Subtasks: 1 (0 open, 1 closed)

Task #20873: Review 20612-diag-ctr-api-access (Resolved, Tom Clegg, 08/28/2023)
Actions #1

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
Actions #2

Updated by Peter Amstutz 9 months ago

  • Target version changed from Future to Development 2023-08-30
Actions #3

Updated by Peter Amstutz 9 months ago

  • Assigned To set to Tom Clegg
Actions #4

Updated by Tom Clegg 8 months ago

Currently the diagnostics command, by default, uploads and uses a tiny "hello-world" image embedded in the arvados-client binary, which is great because it doesn't depend on having docker on the client host, or having any images saved in arvados, etc. However, the hello-world image cannot make HTTP requests.

I tried to hack an absolutely minimal (15 MB) arvados-client binary image by inspecting the "ldd" output and copying the relevant library files into a "from scratch" image. I couldn't quite get it to work, and I was getting the impression that even if I could, it would be fragile and hard to troubleshoot/fix when it stops working. So I backed away from that. (It's possible a "minimal image from scratch" hack would work well if arvados-client is statically linked, but that doesn't seem like the right scenario to optimize for at this point.)
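
For reference, the "inspect the ldd output" step of that abandoned approach can be sketched in Python as below. This only illustrates the parsing, under the assumption that ldd prints the usual glibc-style lines ("name => /path (address)", or a bare absolute path for the dynamic loader); the function name shared_library_paths is hypothetical, and the fragility described above still applies.

```python
def shared_library_paths(ldd_output):
    """Extract absolute library paths from the text printed by ldd.

    Entries with no file on disk (for example the vdso, or libraries
    reported as "not found") are skipped.
    """
    paths = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if "=>" in fields:
            # Form: "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x...)"
            idx = fields.index("=>")
            if idx + 1 < len(fields) and fields[idx + 1].startswith("/"):
                paths.append(fields[idx + 1])
        elif fields and fields[0].startswith("/"):
            # Form: "/lib64/ld-linux-x86-64.so.2 (0x...)"
            paths.append(fields[0])
    return paths
```

Each returned path would then have to be copied into the "from scratch" image at the same location.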

Instead I added a -docker-image-from=X option (default debian:stable-slim) and an auto-build recipe that installs libfuse, ca-certificates, and the arvados-client binary (/proc/self/exe), so now you can run with:
  • -docker-image="" (default): auto-build and upload an image containing arvados-client itself, and run diagnostics from inside the test container (this also checks that the "internal client" detection works for the container process)
  • -docker-image=hello-world: run the "hello world" program using the embedded image
  • -docker-image=tag_or_pdh: run "echo {timestamp}" using an already-uploaded image
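
The three cases above amount to a small dispatch on the -docker-image value. A Python sketch of that decision follows, purely as an illustration: the real command is not written this way, and plan_container and its return strings are hypothetical.

```python
def plan_container(docker_image, timestamp):
    """Map a -docker-image value to (image source, action inside the container)."""
    if docker_image == "":
        # Default: build and upload an image that contains arvados-client.
        return ("auto-built image containing arvados-client",
                "run diagnostics inside the container")
    if docker_image == "hello-world":
        # Use the tiny image embedded in the arvados-client binary.
        return ("embedded hello-world image",
                "run the hello-world program")
    # Anything else is treated as a tag or portable data hash of an
    # image that has already been uploaded.
    return (f"existing image {docker_image}", f"echo {timestamp}")
```

Only the default case exercises API access from inside the container; the other two confirm only that a container can run.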
Assuming this seems reasonable so far, todo:

20612-diag-ctr-api-access @ c03ce6b41430afbe6afea76c9448f6895fd18781 -- developer-run-tests: #3800

(not bothering to retry wb1 integration tests)

Actions #5

Updated by Tom Clegg 8 months ago

  • Status changed from New to In Progress
Actions #6

Updated by Brett Smith 8 months ago

Tom Clegg wrote in #note-4:

20612-diag-ctr-api-access @ c03ce6b41430afbe6afea76c9448f6895fd18781 -- developer-run-tests: #3800

Couple minor nitpicky suggestions:

The help for -docker-image-from says: "(use a debian-based image similar this host's OS for best results)". I don't feel confident I understand what you mean here, which makes me think our users won't either. I also think it could be clearer that this image must be Debian-based for it to work at all. What other factors make an image "similar?" What's a good value for a cluster on Red Hat?

My understanding is it would still be best practice for the Dockerfile to include DEBIAN_FRONTEND=noninteractive when installing packages. I realize this kind of lore tends to outlive reality though, so if you know better let me know.

Thanks.

Actions #7

Updated by Peter Amstutz 8 months ago

  • Target version changed from Development 2023-08-30 to Development 2023-09-13 sprint
Actions #8

Updated by Tom Clegg 8 months ago

Added the "noninteractive" bit.
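
As an illustration of where the noninteractive setting goes, here is a Python sketch that generates a Dockerfile of the kind described in note 4. It is an approximation, not the recipe in the branch: the package names (in particular "libfuse2"), the COPY source and destination, and the function name diagnostics_dockerfile are all assumptions.

```python
def diagnostics_dockerfile(from_image, packages=("libfuse2", "ca-certificates")):
    """Return Dockerfile text for a Debian-based diagnostics image.

    The package names and the COPY paths are assumptions made for this
    sketch; adjust them to the base image actually in use.
    """
    install = (
        "RUN DEBIAN_FRONTEND=noninteractive apt-get update && "
        "DEBIAN_FRONTEND=noninteractive apt-get install --yes "
        "--no-install-recommends " + " ".join(packages)
    )
    lines = [
        f"FROM {from_image}",
        # The variable is set per command so that it does not persist
        # in the environment of the finished image.
        install,
        # Hypothetical paths: the binary is assumed to sit in the build context.
        "COPY arvados-client /arvados-client",
    ]
    return "\n".join(lines) + "\n"
```

Setting DEBIAN_FRONTEND inline on each apt-get call, rather than with an ENV instruction, keeps the setting out of the container's runtime environment.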

Added a section about image/container-running options to the doc page, and replaced the cryptic "for best results" note in the usage text with a link to the doc page.

20612-diag-ctr-api-access @ 27ae52da2c6bbe5ecd0bf2262b3f190597b7415c

This still doesn't directly address "what's a good value for a cluster on Red Hat", though. I suspect the answer is that the default also works fine on a current Red Hat or Rocky system. I could at least try that and, if it's true, mention it on the doc page to reduce the guesswork. How does that sound?

Actions #9

Updated by Brett Smith 8 months ago

Tom Clegg wrote in #note-8:

20612-diag-ctr-api-access @ 27ae52da2c6bbe5ecd0bf2262b3f190597b7415c

This looks good, thanks. Couple minor consistency things that would be nice to address but I don't need to review them:

  • "Docker image" is capitalized inconsistently throughout the text, IMO it should always be capital-D lowercase-i. This matches the official Docker documentation.
  • The first bullet point has an explanatory "so" clause after a comma, while the second bullet point has it in parentheses. Would be nice to make these consistent. IMO no parentheses is cleaner but I don't feel strongly about which way it's resolved.

This still doesn't directly address "what's a good value for a cluster on Red Hat", though. I suspect the answer is that the default also works fine on a current Red Hat or Rocky system. I could at least try that and, if it's true, mention it on the doc page to reduce the guesswork. How does that sound?

I don't think that's necessary, the question was mostly rhetorical to point out the kinds of questions raised by the old text. I agree that it seems like the current Debian-based strategy should still work on Red Hat as long as the required kernel similarities are met, and personally I'm comfortable moving ahead with that even without explicitly testing and documenting it.

Actions #10

Updated by Peter Amstutz 8 months ago

  • Release set to 66
Actions #11

Updated by Tom Clegg 8 months ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved