Feature #17609

arvados-client subcommand to run diagnostics on already installed cluster

Added by Peter Amstutz 14 days ago. Updated about 15 hours ago.

Status: New
Priority: Normal
Assigned To:
Category: Deployment
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

This is the list of tests we will run:

https://docs.google.com/spreadsheets/d/1--O03eo9-5gQYnP5eBti9a6E6ZYApM_lpnRsYZo9pqM/edit#gid=0

Once we have the list, we will include it in the arvados-client test:

  • Run the tests that can be run:
    • If config.yml is available, check it
    • If cypress can be run, run browser-based tests
  • Warn about what can be run / cannot be run
  • Put everything into a diagnostics project
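
The "run what can be run, warn about the rest" behaviour above could be sketched roughly as follows. This is only an illustration, not the planned implementation: `Check`, `run_checks`, and `cypress_missing` are hypothetical names, and the assumption that browser tests need a `cypress` executable on PATH is ours.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    """One diagnostic step (hypothetical structure, for illustration only)."""
    name: str
    run: Callable[[], None]  # raises an exception on failure
    # Returns a reason string if the check cannot run here, or None if it can.
    precondition: Optional[Callable[[], Optional[str]]] = None


def run_checks(checks: List[Check]) -> List[Tuple[str, str]]:
    """Run every check that can run and warn about the ones that cannot."""
    results = []
    for check in checks:
        reason = check.precondition() if check.precondition else None
        if reason is not None:
            # Warn explicitly about what is being skipped and why.
            print(f"Checking {check.name} ...SKIP ({reason})")
            results.append((check.name, "SKIP"))
            continue
        try:
            check.run()
        except Exception as err:
            # Be explicit about failures: include the error text.
            print(f"Checking {check.name} ...FAIL: {err}")
            results.append((check.name, "FAIL"))
        else:
            print(f"Checking {check.name} ...OK")
            results.append((check.name, "OK"))
    return results


def cypress_missing() -> Optional[str]:
    # Assumption: browser-based tests need a `cypress` executable on PATH.
    return None if shutil.which("cypress") else "cypress not found on PATH"
```

A browser test would then be registered as `Check("browser login", run_browser_test, cypress_missing)`, where `run_browser_test` stands in for whatever the real test turns out to be.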

Ward's 3 electric rails:

  • uploading through keepproxy
  • running workflows
  • properly configured keep-web
    • uploading via webdav
    • downloading via webdav and s3

Nico's tests:

  • Fetching discovery document / public config
  • Check hostnames, ports, certificates of service ExternalURL are valid
  • Check nginx geo section
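
As a rough sketch of the ExternalURL check, the snippet below parses a URL, then resolves the host, connects, and lets the standard TLS stack verify the certificate chain and hostname. The function names are hypothetical, and it assumes the service is expected to be reachable over TLS (https or wss) with a default port of 443.

```python
import socket
import ssl
from urllib.parse import urlsplit


def parse_external_url(url: str):
    """Return (host, port) for a TLS ExternalURL, or raise ValueError."""
    parts = urlsplit(url)
    # Assumption: only TLS-protected schemes are acceptable here.
    if parts.scheme not in ("https", "wss"):
        raise ValueError(f"{url}: expected https or wss, got {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"{url}: missing hostname")
    # parts.port itself raises ValueError for a malformed or out-of-range port.
    return parts.hostname, parts.port or 443


def check_tls(url: str, timeout: float = 5.0) -> None:
    """Resolve the host, connect, and verify certificate chain and hostname."""
    host, port = parse_external_url(url)
    context = ssl.create_default_context()  # verification is on by default
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host):
            pass  # handshake succeeded, so the certificate was accepted
```

`check_tls` raises `socket.gaierror` for an unresolvable hostname, `OSError` for a closed port, and `ssl.SSLCertVerificationError` for a bad certificate, so a caller can report each failure mode separately.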

Tom's modes:

  • User option to run assuming it is inside (check that things treat you as inside)
  • User option to run assuming it is outside (check that things treat you as outside)

Healthcheck:

  • Use healthcheck endpoints, see if some tests can be part of healthcheck
    • Any check that can be done as a healthcheck, probably should be
  • Needs management token
  • Use healthcheck aggregator
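
A health probe using the management token might look something like the sketch below. It assumes each service exposes a `/_health/ping` endpoint that accepts the management token as a Bearer token and answers with a JSON body of `{"health": "OK"}`; both details should be confirmed against the cluster's documentation, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_ping_request(base_url: str, management_token: str) -> urllib.request.Request:
    """Build (but do not send) a health ping request for one service."""
    # Assumption: the health endpoint lives at /_health/ping.
    url = base_url.rstrip("/") + "/_health/ping"
    headers = {"Authorization": f"Bearer {management_token}"}
    return urllib.request.Request(url, headers=headers)


def is_healthy(body: bytes) -> bool:
    """Return True only if the response body reports health OK."""
    try:
        # Assumption: a healthy service answers {"health": "OK"}.
        return json.loads(body).get("health") == "OK"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as unhealthy.
        return False
```

The request can be sent with `urllib.request.urlopen(req, timeout=5)` and the response body passed to `is_healthy`.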

Example output:

$ arvados-client diagnostics --inside
Checking connectivity to https://api.arvados.example.com ...OK
Checking TLS certificate on https://api.arvados.example.com ...FAIL

Guidelines:

  • Run arvados-server config-check as early as possible.
  • Provide a verbose mode that communicates as much as possible about what each test is trying to do.
  • Be very explicit about failures.

Related issues

Related to Arvados Epics - Story #16444: Improved error detection/reporting (Status: New, Start date: 09/30/2021, Due date: 10/31/2021)

History

#1 Updated by Peter Amstutz 14 days ago

  • Subject changed from Installed cluster diagnostic test to arvados-client subcommand to run diagnostics on already installed cluster

#2 Updated by Nico César 14 days ago

  • Assigned To set to Nico César

#3 Updated by Nico César 14 days ago

  • Description updated (diff)

#4 Updated by Peter Amstutz 1 day ago

  • Description updated (diff)

#5 Updated by Peter Amstutz 1 day ago

  • Related to Story #16444: Improved error detection/reporting added

#6 Updated by Peter Amstutz 1 day ago

  • Assigned To deleted (Nico César)

#7 Updated by Tom Clegg about 15 hours ago

  • Assigned To set to Tom Clegg
