Bug #22275

open

arv-cluster-activity crashes if your Prometheus settings are wrong

Added by Brett Smith 5 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version: -
Story points: -

Description

I don't expect the report to work with wrong Prometheus settings, but it should detect the problem and print a clear error message instead of this giant backtrace:

$ PROMETHEUS_HOST=[something wrong] arv-cluster-activity --start 2024-10-01 --end 2024-11-01 --cost-report-file costs.csv --html-report-file costs.html
INFO:root:Exporting workflow runs 0 - 66
INFO:root:Getting container hours time series
Traceback (most recent call last):
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/requests/models.py", line 974, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/brett/Scratch/activity/bin/arv-cluster-activity", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/main.py", line 193, in main
    f.write(reporter.html_report(since, to, args.exclude, args.include_workflow_steps))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/report.py", line 163, in html_report
    self.graphs[containers_graph] = self.collect_graph(since, to,
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/report.py", line 124, in collect_graph
    for series in get_metric_usage(self.prom_client, since, to, metric % self.cluster, resampleTo=resample_to):
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/prometheus.py", line 20, in get_metric_usage
    metric_data = prom.custom_query_range(metric,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/prometheus_api_client/prometheus_connect.py", line 453, in custom_query_range
    data = response.json()["data"]["result"]
           ^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/requests/models.py", line 978, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

Alternatively, if your host is right but your credentials are wrong, you get:

Traceback (most recent call last):
  File "/home/brett/Scratch/activity/bin/arv-cluster-activity", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/main.py", line 193, in main
    f.write(reporter.html_report(since, to, args.exclude, args.include_workflow_steps))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/report.py", line 163, in html_report
    self.graphs[containers_graph] = self.collect_graph(since, to,
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/report.py", line 124, in collect_graph
    for series in get_metric_usage(self.prom_client, since, to, metric % self.cluster, resampleTo=resample_to):
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/arvados_cluster_activity/prometheus.py", line 20, in get_metric_usage
    metric_data = prom.custom_query_range(metric,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brett/Scratch/activity/lib/python3.12/site-packages/prometheus_api_client/prometheus_connect.py", line 455, in custom_query_range
    raise PrometheusApiClientException(
prometheus_api_client.exceptions.PrometheusApiClientException: HTTP Status Code 403 (b'<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body>\r\n<center><h1>403 Forbidden</h1></center>\r\n<hr><center>nginx/1.18.0</center>\r\n</body>\r\n</html>\r\n')
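
One possible shape for the fix, as a rough sketch rather than a patch against the actual code: wrap the Prometheus call so that both failures above surface as a single short error, and have the entry point print it and exit non-zero. The names `PrometheusSettingsError`, `guarded_query`, and `run_and_report` are hypothetical and do not exist in arvados_cluster_activity. The default `client_errors` tuple is also an assumption: it relies on the requests JSON decode error deriving from `ValueError` and on connection failures deriving from `OSError`. To cover the 403 case, `PrometheusApiClientException` would need to be added to that tuple by the caller.

```python
import sys


class PrometheusSettingsError(Exception):
    """Hypothetical error: a Prometheus query failed, probably due to bad settings."""


def guarded_query(query, *args, client_errors=(ValueError, OSError), **kwargs):
    """Call query(*args, **kwargs) and turn client failures into one short error.

    client_errors is an assumed default; a real caller would likely pass
    (ValueError, OSError, PrometheusApiClientException) instead.
    """
    try:
        return query(*args, **kwargs)
    except client_errors as err:
        # Keep only the first line of the original message, truncated, so an
        # HTML error page from a proxy does not flood the terminal.
        lines = str(err).strip().splitlines()
        detail = lines[0][:120] if lines else "no details"
        raise PrometheusSettingsError(
            f"Prometheus query failed ({type(err).__name__}: {detail}). "
            "Check PROMETHEUS_HOST and your Prometheus credentials."
        ) from err


def run_and_report(build_report):
    """Run build_report(); print a one-line error and return 1 on bad settings."""
    try:
        build_report()
    except PrometheusSettingsError as err:
        print(f"arv-cluster-activity: error: {err}", file=sys.stderr)
        return 1
    return 0
```

Under these assumptions, `get_metric_usage` would call `guarded_query(prom.custom_query_range, metric, ...)` with the arguments it already passes, and `main` would return the result of something like `run_and_report`. Chaining with `from err` keeps the original exception available for debugging without showing it by default.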

#1

Updated by Brett Smith 5 months ago

  • Description updated