Bug #21258
lib/dispatchcloud TestDispatchToStubDriver is racy
Status: Resolved
Priority: Normal
Assigned To:
Category: Tests
Target version:
Story points: -
Release:
Release relationship: Auto
Description
Failure mode 2
This failure was seen in a run that had b015c9e45f2a81b7069e5ecde3e0e9e0c5c619fa merged.
dispatcher_test.go:297: c.Check(resp.Body.String(), check.Matches, `(?ms).*boot_outcomes{outcome="failure"} [^0].*`) ... value string = "" + ... "# HELP arvados_dispatchcloud_at_quota Flag indicating the cloud driver is reporting an at-quota condition.\n" + ... "# TYPE arvados_dispatchcloud_at_quota gauge\n" + ... "arvados_dispatchcloud_at_quota 0\n" + ... "# HELP arvados_dispatchcloud_boot_outcomes Boot outcomes by type.\n" + ... "# TYPE arvados_dispatchcloud_boot_outcomes counter\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"aborted\"} 7\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"disappeared\"} 5\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"failure\"} 0\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"success\"} 41\n" + ... "# HELP arvados_dispatchcloud_containers_allocated_not_started Number of containers allocated to a worker but not started yet (worker is booting).\n" + ... "# TYPE arvados_dispatchcloud_containers_allocated_not_started gauge\n" + ... "arvados_dispatchcloud_containers_allocated_not_started 0\n" + ... "# HELP arvados_dispatchcloud_containers_longest_wait_time_seconds Current longest wait time of any container since queuing, and before the start of crunch-run.\n" + ... "# TYPE arvados_dispatchcloud_containers_longest_wait_time_seconds gauge\n" + ... "arvados_dispatchcloud_containers_longest_wait_time_seconds 0\n" + ... "# HELP arvados_dispatchcloud_containers_not_allocated_over_quota Number of containers not allocated to a worker because the system has hit a quota.\n" + ... "# TYPE arvados_dispatchcloud_containers_not_allocated_over_quota gauge\n" + ... "arvados_dispatchcloud_containers_not_allocated_over_quota 0\n" + ... "# HELP arvados_dispatchcloud_containers_running Number of containers reported running by cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_containers_running gauge\n" + ... "arvados_dispatchcloud_containers_running 0\n" + ... 
"# HELP arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds Number of seconds between the queuing of a container and the start of crunch-run.\n" + ... "# TYPE arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds summary\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.5\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.9\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.95\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.99\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds_sum 1.9461314997763523e+12\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds_count 211\n" + ... "# HELP arvados_dispatchcloud_driver_operations Number of instance-create/destroy/list operations performed via cloud driver.\n" + ... "# TYPE arvados_dispatchcloud_driver_operations counter\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"Create\"} 53\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"Destroy\"} 53\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"List\"} 111\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"SetTags\"} 25\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"Create\"} 15\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"Destroy\"} 7\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"List\"} 0\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"SetTags\"} 0\n" + ... 
"# HELP arvados_dispatchcloud_instances_disappeared Number of occurrences of an instance disappearing from the cloud provider's list of instances.\n" + ... "# TYPE arvados_dispatchcloud_instances_disappeared counter\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"booting\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"idle\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"running\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"shutdown\"} 52\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"unknown\"} 0\n" + ... "# HELP arvados_dispatchcloud_instances_price Price of cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_instances_price gauge\n" + ... "arvados_dispatchcloud_instances_price{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"idle\"} 0.984\n" + ... "arvados_dispatchcloud_instances_price{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"unknown\"} 0\n" + ... "# HELP arvados_dispatchcloud_instances_run_probe_duration_seconds Number of seconds per runProbe call.\n" + ... "# TYPE arvados_dispatchcloud_instances_run_probe_duration_seconds summary\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.5\"} 0.000299431\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.9\"} 0.00055759\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.95\"} 0.00072938\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.99\"} 0.00126165\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_sum{outcome=\"fail\"} 0.04786191400000002\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_count{outcome=\"fail\"} 128\n" + ... 
"arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.5\"} 0.000334\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.9\"} 0.00065138\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.95\"} 0.00091094\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.99\"} 0.00174714\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_sum{outcome=\"success\"} 0.3429287439999997\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_count{outcome=\"success\"} 876\n" + ... "# HELP arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds Number of seconds between the first shutdown attempt and the disappearance of the worker.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds summary\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.5\"} 0.00741143\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.9\"} 0.01473868\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.95\"} 0.01628774\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.99\"} 0.01721516\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds_sum 0.40030032699999996\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds_count 52\n" + ... "# HELP arvados_dispatchcloud_instances_time_to_ready_for_container_seconds Number of seconds between the first successful SSH connection and ready to run a container.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_to_ready_for_container_seconds summary\n" + ... 
"arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.5\"} 0.009143631\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.9\"} 0.01485901\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.95\"} 0.01578203\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.99\"} 0.0195959\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds_sum 0.37931822099999996\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds_count 41\n" + ... "# HELP arvados_dispatchcloud_instances_time_to_ssh_seconds Number of seconds between instance creation and the first successful SSH connection.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_to_ssh_seconds summary\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.5\"} 0.0181355\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.9\"} 0.02653959\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.95\"} 0.02768423\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.99\"} 0.03185776\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds_sum 0.9493334640000001\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds_count 50\n" + ... "# HELP arvados_dispatchcloud_instances_total Number of cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_instances_total gauge\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type4\"} 0\n" + ... 
"arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type8\"} 1\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type4\"} 0\n" + ... 
"arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type8\"} 0\n" + ... "# HELP arvados_dispatchcloud_last_503_time Time of most recent 503 error received from API.\n" + ... "# TYPE arvados_dispatchcloud_last_503_time gauge\n" + ... "arvados_dispatchcloud_last_503_time 1.703100514e+09\n" + ... "# HELP arvados_dispatchcloud_max_concurrent_containers Dynamically assigned limit on number of containers scheduled concurrency, set after receiving 503 errors from API.\n" + ... "# TYPE arvados_dispatchcloud_max_concurrent_containers gauge\n" + ... "arvados_dispatchcloud_max_concurrent_containers 5\n" + ... "# HELP arvados_dispatchcloud_memory_bytes_total Total memory on all cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_memory_bytes_total gauge\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"idle\"} 8.589934592e+09\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"unknown\"} 0\n" + ... 
"# HELP arvados_dispatchcloud_probe_age_seconds_max Maximum number of seconds since an instance's most recent successful probe.\n" + ... "# TYPE arvados_dispatchcloud_probe_age_seconds_max gauge\n" + ... "arvados_dispatchcloud_probe_age_seconds_max 0.01719935\n" + ... "# HELP arvados_dispatchcloud_probe_age_seconds_median Median number of seconds since an instance's most recent successful probe.\n" + ... "# TYPE arvados_dispatchcloud_probe_age_seconds_median gauge\n" + ... "arvados_dispatchcloud_probe_age_seconds_median 0.01719935\n" + ... "# HELP arvados_dispatchcloud_vcpus_total Total VCPUs on all cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_vcpus_total gauge\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"idle\"} 8\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"unknown\"} 0\n" ... regex string = "(?ms).*boot_outcomes{outcome=\"failure\"} [^0].*" time="2023-12-20T19:28:35.182068484Z" level=info msg="instance disappeared in cloud" Instance="inst46,providertype8" WorkerState=shutdown OOPS: 11 passed, 1 FAILED --- FAIL: Test (1.25s) FAIL git.arvados.org/arvados.git/lib/dispatchcloud coverage: 75.1% of statements FAIL git.arvados.org/arvados.git/lib/dispatchcloud 1.271s FAIL
Failure mode 1
FAIL: dispatcher_test.go:160: DispatcherSuite.TestDispatchToStubDriver [way too many logs omitted, see attachment] dispatcher_test.go:296: c.Check(resp.Body.String(), check.Matches, `(?ms).*boot_outcomes{outcome="failure"} [^0].*`) ... value string = "" + ... "# HELP arvados_dispatchcloud_at_quota Flag indicating the cloud driver is reporting an at-quota condition.\n" + ... "# TYPE arvados_dispatchcloud_at_quota gauge\n" + ... "arvados_dispatchcloud_at_quota 0\n" + ... "# HELP arvados_dispatchcloud_boot_outcomes Boot outcomes by type.\n" + ... "# TYPE arvados_dispatchcloud_boot_outcomes counter\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"aborted\"} 9\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"disappeared\"} 4\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"failure\"} 0\n" + ... "arvados_dispatchcloud_boot_outcomes{outcome=\"success\"} 49\n" + ... "# HELP arvados_dispatchcloud_containers_allocated_not_started Number of containers allocated to a worker but not started yet (worker is booting).\n" + ... "# TYPE arvados_dispatchcloud_containers_allocated_not_started gauge\n" + ... "arvados_dispatchcloud_containers_allocated_not_started 0\n" + ... "# HELP arvados_dispatchcloud_containers_longest_wait_time_seconds Current longest wait time of any container since queuing, and before the start of crunch-run.\n" + ... "# TYPE arvados_dispatchcloud_containers_longest_wait_time_seconds gauge\n" + ... "arvados_dispatchcloud_containers_longest_wait_time_seconds 0\n" + ... "# HELP arvados_dispatchcloud_containers_not_allocated_over_quota Number of containers not allocated to a worker because the system has hit a quota.\n" + ... "# TYPE arvados_dispatchcloud_containers_not_allocated_over_quota gauge\n" + ... "arvados_dispatchcloud_containers_not_allocated_over_quota 0\n" + ... "# HELP arvados_dispatchcloud_containers_running Number of containers reported running by cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_containers_running gauge\n" + ... 
"arvados_dispatchcloud_containers_running 0\n" + ... "# HELP arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds Number of seconds between the queuing of a container and the start of crunch-run.\n" + ... "# TYPE arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds summary\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.5\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.9\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.95\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds{quantile=\"0.99\"} 9.223372036854776e+09\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds_sum 1.9830249879237712e+12\n" + ... "arvados_dispatchcloud_containers_time_from_queue_to_crunch_run_seconds_count 215\n" + ... "# HELP arvados_dispatchcloud_driver_operations Number of instance-create/destroy/list operations performed via cloud driver.\n" + ... "# TYPE arvados_dispatchcloud_driver_operations counter\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"Create\"} 62\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"Destroy\"} 62\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"List\"} 103\n" + ... "arvados_dispatchcloud_driver_operations{error=\"0\",operation=\"SetTags\"} 21\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"Create\"} 13\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"Destroy\"} 9\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"List\"} 0\n" + ... "arvados_dispatchcloud_driver_operations{error=\"1\",operation=\"SetTags\"} 0\n" + ... 
"# HELP arvados_dispatchcloud_instances_disappeared Number of occurrences of an instance disappearing from the cloud provider's list of instances.\n" + ... "# TYPE arvados_dispatchcloud_instances_disappeared counter\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"booting\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"idle\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"running\"} 0\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"shutdown\"} 62\n" + ... "arvados_dispatchcloud_instances_disappeared{state=\"unknown\"} 0\n" + ... "# HELP arvados_dispatchcloud_instances_price Price of cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_instances_price gauge\n" + ... "arvados_dispatchcloud_instances_price{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"idle\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_instances_price{category=\"unknown\"} 0\n" + ... "# HELP arvados_dispatchcloud_instances_run_probe_duration_seconds Number of seconds per runProbe call.\n" + ... "# TYPE arvados_dispatchcloud_instances_run_probe_duration_seconds summary\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.5\"} 0.00025819\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.9\"} 0.00049002\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.95\"} 0.00068864\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"fail\",quantile=\"0.99\"} 0.0016936\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_sum{outcome=\"fail\"} 0.04495730099999998\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_count{outcome=\"fail\"} 138\n" + ... 
"arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.5\"} 0.00025914\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.9\"} 0.00051182\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.95\"} 0.00081387\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds{outcome=\"success\",quantile=\"0.99\"} 0.001874889\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_sum{outcome=\"success\"} 0.2827972210000001\n" + ... "arvados_dispatchcloud_instances_run_probe_duration_seconds_count{outcome=\"success\"} 843\n" + ... "# HELP arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds Number of seconds between the first shutdown attempt and the disappearance of the worker.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds summary\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.5\"} 0.009525099\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.9\"} 0.016427759\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.95\"} 0.017818548\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds{quantile=\"0.99\"} 0.018422408\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds_sum 0.5827228839999999\n" + ... "arvados_dispatchcloud_instances_time_from_shutdown_request_to_disappearance_seconds_count 62\n" + ... "# HELP arvados_dispatchcloud_instances_time_to_ready_for_container_seconds Number of seconds between the first successful SSH connection and ready to run a container.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_to_ready_for_container_seconds summary\n" + ... 
"arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.5\"} 0.007698769\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.9\"} 0.018579038\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.95\"} 0.021335438\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds{quantile=\"0.99\"} 0.036827616\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds_sum 0.4455547920000001\n" + ... "arvados_dispatchcloud_instances_time_to_ready_for_container_seconds_count 49\n" + ... "# HELP arvados_dispatchcloud_instances_time_to_ssh_seconds Number of seconds between instance creation and the first successful SSH connection.\n" + ... "# TYPE arvados_dispatchcloud_instances_time_to_ssh_seconds summary\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.5\"} 0.017542368\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.9\"} 0.026203417\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.95\"} 0.028872557\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds{quantile=\"0.99\"} 0.033338397\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds_sum 1.1238655879999997\n" + ... "arvados_dispatchcloud_instances_time_to_ssh_seconds_count 60\n" + ... "# HELP arvados_dispatchcloud_instances_total Number of cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_instances_total gauge\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type4\"} 0\n" + ... 
"arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"booting\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"hold\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type2\"} 1\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"idle\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type4\"} 0\n" + ... 
"arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"inuse\",instance_type=\"type8\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type1\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type16\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type2\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type3\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type4\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type6\"} 0\n" + ... "arvados_dispatchcloud_instances_total{category=\"unknown\",instance_type=\"type8\"} 0\n" + ... "# HELP arvados_dispatchcloud_last_503_time Time of most recent 503 error received from API.\n" + ... "# TYPE arvados_dispatchcloud_last_503_time gauge\n" + ... "arvados_dispatchcloud_last_503_time 1.701292139e+09\n" + ... "# HELP arvados_dispatchcloud_max_concurrent_containers Dynamically assigned limit on number of containers scheduled concurrency, set after receiving 503 errors from API.\n" + ... "# TYPE arvados_dispatchcloud_max_concurrent_containers gauge\n" + ... "arvados_dispatchcloud_max_concurrent_containers 6\n" + ... "# HELP arvados_dispatchcloud_memory_bytes_total Total memory on all cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_memory_bytes_total gauge\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"idle\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_memory_bytes_total{category=\"unknown\"} 0\n" + ... 
"# HELP arvados_dispatchcloud_probe_age_seconds_max Maximum number of seconds since an instance's most recent successful probe.\n" + ... "# TYPE arvados_dispatchcloud_probe_age_seconds_max gauge\n" + ... "arvados_dispatchcloud_probe_age_seconds_max 0\n" + ... "# HELP arvados_dispatchcloud_probe_age_seconds_median Median number of seconds since an instance's most recent successful probe.\n" + ... "# TYPE arvados_dispatchcloud_probe_age_seconds_median gauge\n" + ... "arvados_dispatchcloud_probe_age_seconds_median 0\n" + ... "# HELP arvados_dispatchcloud_vcpus_total Total VCPUs on all cloud VMs.\n" + ... "# TYPE arvados_dispatchcloud_vcpus_total gauge\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"booting\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"hold\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"idle\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"inuse\"} 0\n" + ... "arvados_dispatchcloud_vcpus_total{category=\"unknown\"} 0\n" ... regex string = "(?ms).*boot_outcomes{outcome=\"failure\"} [^0].*"
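In both failure modes the check requires the `boot_outcomes{outcome="failure"}` counter to start with a digit other than 0, and in both dumps the counter is 0. The following is a minimal sketch of that behaviour in isolation, not code from the test suite: the pattern is copied from the failing check, while `sampleMetrics` and the two-line metrics excerpt it builds are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// bootFailurePattern is the pattern copied from the failing check.
// In Go's regexp syntax a "{" that does not begin a valid repetition
// is treated as a literal character, so no escaping is needed.
var bootFailurePattern = regexp.MustCompile(`(?ms).*boot_outcomes{outcome="failure"} [^0].*`)

// sampleMetrics is a hypothetical helper that builds a two-line excerpt
// of the metrics text with the given boot failure count.
func sampleMetrics(failures int) string {
	return fmt.Sprintf(
		"arvados_dispatchcloud_boot_outcomes{outcome=\"failure\"} %d\n"+
			"arvados_dispatchcloud_boot_outcomes{outcome=\"success\"} 41\n",
		failures)
}

func main() {
	// A count of 0 reproduces the reported failure; a nonzero count passes.
	for _, n := range []int{0, 3} {
		matched := bootFailurePattern.MatchString(sampleMetrics(n))
		fmt.Printf("failure=%d matches=%v\n", n, matched)
	}
}
```

Under these assumptions the pattern does not match when the counter is 0 and matches when it is 3, so the assertion only passes if at least one boot failure was recorded before the metrics were fetched.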