Story #13937

[keepstore] Move legacy metrics to prometheus /metrics endpoint

Added by Tom Clegg over 1 year ago. Updated 11 months ago.

Assigned To:
Target version:
Start date:
Due date:
% Done:


Estimated time:
(Total: 0.00 h)
Story points:
Release relationship:


Currently, keepstore exports some metrics in /status.json. These predate keepstore's prometheus /metrics endpoint.
  • BufferPool.BuffersAllocatedCumulative, .BuffersMax, .BuffersInUse
  • PullQueue.InProgress, .Queued
  • TrashQueue.InProgress, .Queued
  • RequestsCurrent
  • RequestsMax
  • Volumes[]
    • Label
    • Status.MountPoint, .DeviceNum, .BytesFree, .BytesUsed
    • VolumeStats.Errors, .Ops, .CompareOps, .GetOps, .PutOps, .TouchOps, .InBytes, .OutBytes
    • InternalStats.* (driver-dependent)

These should be exported as prometheus metrics instead.

There should be a transition period where metrics are exported at both /status.json and /metrics, so ops have an opportunity to migrate their monitoring/dashboard systems.


Task #14668: Review 13937-keepstore-prometheusResolvedLucas Di Pentima

Associated revisions

Revision 53843c61
Added by Lucas Di Pentima 11 months ago

Merge branch '13937-keepstore-prometheus'
Closes #13937

Arvados-DCO-1.1-Signed-off-by: Lucas Di Pentima <>


#1 Updated by Peter Amstutz over 1 year ago needs to be updated as well.

#2 Updated by Tom Morris over 1 year ago

  • Tracker changed from Bug to Story
  • Target version set to To Be Groomed

#3 Updated by Tom Morris about 1 year ago

  • Story points set to 2.0

#4 Updated by Tom Morris about 1 year ago

  • Target version changed from To Be Groomed to Arvados Future Sprints

#5 Updated by Tom Morris about 1 year ago

  • Target version changed from Arvados Future Sprints to 2019-01-16 Sprint

#6 Updated by Lucas Di Pentima about 1 year ago

  • Assigned To set to Lucas Di Pentima

#7 Updated by Lucas Di Pentima about 1 year ago

  • Status changed from New to In Progress

#8 Updated by Lucas Di Pentima about 1 year ago

  • Target version changed from 2019-01-16 Sprint to 2019-01-30 Sprint

#9 Updated by Lucas Di Pentima 12 months ago

  • Target version changed from 2019-01-30 Sprint to 2019-02-13 Sprint

#10 Updated by Lucas Di Pentima 12 months ago

Updates at f0553505e - branch 13937-keepstore-prometheus (WIP)

  • Exports almost all requested stats. Pending: InternalStats ErrorCodes (from statsTicker). Not sure how to implement this because I believe the number of keys would change at runtime and the metrics registry is populated once at setup time. Should I try to register all ErrorCodes keys whenever the /metrics endpoint is hit? Not sure if this is the correct approach.

Also pending: Docs updates & s3, azure driver stats

#11 Updated by Tom Clegg 12 months ago

I think CounterVec is what we're looking for here. TickErr() would look like this:


However, I'm not sure we can use this with the ConstLabels/NewGaugeFunc approach.

I'm not sure we should be using our existing ad hoc metrics as a shim here. I think it would be neater to create CounterVecs on the outside, curry them with the volume label, and pass them in to the volumes (perhaps an argument to Start()?) so the volumes can update the gauges/counters directly. Then we could delete the ad hoc metrics code entirely instead of keeping it around as an extra layer.

#12 Updated by Lucas Di Pentima 11 months ago

  • Target version changed from 2019-02-13 Sprint to 2019-02-27 Sprint

#13 Updated by Lucas Di Pentima 11 months ago

  • Story points changed from 2.0 to 1.0

#14 Updated by Lucas Di Pentima 11 months ago

Updates at f50aff88c

Refactored the previous approach to use CounterVec created within the theConfig.Start() call and passed with predefined labels to every volume .Start() call.
Also added a way for drivers to register their specific counters.

This runs ok on arvbox, but there're several tests that fail with segfaults. I would like to be sure that this approach is correct and if I should check for the metrics to be non-nil before trying to .Inc() them or some pointer on how I should be fixing those tests, as some of them don't call Start() on the volume being tested, so the metrics aren't initialized.

#15 Updated by Tom Clegg 11 months ago

This is better but I think we need to use CounterVec more to reduce complexity/repetitiveness.

For example, rather than implement our own curryWith() and getInternalCounter(), the router could have just three CounterVecs for volume metrics:
  • CounterVec "volume_operations" with labels {device_id, operation}
  • CounterVec "volume_errors" with labels {device_id, error_type}
  • CounterVec "volume_io_bytes" with labels {device_id, direction}

When starting a volume, instead of passing the prometheus registry, we would only need to pass those two CounterVecs. (We could almost curry the first label and pass a one-label vector to Start() -- however, at least in Azure's case, it looks like DeviceID() doesn't work correctly before Start() is called.)

Then, Start() would look something like this:

func (v *vol) Start(opsCounters, errCounters, ioCounters *prometheus.CounterVec) error {
        // ...
        lbls := prometheus.Labels{"device_id": v.DeviceID()}
        v.opsCounters = opsCounters.MustCurryWith(lbls)
        v.errCounters = errCounters.MustCurryWith(lbls)
        v.ioCounters = ioCounters.MustCurryWith(lbls)

Then, to use the counters:

        v.errCounters.With(prometheus.Labels{"type": errType}).Inc()
        v.ioCounters.With(prometheus.Labels{"direction": "in"}).Add(n)

I think we should just drop the "bytes used/free" gauges. Prometheus already includes a node_exporter that provides those metrics for local filesystems, and we only report dummy numbers for cloud storage volumes.

The metrics should count backend volume operations, not volume func calls. For example, instead of counting "Compare" calls, we should count "read object" and "write object". In the case of (*UnixVolume)Get(), the call stack goes Get → getWithPipe → ReadBlock → getFunc → (*osWithStats)Open → os.Open -- and then ...getFunc → countingReadWriter. We only need to count os.Open calls and I/O traffic. Rather than add new func call metrics, we just need to update the points where we already count. Something like

-        return fn(NewCountingReader(ioutil.NopCloser(f), v.os.stats.TickInBytes))
+        return fn(NewCountingReader(ioutil.NopCloser(f), v.ioCounters.With(prometheus.Labels{"direction": "in"}).Add))
-        o.stats.Tick(&o.stats.OpenOps)
+        o.opsCounters.With(prometheus.Labels{"operation": "open"}).Inc()

I don't think we need to add vector dimensions/labels for device number, volume name, etc.

#16 Updated by Lucas Di Pentima 11 months ago

  • Target version changed from 2019-02-27 Sprint to 2019-03-13 Sprint

#17 Updated by Lucas Di Pentima 11 months ago

Updates at 92d1b6e05
Test run:

Following above suggestions I completed the implementations on unix, s3 & azure volume drivers.
If it looks good, I'll continue with tests.

#18 Updated by Tom Morris 11 months ago

  • Release set to 15

#19 Updated by Tom Clegg 11 months ago

Yes, I think this is looking good, thanks.

Nit: it looks like we could pass around a single volumeMetricsVecs object in Start() calls (and the statsTicker struct) instead of repeating the list of errCounters, opsCounters, ioBytes.

Prometheus recommends naming metrics with the unit at the end, so maybe
  • bufferpool_bytes_allocated → bufferpool_allocated_bytes
  • bufferpool_buffers_max → bufferpool_max_buffers
  • bufferpool_buffers_in_use → bufferpool_inuse_buffers
  • %s_queue_in_progress → %s_queue_inprogress_entries
  • %s_queue_queued → %s_queue_pending_entries
  • requests_current → concurrent_requests
  • requests_max → max_concurrent_requests

(I'm looking at the node_exporter metrics for examples, but not sure how much time we really want to spend tweaking names...)

#20 Updated by Lucas Di Pentima 11 months ago

Updates at 8660244c5
Test run:

  • Addressed simplification & naming suggestions above.
  • Added tests for opsCounters & ioBytes on the generic volume tests suite.

#21 Updated by Tom Clegg 11 months ago

I'm not sure about the "any" counter. It causes sum(ops) to double-count the total backend ops. Even though that's not necessarily a meaningful number, I wonder if we can solve the testing problem without affecting the real metrics at all. CounterVec doesn't let us enumerate label values directly, but perhaps we can export all metrics from a registry and parse it just enough to get the list of counters the driver is using -- something like ?

It seems like GetMetricsVecs() isn't really needed -- shouldn't testMetrics() be testing that the volume is updating vm.ioBytes using {"device_id": v.DeviceID()}, instead of asking the volume which CounterVecs it's updating?

#22 Updated by Lucas Di Pentima 11 months ago

Updates at f4b7e7bf5
Test run:

Addressed above suggestions:
  • Don't pass around data unnecessarily
  • Don't contaminate the real metrics & get read/write operation label values from TestableVolume specific funcs.

#23 Updated by Tom Clegg 11 months ago

LGTM, thanks!

#24 Updated by Lucas Di Pentima 11 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Also available in: Atom PDF