Bug #10543

implement approximate (estimated) counts for API list method

Added by Joshua Randall almost 3 years ago. Updated almost 2 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version:
Start date: 11/16/2016
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Implement a `count=estimate` option for API list queries that returns an estimated (approximate) row count in `items_available`, rather than the exact count or no count at all (the `none` option introduced in #9998).

Postgres has a simple way of getting an approximate row count for an entire table very quickly, and a somewhat more involved way of getting an approximate count for more sophisticated queries (https://wiki.postgresql.org/wiki/Count_estimate), which should still be much faster than a full table scan.
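
A rough sketch of both approaches, in Python for illustration only (this is not the Arvados API server code). It assumes a DB-API 2.0 cursor such as the one psycopg2 provides; the helper names `parse_estimated_rows` and `estimate_count` are hypothetical. The whole-table figure comes from `pg_class.reltuples`, and the per-query figure is read from the `rows=` value of the top node in `EXPLAIN` output, as the wiki page describes. Both depend on planner statistics being reasonably fresh (VACUUM/ANALYZE).

```python
import re

# Whole-table estimate: the planner's stored row count for a table.
# The table name is passed as a bound parameter by the caller.
WHOLE_TABLE_ESTIMATE_SQL = (
    "SELECT reltuples::bigint FROM pg_class WHERE relname = %s"
)

# Matches the planner's row estimate in a plan line such as
# "Seq Scan on collections  (cost=0.00..458.00 rows=7323212 width=244)".
_ROWS_RE = re.compile(r"\brows=(\d+)")


def parse_estimated_rows(plan_lines):
    """Return the first rows= estimate found (the top plan node), or None."""
    for line in plan_lines:
        match = _ROWS_RE.search(line)
        if match:
            return int(match.group(1))
    return None


def estimate_count(cursor, query, params=None):
    """Return the planner's estimated row count for `query`.

    `cursor` is assumed to be a DB-API 2.0 cursor. `query` is appended to
    EXPLAIN verbatim, so it must be trusted SQL built by the server itself;
    user-supplied values belong in `params`.
    """
    cursor.execute("EXPLAIN " + query, params)
    # Each row of EXPLAIN output is a one-column tuple holding one plan line.
    return parse_estimated_rows(row[0] for row in cursor.fetchall())
```

Because plain `EXPLAIN` only plans the query without executing it, the cost is roughly independent of table size, at the price of an answer that is only as good as the statistics.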

This could be used anywhere only an approximate count is needed. That could include:
- to populate a UI that displays the number of pages available rather than the count
- to populate a UI that displays the number of items available in approximate terms (e.g. instead of showing "Data Collections (7323212)", Workbench could say "Data Collections (7.3M)")
- to create an appropriately sized data structure to accommodate all the data (e.g. to set the collection map size at the beginning of a keep-balance run, which already uses 110% of the returned value)
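
To illustrate how a client might consume an estimated `items_available` in each of these cases, here is a minimal Python sketch. All three helper names are hypothetical, the page size is an arbitrary example value, and the 110% headroom simply mirrors the keep-balance behaviour mentioned above (keep-balance itself is not written in Python).

```python
def page_count(estimated_items, page_size):
    """Number of pages needed to show `estimated_items`, rounding up."""
    return -(-estimated_items // page_size)


def abbreviate_count(n):
    """Render a count in approximate terms, e.g. 7323212 as '7.3M'."""
    for threshold, suffix in ((10**9, "B"), (10**6, "M"), (10**3, "k")):
        if n >= threshold:
            return f"{n / threshold:.1f}{suffix}"
    return str(n)


def capacity_with_headroom(estimated_items, percent=110):
    """Initial capacity as a percentage of the estimate, rounded up.

    Integer arithmetic avoids floating-point surprises such as
    100 * 1.1 evaluating to slightly more than 110.
    """
    return (estimated_items * percent + 99) // 100


estimate = 7323212
print(page_count(estimate, 100))         # 73233
print(abbreviate_count(estimate))        # 7.3M
print(capacity_with_headroom(estimate))  # 8055534
```

None of these uses is sensitive to an error of a few percent in the estimate, which is why an approximate count is sufficient for them.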

History

#1 Updated by Tom Morris almost 2 years ago

  • Target version set to Arvados Future Sprints
