Feature #17994

closed

[api] storage class fields should be supported in filters

Added by Ward Vandewege over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assigned To: Tom Clegg
Category: -
Target version: 2021-09-01 sprint
Story points: -
Release relationship: Auto

Description

The storage class fields `storage_classes_confirmed` and `storage_classes_desired` are currently not supported as filter attributes (e.g. for use with the CLI tools).

It would be useful to change that. This would allow an admin to get a list of collections that are confirmed (or desired) for a particular storage class. Such a list can be used as input to the `deduplication-report`, so that report could then be generated for a particular (set of) storage class(es).

This would also make it possible to create a filter group for a specific (set of) storage class(es).


Subtasks 1 (0 open, 1 closed)

Task #18043: Review 17994-filter-by-storage-classes (Resolved, Tom Clegg, 08/27/2021)

Related issues

Related to Arvados - Feature #17993: [deduplication-report] supports storage classes (New)
Related to Arvados - Idea #17697: Design for reporting tools to determine what data is on multiple storage classes. (Resolved, Ward Vandewege)
Related to Arvados - Feature #17995: [api] add method to get collections where replication_confirmed < replication_desired (Resolved, Tom Clegg, 08/27/2021)
Blocks Arvados Epics - Idea #16107: Storage classes (Resolved, 03/01/2021 to 09/30/2021)
Actions #1

Updated by Ward Vandewege over 2 years ago

  • Description updated (diff)
  • Subject changed from [api] storage class fields should be supported in our filters to [api] storage class fields should be supported in filters
Actions #2

Updated by Ward Vandewege over 2 years ago

  • Related to Feature #17993: [deduplication-report] supports storage classes added
Actions #3

Updated by Ward Vandewege over 2 years ago

  • Related to Idea #17697: Design for reporting tools to determine what data is on multiple storage classes. added
Actions #4

Updated by Ward Vandewege over 2 years ago

Actions #5

Updated by Peter Amstutz over 2 years ago

  • Target version set to 2021-09-01 sprint
  • Assigned To set to Tom Clegg
Actions #6

Updated by Tom Clegg over 2 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Tom Clegg over 2 years ago

17994-filter-by-storage-classes @ 902f8cd258a8dfec749a7f94d478a4027e319750 -- developer-run-tests: #2649

So far this is a minimal implementation: it accepts filters like [["storage_classes_desired", "=", "[\"default\"]"]]. Note that the operand is the JSON representation, as it's stored in the database.

But we probably want these, too:
  • [["storage_classes_desired", "=", ["default"]]] (alternative syntax equivalent to "[\"default\"]")
  • [["storage_classes_desired", "contains", ["default"]]] (matches ["foo","default"] as well as exact match ["default"])
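To make the operand encoding in the minimal implementation concrete, here is a small sketch of building such an exact-match filter on the client side. The helper name `exact_match_filter` is hypothetical, and the sketch only covers the single-element case shown above; the exact text form the server expects for longer lists is not specified in this thread.

```python
import json


def exact_match_filter(attr, classes):
    """Build an "=" filter whose operand is the JSON text of the list.

    Hypothetical helper: it mirrors the minimal implementation described
    above, where the operand is a JSON string rather than a native list.
    """
    return [[attr, "=", json.dumps(classes)]]


filters = exact_match_filter("storage_classes_desired", ["default"])

# The filters parameter itself is sent as JSON, so the operand ends up
# JSON-encoded twice (hence the escaped quotes).
encoded = json.dumps(filters)
print(encoded)
```

The double encoding is what makes the first syntax awkward to write by hand, which is the motivation for accepting a plain list as the operand.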

Currently, https://doc.arvados.org/main/api/methods.html uses ["foo", "contains", "bar"] as the example for "contains", which is a bit misleading, since "contains" only works if the first element is "attr.key", where "attr" is a jsonb object column and "key" is a top-level key in the json object. (I think we should change "foo" to "properties.foo" to make this clearer.)

I'm thinking we could extend that to match json objects/arrays at the top level too, so
  • ["properties", "contains", ["foo", "bar"]] matches a record with {"foo": 1, "bar": 2, "baz": 3}
  • ["storage_classes_desired", "contains", ["foo", "bar"]] matches a record with ["bar", "foo", "default"].
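The intended matching rule can be modelled in a few lines of Python. This is only an illustration of the semantics proposed above, not the API server's code; the function name `contains` and the handling of non-object, non-array values are assumptions.

```python
def contains(column_value, operand):
    """Model the proposed top-level "contains" filter.

    column_value: the decoded JSON value of the column (dict or list).
    operand: a list of strings that must all be present.

    For a JSON object, "present" means "is a top-level key".
    For a JSON array, "present" means "is an element", in any order.
    """
    if isinstance(column_value, dict):
        present = column_value.keys()
    elif isinstance(column_value, list):
        present = column_value
    else:
        # Assumption: scalars and nulls never match.
        return False
    return all(item in present for item in operand)


# The two examples from the list above:
print(contains({"foo": 1, "bar": 2, "baz": 3}, ["foo", "bar"]))
print(contains(["bar", "foo", "default"], ["foo", "bar"]))
```

Under this model, an exact match is a special case: ["default"] contains ["default"], and so does ["foo", "default"].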

The storage_classes_* fields aren't indexed. Practically speaking this might be okay -- there are typically very few classes with lots of collections in each, and if a condition matches a large portion of the table, an index doesn't save much time.

Actions #8

Updated by Peter Amstutz over 2 years ago

  • Related to Feature #17995: [api] add method to get collections where replication_confirmed < replication_desired added
Actions #9

Updated by Tom Clegg over 2 years ago

17994-filter-by-storage-classes @ 402e69f6e55dce4e11d354c3ca708b8e536c124b -- developer-run-tests: #2651

  • accepts ["storage_classes_confirmed", "contains", ["key1", "key2", ...]] (works on any jsonb column)
  • accepts ["storage_classes_confirmed", "contains", "key1"]
  • reverts adding storage_classes_* to searchable_attributes on collection model (this caused "any" to try to match those columns, which seems undesirable and would require migrating the huge multi-column table index)
  • accepts "=", "<>", "!=" operators on jsonb columns even if they aren't in searchable_attributes. This makes it possible to do exact matches on storage_classes_*, which could be useful for degenerate cases like a single-element array or an empty properties object.

If this seems like the right behavior I'll need to update the API methods docs.
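For reference, a client could use the new filter roughly as follows to get the list of collections for one storage class. This is a sketch under assumptions: `api` is expected to be a client object from the Arvados Python SDK (for example the result of `arvados.api("v1")`), the helper name `collections_in_storage_class` is hypothetical, and only the first page of results is returned.

```python
def collections_in_storage_class(api, storage_class, confirmed=True):
    """Return UUIDs of collections confirmed (or desired) for a class.

    api: an Arvados API client object (assumed, e.g. arvados.api("v1")).
    storage_class: a single storage class name, passed as a string
        operand to "contains" as described in the list above.
    """
    attr = "storage_classes_confirmed" if confirmed else "storage_classes_desired"
    filters = [[attr, "contains", storage_class]]
    response = api.collections().list(filters=filters, select=["uuid"]).execute()
    # Only the first page is read here; a real report would need paging.
    return [item["uuid"] for item in response["items"]]


# Usage (requires a configured Arvados client; "archive" is a made-up
# storage class name):
#   import arvados
#   uuids = collections_in_storage_class(arvados.api("v1"), "archive")
```

A list of UUIDs like this is the kind of input the description has in mind for the `deduplication-report`.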

Actions #11

Updated by Ward Vandewege over 2 years ago

Tom Clegg wrote:

17994-filter-by-storage-classes @ be900941bb4ab286cbeb02f65509be938726d67e -- developer-run-tests: #2662

The developer-run-tests-apps-workbench-integration tests failed, so I kicked them off again at developer-run-tests-apps-workbench-integration: #2823 /console. That run failed as well, so I ran them once more at developer-run-tests-apps-workbench-integration: #2824 /console, which finally passed.

LGTM, thanks!

Actions #12

Updated by Tom Clegg over 2 years ago

  • Status changed from In Progress to Resolved
Actions #13

Updated by Peter Amstutz over 2 years ago

  • Release set to 42