Idea #21452
closed
Option to apply arbitrary filters to arv-mount listings
Added by Brett Smith 11 months ago.
Updated 11 months ago.
Release relationship:
Auto
Description
Immediate use case: User wants arv-mount to not list intermediate and log collections, to match the way Workbench filters out these collections from project listings by default.
Provide a command line option that lets the user specify arbitrary filters for all project listings (and potentially other listings like searches by tag, although that could be a follow-up story). Include these filters whenever making a listing API call.
- At least accept JSON on the command line. Ideally, also accept a file path, and read JSON from there if given.
- Accept filters written in the groups.contents syntax, where subjects can be written with a type specifier like groups.owner_uuid or collections.properties. The implementation will need to parse these out and pass appropriate filters to different API calls, because it does not actually call groups.contents, but accepting this complexity in the implementation is worth it to make life easier for users (and future us supporting them).
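To illustrate the second bullet, here is a minimal sketch of how type-qualified filters could be split up per API call. It is not the code from the branch: split_filters, filters_for, and the KNOWN_TABLES set are hypothetical names, and the properties.type filter is only an assumed example of excluding intermediate and log collections.

```python
# Hypothetical set of resource types that a subject may be prefixed with.
KNOWN_TABLES = frozenset({"collections", "groups", "users", "links"})


def split_filters(filters, known_tables=KNOWN_TABLES):
    """Group filters by the resource type named in their subject.

    Returns a dict mapping a type name to its filters, with the type prefix
    stripped from each subject. Filters without a recognized prefix are
    stored under the key None and are meant to apply to every listing call.
    """
    by_table = {}
    for subject, operator, operand in filters:
        prefix, dot, rest = subject.partition(".")
        if dot and prefix in known_tables:
            table, attribute = prefix, rest
        else:
            table, attribute = None, subject
        by_table.setdefault(table, []).append([attribute, operator, operand])
    return by_table


def filters_for(table, by_table):
    """Return the filters to send with a listing call for one resource type."""
    return by_table.get(None, []) + by_table.get(table, [])


# Assumed example input: hide intermediate and log collections.
example = split_filters([
    ["collections.properties.type", "not in", ["intermediate", "log"]],
    ["groups.owner_uuid", "=", "zzzzz-tpzed-000000000000000"],
])
collection_filters = filters_for("collections", example)
```

Unqualified subjects such as name or properties.type fall through to every call in this sketch, which is one possible reading of the groups.contents convention.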
- Target version changed from Future to Development 2024-02-14 sprint
- Assigned To set to Brett Smith
- Description updated (diff)
- Subject changed from Option to filter collections (and other objects?) from arv-mount listings to Option to apply arbitrary filters to arv-mount listings
- Status changed from New to In Progress
21452-fuse-filters @ d704ced53ec06c1af67ef99bba6a20096056a67c - developer-run-tests: #4029 - The only failure is the CRAN download failures we've been seeing more of recently.
Since this is a big ball of new code, here's what you're looking at:
- sdk/python/arvados/commands/_util.py gets code to parse and validate JSON filters as discussed in the first bullet.
- sdk/python/tests/test_cmd_util.py gets new tests for that code.
- services/fuse/arvados_fuse/command.py adds the argument using that supporting code, and then passes the value to directories it constructs.
- services/fuse/arvados_fuse/fusedir.py has two types of changes: it also passes the filters value to superclasses and other new directories it constructs; and it adds the filters to its API queries. Honestly, it felt easier for me to add the filters everywhere, rather than try to spend the brain cells figuring out exactly which API calls "needed" them vs. not. Therefore this also provides the parenthetical about tag searching, etc.
- services/fuse/tests/mount_test_base.py gets updates to accommodate the new filters argument when building directories.
- services/fuse/tests/test_mount_filters.py has tests for the new functionality. It's a whole test matrix considering a variety of factors: which directory structure is tested (all changed classes are tested); test listing directories vs. checking files directly (the update vs. __getitem__ methods in those classes); and a variety of filters. There are also end-to-end integration tests to test that filters propagate correctly, including one for the specific filters on collection properties that motivates this feature.
- New API test fixtures support the new FUSE tests.
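For readers who have not opened the branch, the "JSON or file path" argument handling described above can be sketched roughly like this. This is an illustration under assumptions, not the code in _util.py: load_json_or_path, validate_filters, filters_argument, and build_parser are hypothetical names, and only the --filters option name comes from this ticket.

```python
import argparse
import json


def load_json_or_path(arg):
    """Parse arg as JSON; if that fails, read JSON from the file it names."""
    try:
        return json.loads(arg)
    except json.JSONDecodeError:
        with open(arg, encoding="utf-8") as json_file:
            return json.load(json_file)


def validate_filters(filters):
    """Check that filters is a list of [subject, operator, operand] lists."""
    if not isinstance(filters, list):
        raise ValueError("filters are not a list")
    for index, item in enumerate(filters):
        if not (isinstance(item, list) and len(item) == 3):
            raise ValueError(f"filter at index {index} is not a 3-item list")
        subject, operator, _operand = item
        if not (isinstance(subject, str) and isinstance(operator, str)):
            raise ValueError(f"filter at index {index} needs a string subject and operator")
    return filters


def filters_argument(arg):
    """argparse type callable that reports problems as usage errors."""
    try:
        return validate_filters(load_json_or_path(arg))
    except (OSError, ValueError) as error:
        # json.JSONDecodeError is a ValueError subclass, so it is covered here.
        raise argparse.ArgumentTypeError(str(error)) from None


def build_parser():
    """Build a stand-in parser; the real arv-mount parser has many more options."""
    parser = argparse.ArgumentParser(prog="mount-sketch")
    parser.add_argument("--filters", type=filters_argument, default=[])
    return parser
```

One known rough edge of trying JSON first: a file whose name happens to be valid JSON (for example a file named 123) would be parsed as JSON rather than opened.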
Checklist:
- All agreed upon points are implemented / addressed.
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- N/A, even did the stretch goals
- Code is tested and passing, both automated and manual, what manual testing was done is described
- Documentation has been updated.
- I added some --help documentation for the option. There doesn't seem to be any other good place to document this, unfortunately, which is rough because it could really use some example filters in a place more visible than the tests. The closest place is the CLI SDK reference, but it doesn't cover arv-mount at all since it's not callable from arv. This should probably be a follow-up ticket, actually.
- Behaves appropriately at the intended scale (describe intended scale).
- Doesn't really change scale, just adds filters to existing API calls. If anything this is probably a scale improvement by giving users a way to slim down FUSE directories to make their sizes more manageable to other tools.
- Considered backwards and forwards compatibility issues between client and server.
- No changes to existing APIs.
- Follows our coding standards and GUI style guidelines.
code LGTM
Regarding documentation, any reason not to document the filters option on https://doc.arvados.org/v2.7/user/tutorials/tutorial-keep-mount-gnu-linux.html ? I know it's sort of labeled "tutorial" but it's effectively the main documentation page for this tool?
That page is linked at the bottom of https://doc.arvados.org/v2.7/sdk/python/arvados-fuse.html but the link isn't very obvious, and the SDK page would benefit from making it more prominent.
Per discussion in Matrix and standup: IMO that page isn't "sort of labeled 'tutorial,'" it is clearly part of the user guide that's meant to be read more or less linearly (skipping sections you don't care about). This is the first real section of the guide, talking about data management before we get into any detail about what the Arvados API is or what you can do with it. At this point the reader has no frame of reference to understand the --filters option, and trying to shoehorn it in here any way would be very confusing and make our user guide worse.
Made a follow-up story #21504. Peter made the good point that it would be good to have text somewhere that we can put in a better-organized place later. I will add a --filters reference blurb in a comment there while it's fresh in my head.
- Status changed from In Progress to Resolved