Story #15781

[WB2] Support multiple values for a given property key

Added by Tom Morris 11 months ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench2
Start date:
01/30/2020
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
4.0
Release relationship:
Auto

Description

Backend

https://www.postgresql.org/docs/9.5/functions-json.html

arvados_development=# select '{"a":["foo","bar"]}'::jsonb @> '{"a":["bar"]}'::jsonb;
 ?column? 
----------
 t

So jsonb subset match works if you know the key and are looking for an exact match on the value.

It looks like an Arvados filter of [["properties.a", "=", ["bar"]] might already work by since it uses @> in the implementation. Although that's arguably a bug since the expectation is to get an exact match. Instead we'll introduce a new operator [["properties.a", "contains", "bar"].

Frontend UI

For editing, rather than overwriting a tag:value pair as is done currently when the tag is specified multiple times, add the new tag:value pair to the list. When writing to the backend, merge the values into an array which becomes value that's written into the properties. Arrays are only used when there are multiple values.

For searching the implementation will use the new contains operator internally, but continue to use the "has" notation externally.


Subtasks

Arvados - Task #16030: Review 15781-multi-value-property-search (backend, arvados repo)ResolvedPeter Amstutz

Task #16112: Review 15781-multi-value-property-edit (wb2 repo)ResolvedPeter Amstutz


Related issues

Has duplicate Arvados Workbench 2 - Bug #15774: Advanced Search editor keeps adding search terms instead of replacingDuplicate

Associated revisions

Revision 64e387b2
Added by Lucas Di Pentima 7 months ago

Merge branch '15781-multi-value-property-search'

Refs #15781

Revision dd315a23
Added by Lucas Di Pentima 7 months ago

Merge branch '15781-multi-value-property-edit'

Refs #15781

History

#1 Updated by Peter Amstutz 11 months ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 11 months ago

  • Description updated (diff)

#3 Updated by Peter Amstutz 11 months ago

  • Description updated (diff)

#4 Updated by Peter Amstutz 11 months ago

  • Description updated (diff)

#5 Updated by Tom Morris 11 months ago

  • Description updated (diff)
  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 4.0

#6 Updated by Peter Amstutz 9 months ago

  • Release set to 20
  • Target version deleted (Arvados Future Sprints)

#7 Updated by Lucas Di Pentima 9 months ago

  • Target version set to 2020-01-29 Sprint
  • Assigned To set to Lucas Di Pentima
  • Category set to Workbench2

#8 Updated by Lucas Di Pentima 8 months ago

  • Status changed from New to In Progress

#9 Updated by Lucas Di Pentima 8 months ago

  • Target version changed from 2020-01-29 Sprint to 2020-02-12 Sprint

#10 Updated by Lucas Di Pentima 8 months ago

Updates at commit:8a88ceb33 - branch 15781-multi-value-property-search (arvados repo)
Test run: https://ci.arvados.org/job/developer-run-tests/1711/

  • Adds contains subproperty filter to the API server
  • Adds tests checking that the new operator works on scalar and array subproperty values and also that the = filter doesn't work on array subproperty elements.
  • Adds documentation update.

#11 Updated by Lucas Di Pentima 8 months ago

Updates for workbench2 at commit 41658ace - branch 15781-multi-value-property-edit

  • Adds support for multi-value properties editing on collections and projects
  • Adds support for multi-value properties on the advanced search editor
  • Uses 'contains' filter when searching for properties

#12 Updated by Lucas Di Pentima 8 months ago

  • Has duplicate Bug #15774: Advanced Search editor keeps adding search terms instead of replacing added

#13 Updated by Peter Amstutz 8 months ago

Lucas Di Pentima wrote:

Updates at commit:8a88ceb33 - branch 15781-multi-value-property-search (arvados repo)
Test run: https://ci.arvados.org/job/developer-run-tests/1711/

  • Adds contains subproperty filter to the API server
  • Adds tests checking that the new operator works on scalar and array subproperty values and also that the = filter doesn't work on array subproperty elements.
  • Adds documentation update.

The documentation for "contains" operation is not very clear. Suggest something like:

Filter where subproperty has a value either by exact match or value is element of subproperty list. For example, ["foo", "containes", "bar"] will find both {"foo": "bar"} and {"foo": ["bar", "baz"]}.

#14 Updated by Lucas Di Pentima 8 months ago

Peter Amstutz wrote:

Lucas Di Pentima wrote:

Updates at commit:8a88ceb33 - branch 15781-multi-value-property-search (arvados repo)

[...]

The documentation for "contains" operation is not very clear. Suggest something like:

Filter where subproperty has a value either by exact match or value is element of subproperty list. For example, ["foo", "containes", "bar"] will find both {"foo": "bar"} and {"foo": ["bar", "baz"]}.

Fixed at commit:dd7da09fe

#15 Updated by Peter Amstutz 8 months ago

Lucas Di Pentima wrote:

Updates for workbench2 at commit 41658ace - branch 15781-multi-value-property-edit

  • Adds support for multi-value properties editing on collections and projects
  • Adds support for multi-value properties on the advanced search editor
  • Uses 'contains' filter when searching for properties

In the advanced search popup, removing property filters doesn't remove them from the query bar.

If you attempt delete a property and the update fails, it doesn't restore the property in the display.

If you do a federated search on clusters that don't support 'contains' you get back errors. Should increment to increment the API version and test for it, and decide if client is able to use 'contains' or needs to fall back on '=' or whatever it used before.

#16 Updated by Lucas Di Pentima 8 months ago

  • Target version changed from 2020-02-12 Sprint to 2020-02-26 Sprint

#17 Updated by Lucas Di Pentima 8 months ago

Update at commit:8141f4c40 - branch 15781-multi-value-property-search (arvados repo)

  • Bumps the API revision number to indicate the new contains filter operator to clients.

#18 Updated by Lucas Di Pentima 8 months ago

Update at commit ecd7fabb - branch 15781-multi-value-property-edit (wb2 repo)
Test run: https://ci.arvados.org/view/Developer/job/developer-tests-workbench2/12/

  • Adds API revision number (from discovery document) to sessions, defaulting to zero.
  • Uses 'contains' or 'ilike' when searching properties depending on API revision number.
  • Fixes error handling on properties add/remove operations.
  • Fixes property removal on advanced search editor.
  • Fixes property Chip UI duplication on advanced search editor.

#19 Updated by Peter Amstutz 7 months ago

Lucas Di Pentima wrote:

Update at commit ecd7fabb - branch 15781-multi-value-property-edit (wb2 repo)
Test run: https://ci.arvados.org/view/Developer/job/developer-tests-workbench2/12/

  • Uses 'contains' or 'ilike' when searching properties depending on API revision number.

Will 'contains' ignore case? Or should it fall back to 'like' instead of 'ilike'?

#20 Updated by Lucas Di Pentima 7 months ago

Updates at commit:7a7447ac7
Test run: https://ci.arvados.org/job/developer-run-tests-services-api/1779/

As workbench2 was previously using 'ilike', I preferred to give 'contains' case-insensitive matching powers. Added a couple of tests for this.

#21 Updated by Lucas Di Pentima 7 months ago

I've run PG's EXPLAIN on both types of queries, and it seems that my solution isn't the best as it forces a sequential scan:

Before

arvados_test=# explain select uuid from collections where collections.properties @> '{"listprop":"ELEM6"}'::jsonb;
                                          QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 Bitmap Heap Scan on collections  (cost=12.00..16.01 rows=1 width=27)
   Recheck Cond: (properties @> '{"listprop": "ELEM6"}'::jsonb)
   ->  Bitmap Index Scan on collection_index_on_properties  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: (properties @> '{"listprop": "ELEM6"}'::jsonb)

After

arvados_test=# explain select uuid from collections where lower(collections.properties::text)::jsonb @> lower('{"listprop":"ELEM6"}')::jsonb;
                                    QUERY PLAN                                    
----------------------------------------------------------------------------------
 Seq Scan on collections  (cost=0.00..42.15 rows=1 width=27)
   Filter: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb)

Maybe there's an index we can add? I'll look into it. The plan B would be to make wb2 use like as a fallback, but that would change previous behavior that users may already be using. This might not be too disruptive as wb2 is still in beta.

#22 Updated by Lucas Di Pentima 7 months ago

Adding an expression GIN index makes the difference:

arvados_test=# create index coll_props_lower_2 on collections using gin ((lower(properties::text)::jsonb));
CREATE INDEX
arvados_test=# explain select uuid from collections where (lower(collections.properties::text)::jsonb) @> lower('{"listprop":"ELEM6"}')::jsonb;
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Bitmap Heap Scan on collections  (cost=12.00..16.03 rows=1 width=27)
   Recheck Cond: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb)
   ->  Bitmap Index Scan on coll_props_lower_2  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb)
(4 rows)

#23 Updated by Lucas Di Pentima 7 months ago

After talking about the case [in]sensitive search topic with Peter, we concluded that for now, workbench2 will support case-sensitive search on properties to avoid having a second index.

Just in case we need to make the contains filter operator to do case-insensitive matches, here's the diff:

diff --git a/services/api/lib/record_filters.rb b/services/api/lib/record_filters.rb
index 5688ca614..b3bf055d6 100644
--- a/services/api/lib/record_filters.rb
+++ b/services/api/lib/record_filters.rb
@@ -141,7 +141,7 @@ module RecordFilters
             end
             param_out << proppath
           when 'contains'
-            cond_out << "#{attr_table_name}.#{attr} @> ?::jsonb OR #{attr_table_name}.#{attr} @> ?::jsonb" 
+            cond_out << "lower(#{attr_table_name}.#{attr}::text)::jsonb @> lower(?)::jsonb OR lower(#{attr_table_name}.#{attr}::text)::jsonb @> lower(?)::jsonb" 
             param_out << SafeJSON.dump({proppath => operand})
             param_out << SafeJSON.dump({proppath => [operand]})
           else

#24 Updated by Lucas Di Pentima 7 months ago

Updates at commit:1d922bf21 undo the last 2 commits, and just adds a couple of tests to confirm that contains does case-sensitive matching only.
Test run: https://ci.arvados.org/job/developer-run-tests-services-api/1784/

#25 Updated by Lucas Di Pentima 7 months ago

Updates at commit e8d27222 (wb2 repo) makes the search fallback to use like instead of ilike for property matching.
Test run: https://ci.arvados.org/view/Developer/job/developer-tests-workbench2/13/

#26 Updated by Peter Amstutz 7 months ago

If we're just doing exact matches, shouldn't the fallback query be

["groups.properties.IDTAGANIMALS","=","IDVALANIMALS2"]

not

["groups.properties.IDTAGANIMALS","like","%IDVALANIMALS2%"]

Doesn't using "like" force a sequential scan?

#27 Updated by Peter Amstutz 7 months ago

 Bitmap Heap Scan on collections  (cost=164.25..282.31 rows=32 width=87) (actual time=0.614..0.615 rows=1 loops=1)
   Recheck Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on collection_index_on_properties  (cost=0.00..164.24 rows=32 width=0) (actual time=0.607..0.607 rows=1 loops=1)
         Index Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb)
 Planning time: 0.137 ms
 Execution time: 0.675 ms
 Seq Scan on collections  (cost=0.00..3241.22 rows=3 width=87) (actual time=9.591..11.891 rows=1 loops=1)
   Filter: ((properties ->> 'IDTAGANIMALS'::text) ~~ '%IDVALANIMALS2%'::text)
   Rows Removed by Filter: 32114
 Planning time: 0.132 ms
 Execution time: 11.910 ms

#28 Updated by Peter Amstutz 7 months ago

 Bitmap Heap Scan on collections  (cost=328.52..559.55 rows=65 width=87) (actual time=1.135..1.135 rows=1 loops=1)
   Recheck Cond: ((properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb) OR (properties @> '{"IDTAGANIMALS": ["IDVALANIMALS2"]}'::jsonb))
   Heap Blocks: exact=1
   ->  BitmapOr  (cost=328.52..328.52 rows=65 width=0) (actual time=1.125..1.125 rows=0 loops=1)
         ->  Bitmap Index Scan on collection_index_on_properties  (cost=0.00..164.24 rows=32 width=0) (actual time=0.569..0.569 rows=1 loops=1)
               Index Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb)
         ->  Bitmap Index Scan on collection_index_on_properties  (cost=0.00..164.24 rows=32 width=0) (actual time=0.555..0.555 rows=0 loops=1)
               Index Cond: (properties @> '{"IDTAGANIMALS": ["IDVALANIMALS2"]}'::jsonb)
 Planning time: 0.165 ms
 Execution time: 1.182 ms

#29 Updated by Peter Amstutz 7 months ago

The wb2 filter builder automatically adds leading and trailing '%' wildcard with 'addLike'.

#30 Updated by Peter Amstutz 7 months ago

Container requests also have jsonb properties, that should be included in queryToFilters.

#31 Updated by Lucas Di Pentima 7 months ago

Above comments fixed at commit 5b0a7e37. Thanks!

#32 Updated by Peter Amstutz 7 months ago

Peter Amstutz wrote:

The wb2 filter builder automatically adds leading and trailing '%' wildcard with 'addLike'.

I realize it's not clear, that's just a note, not necessarily something to fix, although it is a little bit surprising.

#33 Updated by Peter Amstutz 7 months ago

LGTM now, thanks!

#34 Updated by Lucas Di Pentima 7 months ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF