Feature #15429

[Workbench] Use trigram based full text search.

Added by Eric Biagiotti 5 months ago. Updated 29 days ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
10/15/2019
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5
Release relationship:
Auto

Description

Since we plan on deprecating @@, as an available API list method filter, we should switch Workbench 1 full text search to use the trigram index based ["any", "ilike", "%%"] filter.


Subtasks

Task #15620: Review 15429-wb1-use-trigram-searchResolvedLucas Di Pentima


Related issues

Related to Arvados - Story #15430: [API] Remove the @@ list method filterNew

Associated revisions

Revision beb4aebe
Added by Lucas Di Pentima about 1 month ago

Merge branch '15429-wb1-use-trigram-search'
Closes #15429

History

#1 Updated by Eric Biagiotti 5 months ago

  • Related to Story #15430: [API] Remove the @@ list method filter added

#2 Updated by Tom Morris 4 months ago

  • Target version set to To Be Groomed

#3 Updated by Tom Morris 3 months ago

  • Story points set to 0.5

#4 Updated by Tom Morris 3 months ago

  • Target version changed from To Be Groomed to Arvados Future Sprints

#5 Updated by Tom Morris 2 months ago

  • Target version changed from Arvados Future Sprints to 2019-09-25 Sprint

#6 Updated by Lucas Di Pentima 2 months ago

  • Assigned To set to Lucas Di Pentima

#7 Updated by Lucas Di Pentima about 2 months ago

  • Target version changed from 2019-09-25 Sprint to 2019-10-09 Sprint

#8 Updated by Tom Morris about 1 month ago

  • Target version changed from 2019-10-09 Sprint to 2019-10-23 Sprint

#9 Updated by Lucas Di Pentima about 1 month ago

  • Status changed from New to In Progress

#10 Updated by Lucas Di Pentima about 1 month ago

  • Release set to 22

#11 Updated by Lucas Di Pentima about 1 month ago

Updates at b93675913 - branch 15429-wb1-use-trigram-search
Test run: https://ci.curoverse.com/job/developer-run-tests/1590/

  • Replaced to_tsquery() with to_tsquery_filters() that adds a ["any", "ilike", "%term%"] for every "word" on the user search query.
  • No test updates needed as integration tests don't care how the search is done.

Some clarification needed: On to_tsquery()'s comment there were some cases documented stating that underscore chars should be treated as non-word, but from my manual tests it seems that isn't the case with the current regular expression. Not sure if the comments were old or the code buggy and I should add that feature to the new function.

#12 Updated by Tom Clegg about 1 month ago

Since it's no longer doing tsquery things, to_tsquery_filters() doesn't seem like the best name -- maybe ilike_filters()?

""." is a word char in FT queries" → "...ilike queries"

comment "It returns null if..." → "It returns [] if..." (good change!)

to_tsquery_filters(q).forEach(function(f) { filters.push(f) }) ...could be just filters = filters.concat(to_tsquery_filters(q)) like the line above

#13 Updated by Lucas Di Pentima about 1 month ago

Updates at 13ac0cadc
Test run: https://ci.curoverse.com/job/developer-run-tests/1593/

Addressed the above comments and also renamed the to_tsquery.js file to ilike_filters.js

#15 Updated by Tom Clegg about 1 month ago

Lucas Di Pentima wrote:

Some clarification needed: On to_tsquery()'s comment there were some cases documented stating that underscore chars should be treated as non-word, but from my manual tests it seems that isn't the case with the current regular expression. Not sure if the comments were old or the code buggy and I should add that feature to the new function.

I don't think it matters: the purpose of defining "word chars" in the previous code was to match the postgres full-text query behavior. The new code just has to behave sensibly with "ilike". I suppose we can either treat an underscore as a word char and leave it alone (so it matches any single char) or escape it with a backslash (so it matches a literal underscore). I'd guess that most users will expect it to match a literal underscore, but the wildcard option also seems reasonable.

#16 Updated by Tom Clegg about 1 month ago

LGTM @ 13ac0cadc

#18 Updated by Tom Clegg about 1 month ago

LGTM, thanks

#19 Updated by Lucas Di Pentima about 1 month ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF