Idea #14611

closed

[Epic] Site-wide search for text, filenames, data

Added by Tom Clegg over 5 years ago. Updated about 5 years ago.

Status: Duplicate
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Arvados has had a "site-wide search" feature, but it often fails to meet users' expectations:
  • Full-text search doesn't find exact strings (#13508) and doesn't index all filenames in large collections (#13752, #14560).
  • Substring search is slow, and doesn't index full rows (this is why full-text search was added).
  • No facility at all for searching file contents.

We may be able to use PostgreSQL's full-text search to address everything short of searching file contents, with a bit more work on our side: use a dictionary/language other than English, create a table of filenames instead of searching a huge text field containing a list of filenames, etc.
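
As a rough illustration of the PostgreSQL idea, the sketch below keeps the SQL in Python strings. The table name `collection_files`, the index name, and the helper `filename_rows` are hypothetical and not part of the Arvados schema; the `%s` placeholder assumes a psycopg-style driver. It uses PostgreSQL's built-in `simple` text search configuration, which skips English stemming and stop-word removal, and stores one row per filename so that no single indexed value has to hold an entire file list.

```python
# Hypothetical schema: one row per file, rather than one large text field
# per collection holding every filename.
CREATE_TABLE_SQL = """
CREATE TABLE collection_files (
    collection_uuid varchar(255) NOT NULL,
    filename text NOT NULL
)
"""

# Expression index using the built-in 'simple' configuration, which
# lowercases tokens but applies no English stemming or stop-word removal.
CREATE_INDEX_SQL = """
CREATE INDEX collection_files_fts_idx
    ON collection_files
    USING gin (to_tsvector('simple', filename))
"""

# The WHERE expression matches the index expression so the planner can use it.
# %s is a driver-side placeholder (psycopg style, an assumption).
SEARCH_SQL = """
SELECT DISTINCT collection_uuid
  FROM collection_files
 WHERE to_tsvector('simple', filename) @@ plainto_tsquery('simple', %s)
"""


def filename_rows(collection_uuid, filenames):
    """Return one (collection_uuid, filename) row per non-empty filename."""
    return [(collection_uuid, name) for name in filenames if name]
```

With a psycopg-style cursor, the rows could be loaded with `executemany` and searched with `cur.execute(SEARCH_SQL, (query,))`. How the PostgreSQL parser tokenizes a given filename would still need to be checked against the exact-string cases in #13508.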

Another approach would be to use a separate tool to index/search the database, and apply Arvados permissions to those results. This could conceivably index file contents as well as database rows.
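
A minimal sketch of the "separate index, then apply permissions" flow follows. `ExternalIndex` is a toy in-memory inverted index standing in for whatever external tool might be chosen, and `readable_uuids` stands in for the set of objects the requesting user may read; both are hypothetical and do not reflect any existing Arvados API.

```python
import re
from collections import defaultdict


class ExternalIndex:
    """Toy stand-in for a separate search tool (an in-memory inverted index)."""

    def __init__(self):
        # token -> set of object UUIDs whose indexed text contains that token
        self._postings = defaultdict(set)

    def add(self, uuid, text):
        """Index the text (a database row, filenames, or file contents)."""
        for token in re.findall(r"\w+", text.lower()):
            self._postings[token].add(uuid)

    def search(self, query):
        """Return UUIDs matching every token in the query, ignoring permissions."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


def permitted_results(index, query, readable_uuids):
    """Search the external index, then keep only hits the user may read."""
    return sorted(index.search(query) & set(readable_uuids))
```

Filtering after the search keeps the index itself permission-agnostic, at the cost of fetching hits the user may not be allowed to see; a real design would need to weigh that against filtering inside the search tool.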


Related issues

Related to Arvados - Idea #13508: Fix postgres search for filenames (Duplicate)
Related to Arvados - Bug #14560: [1.3.0] error: ERROR: string is too long for tsvector (2299194 bytes, max 1048575 bytes) (Resolved, Tom Clegg)
Related to Arvados - Bug #6382: [Workbench] Searching through a collection using regex should accept $ instead of \n (Closed, 06/22/2015)
Is duplicate of Arvados - Feature #14573: [Spike] [API] Fully functional filename search (Resolved, Peter Amstutz)