Bug #4523

Updated by Tom Clegg almost 7 years ago

To reproduce:

Go to https://workbench.qr1hi.arvadosapi.com/, enter any term in the search box (I used "hash") and click the search button. The search modal box appears and displays a spinning wheel. After some time the request fails with "Oops, request failed."

Proposed fix:
* Add database indexes to all tables:
** one on owner_uuid
** one on the full set of searchable columns (i.e., @ModelClass.searchable_columns("<") | ModelClass.searchable_columns("like")@)
** (phase2) one on each searchable column
* To avoid undetected additions of columns without indexes in the future, add a unit test that iterates over all model classes and tests:
** there is a multi-column index on the full set of searchable columns
** there is an index on owner_uuid and an index on uuid
** (phase2) there is an index on each searchable column
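The checks in the proposed unit test could be sketched roughly as follows. This is plain Ruby, not the actual test: @index_problems@ is a hypothetical helper, and the @indexes@ argument (an array of column-name arrays, one per index) is an assumed stand-in for whatever the database adapter reports for a table. The sketch also assumes that "an index on owner_uuid" means a single-column index and that column order within the multi-column index does not matter.

```ruby
# Hypothetical helper for illustration; not part of the Arvados code base.
# searchable_columns: array of column names, e.g. the union
#   ModelClass.searchable_columns("<") | ModelClass.searchable_columns("like")
# indexes: array of arrays of column names, one inner array per index
#   (assumed shape).
def index_problems(searchable_columns, indexes)
  problems = []

  # Each of uuid and owner_uuid needs its own single-column index (assumption).
  %w[uuid owner_uuid].each do |col|
    problems << "no index on #{col}" unless indexes.include?([col])
  end

  # One index must cover exactly the full set of searchable columns.
  # Column order is ignored here (assumption).
  wanted = searchable_columns.sort
  unless indexes.any? { |cols| cols.sort == wanted }
    problems << "no multi-column index on searchable columns"
  end

  problems
end

# Example: a table with all three required indexes reports no problems.
good = [["uuid"], ["owner_uuid"], ["owner_uuid", "name", "uuid"]]
puts index_problems(%w[uuid owner_uuid name], good).inspect
```

A real test would loop over all model classes and fail when the returned list is non-empty, so that a new searchable column without index coverage is caught immediately.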

Fix phase 3 (see subtask #4900): We want "search for filename" to be fast, but manifest_text is too big to index. Therefore we want to add a new column of type string with file names only, truncated to the allowed size.
* Add a file_names column to collections table (@:string, length: 2**16@).
* Add a before_save hook to update the file_names attribute whenever manifest_text changes. If a manifest has more than 64KiB worth of filenames, just truncate the file_names attribute. (Saving the manifest will still work, but some filenames won't be searchable.)
** Example manifest_text: @". d41d8[...] 0:0:file1.txt 0:0:file2.txt\n"@
** Example file_names: @"file1.txt\nfile2.txt\n"@
* Use file_names instead of manifest_text when searching for "any" in the collections table.
* Omit text columns from searchable_columns.
* Add tests:
** file_names content is correct after create, and after update.
** Searching by filename works.
** Using a manifest with 1MiB worth of filenames (which can be built reasonably quickly like this):
*** <pre>. d41d8[...] 0:0:longlonglongfilename0000001 0:0:longlonglongfilename0000002 ...\n</pre>
*** Creating a collection with this manifest does not throw an error
*** A search for longlonglongfilename0000001 includes the collection that was just created
** The collection API response (create/update/get/index) does not provide a file_names attribute.
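As a minimal sketch of the phase 3 idea, the file name extraction and the large test manifest might look like the plain Ruby below. The names @file_names_from_manifest@, @big_manifest@, @FILE_NAMES_LIMIT@ and @FILE_TOKEN@ are hypothetical, and the sketch assumes that file tokens always have the form @position:size:name@, that the first token on each line is the stream name, and that the limit is counted in characters. The block locator is left elided as @d41d8[...]@, exactly as in the examples above, so these manifests are for illustration only.

```ruby
# Hypothetical sketch; not the actual Arvados implementation.
FILE_NAMES_LIMIT = 2**16 # matches the proposed column length

# File tokens look like "position:size:name" (assumed format).
FILE_TOKEN = /\A\d+:\d+:(.+)\z/

# Collect one file name per line of output, then truncate to the limit.
def file_names_from_manifest(manifest_text, limit = FILE_NAMES_LIMIT)
  names = +""
  manifest_text.each_line do |stream|
    # Skip the stream name; the rest are block locators and file tokens.
    stream.chomp.split(" ").drop(1).each do |token|
      match = FILE_TOKEN.match(token)
      names << match[1] << "\n" if match
    end
  end
  # Truncation means names past the limit are not searchable.
  names[0, limit]
end

# Build a single-stream manifest with roughly target_bytes of file names.
def big_manifest(target_bytes = 2**20)
  count = target_bytes / "longlonglongfilename0000001\n".length
  tokens = (1..count).map { |i| format("0:0:longlonglongfilename%07d", i) }
  ". d41d8[...] #{tokens.join(' ')}\n"
end

puts file_names_from_manifest(". d41d8[...] 0:0:file1.txt 0:0:file2.txt\n")
```

In the real model this logic would run from the before_save hook whenever manifest_text changes. Because truncation keeps the start of the list, a search for longlonglongfilename0000001 can still match the large test collection.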
