Project

General

Profile

Actions

Bug #22052

closed

Exclude container_image column from container_requests trigram index

Added by Brett Smith 5 months ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Story points:
-
Release:
Release relationship:
Auto

Description

This is a follow-up of #21815, which said:

Looking at this, I think it would be much better if all uuid fields and the portable data hash were excluded.

Tasks:

  1. exclude portable_data_hash and any field ending _uuid from full_text_searchable_columns

The container/request columns listed here all contain portable data hashes and should be excluded for the same reasons. However, nobody caught this during the work on #21815.

We should do this before the release of Arvados 3.0 because rebuilding these indexes can take a long time for large clusters. If we merge a fix now, we can have the new migration sort of "replace" the old one, and users only need to do it once. If we wait, and choose to fix this oversight later, they'll have to pay for the expensive migration twice.


Subtasks 1 (0 open1 closed)

Task #22054: Review 22052-container-columns-excluded-fulltextResolvedBrett Smith08/23/2024Actions

Related issues 1 (0 open1 closed)

Related to Arvados - Feature #21815: Exclude identifiers from trigram searchResolvedBrett Smith06/13/2024Actions
Actions #1

Updated by Brett Smith 5 months ago

  • Related to Feature #21815: Exclude identifiers from trigram search added
Actions #2

Updated by Brett Smith 5 months ago

  • Subject changed from Exclude container_image, log, output columns from full text search to Exclude container_image column from container_requests trigram index

We don't make a full text index on containers, so the log and output columns are non-issues. We just need to fix container_image on container_requests.

Actions #3

Updated by Peter Amstutz 5 months ago

  • Release set to 70

Yes, this is a good idea.

The main challenge is going to be fiddling with the previous migration so it doesn't have to re-index the database twice.

Changing the model can change the semantics of the migration retroactively (this has been a problem before) so that is something to watch out for.

Actions #4

Updated by Brett Smith 5 months ago

22052-container-columns-excluded-fulltext @ 47614a97f5e3d6ccdd7d13e8c43fb9ea9dbc3a30 - developer-run-tests: #4403 - The Workbench test failure seems to be unrelated, it is also affecting main: run-tests-workbench2: #447

  • All agreed upon points are implemented / addressed.
    • Yes
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • N/A
  • Code is tested and passing, both automated and manual, what manual testing was done is described
    • Note that adding "container_image" to full_text_excluded_columns updates test "UUID and hash columns are excluded from #{model} full text index" in ArvadosModelTest. Confirmed by hand the test was failing post-update before I extended the regexp in ArvadosModel.full_text_searchable_columns.
  • Documentation has been updated.
    • N/A. There is already an upgrade note highlighting this migration. Since this is organized in a way that users will only run it once, there's nothing more to document.
  • Behaves appropriately at the intended scale (describe intended scale).
    • Organized the migration such that developers will get it but users will only need to rebuild the index once when they upgrade to Arvados 3.0.
  • Considered backwards and forwards compatibility issues between client and server.
    • No change in server/client compatibility
  • Follows our coding standards and GUI style guidelines.
    • Yes
Actions #5

Updated by Tom Clegg 5 months ago

LGTM, thanks. Tried the migration with and without the previous migration in place, both worked.

Actions #6

Updated by Brett Smith 5 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF