Project

General

Profile

Actions

Bug #21205

closed

ensure_unique_name logic is not robust enough

Added by Peter Amstutz about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Story points:
-
Release relationship:
Auto

Description

User reported "duplicate key value violates unique constraint index_collections_on_owner_and_name".

This error occurred from arvados-cwl-runner creating a collection called "Intermediate collection".

This case does supply the ensure_unique_name flag.

However, it seems that between running 100s of workflows, and this particular step doing a 24-way scatter, that it attempting to create ~10 collections per second (see screenshot), and that, while < 1% chance, it is evidently possible for two or more collections to be created at the same time and for save_with_unique_name! to fail enough times in a row that one of those collection create operations fails with a uniqueness error.

From discussion:

Instead of adding a timestamp, we should add ~10 random alphanumeric digits (similar to the logic used for UUIDs). As demonstrated, timestamps are prone to collision, and using millisecond precision that means we are limited to creating 1000 collections per second.


Files


Subtasks 1 (0 open1 closed)

Task #21208: Review 21205-ensure-uniqueResolvedPeter Amstutz11/29/2023Actions
Actions #1

Updated by Peter Amstutz about 1 year ago

  • Status changed from New to In Progress
Actions #2

Updated by Peter Amstutz about 1 year ago

  • Status changed from In Progress to New
  • Category set to API
Actions #4

Updated by Peter Amstutz about 1 year ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz about 1 year ago

  • Assigned To set to Peter Amstutz
Actions #6

Updated by Peter Amstutz about 1 year ago

  • Target version changed from Development 2023-11-29 sprint to Development 2024-01-03 sprint
Actions #8

Updated by Peter Amstutz about 1 year ago

  • Status changed from New to In Progress
Actions #9

Updated by Peter Amstutz about 1 year ago

21205-ensure-unique @ c3872e7a1c817cd39b702f694f70d34f28f7f472

developer-run-tests: #3929

  • All agreed upon points are implemented / addressed.
    • uses random characters instead of a timestamp to make the names unique
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • n/a
  • Code is tested and passing, both automated and manual, what manual testing was done is described
    • existing tests passing
  • Documentation has been updated.
    • n/a
  • Behaves appropriately at the intended scale (describe intended scale).
    • should help at large scale as timestamp collisions are no longer possible
  • Considered backwards and forwards compatibility issues between client and server.
    • the number of characters replaced at the end are less than before, so the arvados.util.trim_name method is still valid (it just trims a bit more than it needs to).
  • Follows our coding standards and GUI style guidelines.
    • yes

When creating a new record, this should always succeed -- in the unlikely event that a record already existed with the same uuid, it would roll a new uuid.

When un-trashing a record, it could still fail if there was another record that already had the same name as what the ensure_unique_name logic assigns (this would involve adding the last 15 characters of uuid for the record that is going to be un-trashed). However, I cannot think of any normal usage scenario where this would come up, it would only happen under a contrived circumstance designed to cause a collision.

Actions #10

Updated by Peter Amstutz about 1 year ago

  • Release set to 67
Actions #12

Updated by Lucas Di Pentima about 1 year ago

  • File services/api/app/models/arvados_model.rb
    • Line 506: "the uuid caused a name collision" comment makes me think that save_with_unique_name! is intended to include the uuid's last chars on the name by default and not as a fallback behavior, but add_uuid_to_make_unique is only called on the fallback code block, so not sure which is the valid one. Personally I think that adding the last UUID chars by default is the best option for performance reasons.
    • If the above was indeed meant to include the UUID chars by default, I think we need a specific test exercising that new behavior.
  • File services/api/test/unit/container_request_test.rb
    • Line 1134: "...plus the date" comment is now outdated

The rest LGTM.

Actions #13

Updated by Peter Amstutz about 1 year ago

Lucas Di Pentima wrote in #note-12:

  • File services/api/app/models/arvados_model.rb
    • Line 506: "the uuid caused a name collision" comment makes me think that save_with_unique_name! is intended to include the uuid's last chars on the name by default and not as a fallback behavior, but add_uuid_to_make_unique is only called on the fallback code block, so not sure which is the valid one. Personally I think that adding the last UUID chars by default is the best option for performance reasons.
    • If the above was indeed meant to include the UUID chars by default, I think we need a specific test exercising that new behavior.

The code retains the preexisting behavior, which only changes the name when there is a collision. You're right, in situations where we're creating a lot of collections using the same base name, it would avoid going through the fail-and-retry every time.

On the other hand, in situations where a collision is unlikely (but we want to ensure the collection creation succeeds even when there is a collision), it would still always add extra text to the end of the name, which might not be desirable. It is a tradeoff. One solution would be to add a new option with the behavior that it always adds the extra text -- but at the moment we don't have any indication that the fail-and-retry logic is actually a performance problem.

  • File services/api/test/unit/container_request_test.rb
    • Line 1134: "...plus the date" comment is now outdated

Fixed.

The rest LGTM.

Thanks.

Actions #14

Updated by Peter Amstutz about 1 year ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF