Concurrent writes to a single collection

Background:
Currently, if a client uses concurrent WebDAV PUT requests to write many files to a single collection, the resulting collection is guaranteed to contain all of the uploaded files only if all of the overlapping requests are handled by the same keep-web process. In a load-balancing scenario where some of the requests are handled by different keep-web processes, race conditions are not handled, so some of the uploaded files might not be preserved (despite successful response codes sent to the client).
Additionally, within a given keep-web process, overlapping write requests for a single collection are processed sequentially, which largely defeats the performance advantage of doing multi-threaded uploads.
Design goal:
When processing concurrent file uploads to a single collection via WebDAV, Arvados should:
- Accept file data and write through to Keep concurrently, rather than sequentially.
- Ensure all uploaded files are preserved, even if upload requests are distributed across multiple keep-web, controller, and railsapi processes/hosts.
Proposal for concurrent upload handling:
When processing an upload, keep-web should:
1. Write the uploaded data to Keep using a temporary in-memory collection.
2. Explicitly lock the target collection to prevent races with other keep-web processes/goroutines (see below).
3. Get the target collection's current manifest, splice the uploaded file into it, and save it.
4. Unlock the target collection.
Steps 2-4 could be done either by keep-web itself, or using the "replace_files" controller feature.
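As a rough illustration of this flow (with keep-web doing steps 2-4 itself), here is a minimal Go sketch. The collectionStore interface and all of its method names are hypothetical stand-ins for keep-web's Keep-client and controller-client plumbing, not existing Arvados APIs:

```go
// Sketch of the four steps above; all names here are hypothetical.
package keepweb

import (
	"context"
	"io"
)

type collectionStore interface {
	// WriteTemp writes uploaded data through to Keep and returns the
	// manifest of a temporary in-memory collection containing it.
	WriteTemp(ctx context.Context, path string, data io.Reader) (manifest string, err error)
	// Lock takes an exclusive lock on a collection UUID; calling the
	// returned function releases it.
	Lock(ctx context.Context, uuid string) (unlock func(), err error)
	// GetManifest and SaveManifest read and write the target
	// collection's manifest via controller/RailsAPI.
	GetManifest(ctx context.Context, uuid string) (string, error)
	SaveManifest(ctx context.Context, uuid, manifest string) error
	// Splice merges the file at path from tmpManifest into manifest.
	Splice(manifest, path, tmpManifest string) string
}

func handleUpload(ctx context.Context, store collectionStore, targetUUID, path string, data io.Reader) error {
	// Step 1: write the data through to Keep before taking any lock,
	// so slow uploads can proceed concurrently.
	tmpManifest, err := store.WriteTemp(ctx, path, data)
	if err != nil {
		return err
	}
	// Step 2: lock the target collection; step 4: unlock when done.
	unlock, err := store.Lock(ctx, targetUUID)
	if err != nil {
		return err
	}
	defer unlock()
	// Step 3: get the current manifest, splice in the uploaded file,
	// and save the result.
	manifest, err := store.GetManifest(ctx, targetUUID)
	if err != nil {
		return err
	}
	return store.SaveManifest(ctx, targetUUID, store.Splice(manifest, path, tmpManifest))
}
```

Note that the lock only needs to cover the brief get/splice/save of the manifest, not the data transfer itself, which is what restores the performance advantage of multi-threaded uploads.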
If "replace_files" is used, it will need two new features:- Ability to express "file X from a manifest supplied in this update request" (to avoid the overhead of creating and deleting a separate temporary collection just for the sake of referring to it in a "replace_files" request).
- Ability to express "file X from the current version of the target collection" (to avoid the overhead and race potential of retrieving the target collection's current PDH ahead of time just for the sake of referring to it in a "replace_files" request).
Proposal for locking collections:
Normal PostgreSQL row locks ("select ... from collections ... for update") are not suitable here, because the relevant row needs to be updated by a Ruby program (RailsAPI) while the lock is held by a Go program (controller or keep-web): a lock on the collections row itself would block the very update it is meant to protect.
However, if we create a separate table for the sole purpose of locking collections by UUID, Go programs can use row locks in that table to prevent overlapping update requests from getting through to RailsAPI.
For example, given the following setup:
```sql
create table uuid_locks (uuid varchar primary key, n integer);
```
The following SQL statements function as an exclusive lock:
```sql
begin;
insert into uuid_locks (uuid) values ('zzzzz-tpzed-123451234512345')
  on conflict (uuid) do update set n=uuid_locks.n+1;
-- (lock is held here)
commit; -- (lock is released by commit, rollback, or disconnect)
```
The following SQL statement safely removes unused locks without blocking or deleting any in-use locks:
```sql
delete from uuid_locks where uuid in (select uuid from uuid_locks for update skip locked);
```
The table can be created with the "unlogged" option to improve performance. This risks data loss on server crash, which is acceptable here because the table is only used for its locking semantics. With or without "unlogged", if the above commit ("release lock") fails, controller must assume the lock was not held long enough to protect the update from being overwritten by a different update, and return a 500 error to the caller.
Regardless of whether keep-web calls replace_files or implements locking itself, the "replace_files" feature should also use this locking mechanism. That way, any number of overlapping keep-web uploads and other uses of "replace_files" will be handled safely.
Once this feature is in place, arv-mount and arvados-client mount can, in principle, be updated to use "replace_files" to improve their behavior in concurrent write scenarios.