Project

General

Profile

Actions

Bug #7121

closed

[Keep] keepstore should use only one buffer for each PUT (and should not deadlock)

Added by Tom Clegg over 8 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Story points:
1.0

Description

PUT uses 2x 64 MiB buffers if the block already exists on disk. One buffer is used to read the request body, the other is used to read the on-disk data into memory in order to compare it with the first buffer.

Aside from wasting memory, this can cause deadlock: if there are N bufferpool slots, and N concurrent PUT requests, each request gets one slot when it starts reading the request body, then all requests deadlock waiting for a "compare" buffer to become available.

We could fix this by adding a Compare() method to Volume (return true if content of file on disk is equal to this buf) instead of holding one buf of client-provided data, reading the data from disk into a second buffer, and comparing them.

It would be even better to do the compare operation while reading the request body from the client. However, this might get complicated: we'd need to read the data from disk at full speed (even if the client connection is slow) to avoid holding the spindle lock too long and shutting out other requests. I suggest we defer this aspect -- for now just fix the deadlock and 2x memory use.


Subtasks 4 (0 open4 closed)

Task #7202: Unit testsResolvedTom Clegg08/24/2015Actions
Task #7201: Add Compare methodResolvedTom Clegg08/24/2015Actions
Task #7217: Fix sleep() in test case by mocking mutexResolvedTom Clegg08/24/2015Actions
Task #7199: Review 7121-fix-deadlock, 94e1489ResolvedTom Clegg08/24/2015Actions

Related issues

Related to Arvados - Bug #6997: [Keep] keepstore reboots GCE host under heavy loadResolved08/16/2015Actions
Actions #1

Updated by Tom Clegg over 8 years ago

  • Description updated (diff)
Actions #2

Updated by Brett Smith over 8 years ago

  • Target version changed from Arvados Future Sprints to 2015-09-30 sprint
Actions #3

Updated by Brett Smith over 8 years ago

  • Target version changed from 2015-09-30 sprint to 2015-09-16 sprint
Actions #4

Updated by Tom Clegg over 8 years ago

  • Assigned To set to Tom Clegg
Actions #5

Updated by Tom Clegg over 8 years ago

7121-fix-deadlock at 94e1489

Actions #6

Updated by Ward Vandewege over 8 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Tom Clegg over 8 years ago

Whoops, missed committing a new file. Now at d2db86b

Actions #8

Updated by Radhika Chippada over 8 years ago

  • log “collision” or “corrupt” errors so that we can have a log processor later on that looks for those and cleans up. (Can be in volume_unix.go)
  • volume_unix.go: This comment “Compare returns nil if Get(loc) would return the same content as …” : Do you mean “expect” instead of “cmp” at the beginning of second line of this comment? Even though you use “cmp” in the method implementation, I think we need to use the given input parameter name in the comment.
  • handlers.go: CompareAndTouch method — If any other errors other than collisionOrCorrupt, we are continuing and trying other volumes. Should we log and continue in the collisionOrCorrupt cases also?
  • There are still a few “serialize” references, especially in keepstore help message. Since you renamed “serialize” to “locker”, can you please update these remaining references as well?
  • Since the file mutex_test.go does not actually have any “tests” itself, it is a bit confusing. Can it be named “mutex_test_helper.go” instead? Or, better yet, add a test or two in it to make it complete.
  • volume.go : the comment for Get: Is the caller urged to put the returned slice into buffer pool to ensure cache is refreshed? if so, can you add that at the end of that comment.
  • volume.go : the comment for Compare is a bit confusing around “Confirm Get() would return buf. If so, return nil”. Can we say something like “Confirm Get(loc) would return a buf with same content as data …” ?
  • In collision_test.go, can you please add one more test with buf starting with anything other than "foo" resulting in DiskHashError?

Thanks.

Actions #9

Updated by Tom Clegg over 8 years ago

Radhika Chippada wrote:

  • log “collision” or “corrupt” errors so that we can have a log processor later on that looks for those and cleans up. (Can be in volume_unix.go)

Added logs in CompareAndTouch() -- this way Volume implementors can just return errors, and one set of "what's worth logging, and how" logic can live in the handlers.

  • volume_unix.go: This comment “Compare returns nil if Get(loc) would return the same content as …” : Do you mean “expect” instead of “cmp” at the beginning of second line of this comment? Even though you use “cmp” in the method implementation, I think we need to use the given input parameter name in the comment.

Yes, I forgot to update the comment when updating the signature. Fixed.

  • handlers.go: CompareAndTouch method — If any other errors other than collisionOrCorrupt, we are continuing and trying other volumes. Should we log and continue in the collisionOrCorrupt cases also?

In the "corrupt" case we log and continue; in the "collision" case we have to stop, to avoid getting into a state where two volumes have files with the same hash but different data. In such cases it's better to tell the current client "can't store that" (and continue giving the earlier clients the same data they wrote) than to tell the current client "OK" and start giving both old & new clients sometimes-old and sometimes-new data when they ask for it.

  • There are still a few “serialize” references, especially in keepstore help message. Since you renamed “serialize” to “locker”, can you please update these remaining references as well?

I think "-serialize" is still the correct term for the command line flag (and argument to TempUnixVolume). If there were other kinds of lockers to specify, we could ask the caller to say things like "-locker=mutex" on the command line, but I think -serialize is still suitable...

  • Since the file mutex_test.go does not actually have any “tests” itself, it is a bit confusing. Can it be named “mutex_test_helper.go” instead? Or, better yet, add a test or two in it to make it complete.

Fair enough. Changed to mock_mutex_for_test.go. (I want to make sure it's not accidentally exported, and *_test.go seems to be the only recognized pattern indicating to the Go tools "only used when running tests for this package".)

  • volume.go : the comment for Get: Is the caller urged to put the returned slice into buffer pool to ensure cache is refreshed? if so, can you add that at the end of that comment.

Added rationale: "(Otherwise, the buffer pool will be depleted and eventually -- when all available buffers are used and not returned -- operations will reach deadlock.)"

  • volume.go : the comment for Compare is a bit confusing around “Confirm Get() would return buf. If so, return nil”. Can we say something like “Confirm Get(loc) would return a buf with same content as data …” ?

Yes, makes sense. Updated.

  • In collision_test.go, can you please add one more test with buf starting with anything other than "foo" resulting in DiskHashError?

Ah, good call. Added three new tests with equal length but mismatched data in buf1, buf2, and rdr respectively.

Now at 23cc118

Actions #10

Updated by Radhika Chippada over 8 years ago

Thanks for clarifying and updating. LGTM.

Actions #11

Updated by Tom Clegg over 8 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 75 to 100

Applied in changeset arvados|commit:3400559165e50e3d62adf6d45f9970a13450d907.

Actions

Also available in: Atom PDF