Project

General

Profile

Actions

Idea #3705

closed

[Keep] Generalize PullList module to BlockWorkList

Added by Tom Clegg over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Tim Pierce
Category:
Keep
Target version:
Start date:
08/21/2014
Due date:
Story points:
2.0

Description

The lists of "blocks to delete" and "blocks to pull from other keepstores" should be maintained in a BlockWorkList type. Just like the existing PullList, a BlockWorkList has
  • a current list
  • a dedicated goroutine responsible for setting and getting the current list
Each item in the list has
  • a block hash
  • some additional data (sending/accepting the right type here is left up to the callers)
The BlockWorkList interface provides these interfaces:
  • NewBlockWorkList()
    • Returns a new BlockWorkList. Starts a new goroutine.
  • ReplaceList
    • Writing a list to this channel replaces the current list, if any, with the supplied list. (The remainder of the old list is discarded. Any work still in progress on old list items is unaffected.)
  • NextItem
    • Reading from this channel yields the next item from the list. (The returned item is also removed from the list.)
  • Close()
    • Terminates the goroutine, closes the channels.

Pseudocode for a trash collector:

// Make a work list:
trashList := NewBlockWorkList()

// In an HTTP handler for "PUT /trash":
{
    trashList.SetList(...)
}

// Start a worker:
go func(list BlockWorkList) {
    for deleteRequest := range list.NextItem {
        // Check timestamp and delete the block.
    }
}(trashList)


Subtasks 3 (0 open3 closed)

Task #3858: Review 3705-keep-blockworklistResolvedTim Pierce08/21/2014Actions
Task #3652: Rewrite PullList as a BlockManager implementationResolvedTim Pierce08/21/2014Actions
Task #3752: Define BlockManager interfaces and typesResolvedTim Pierce08/21/2014Actions
Actions #1

Updated by Tom Clegg over 9 years ago

  • Target version set to Arvados Future Sprints
Actions #2

Updated by Ward Vandewege over 9 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint
Actions #3

Updated by Tom Clegg over 9 years ago

  • Target version changed from 2014-09-17 sprint to Arvados Future Sprints
Actions #4

Updated by Tom Clegg over 9 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint
Actions #5

Updated by Tim Pierce over 9 years ago

  • Assigned To set to Tim Pierce
Actions #6

Updated by Tim Pierce over 9 years ago

  • Description updated (diff)
Actions #7

Updated by Tim Pierce over 9 years ago

  • Subject changed from [Keep] Generalize PullList module to BlockList - manage an ordered list of {blockhash,anything} tuples so it can be used by both "pull" and "trash" threads. to [Keep] Generalize PullList module to BlockManager
Actions #8

Updated by Tim Pierce over 9 years ago

  • Category set to Keep
Actions #9

Updated by Tom Clegg over 9 years ago

  • Subject changed from [Keep] Generalize PullList module to BlockManager to [Keep] Generalize PullList module to BlockWorkList
  • Description updated (diff)
Actions #10

Updated by Tom Clegg over 9 years ago

  • Description updated (diff)

(Updated spec to communicate with channels instead of Get and Set methods)

Actions #11

Updated by Tim Pierce over 9 years ago

  • Story points changed from 1.0 to 2.0
Actions #12

Updated by Tim Pierce over 9 years ago

  • Status changed from New to In Progress
Actions #13

Updated by Tom Clegg over 9 years ago

Comment in block_work_list.go:

  • "Close() Shuts down the manager and the worker cleanly."
    • I think the phrase "and the worker" should be removed?
  • "The list manager continuously writes items to this channel."
    • Perhaps better: "Reading from this channel yields the next item from the list. (The returned item is also removed from the list.) If the list is empty, the channel blocks until new items are available."

In tests:

  • I think it would be worthwhile to test "Start worker; ReplaceList(1,2,3); wait until worker has done 3; ReplaceList(4,5,6); wait until worker has done 6" just to confirm that a worker can process the first list, block on the channel, and then resume work when the next list arrives. (Unless of course one of the existing tests does this and I missed it...)

(As discussed in chat) It seems a bit mysterious that we expose NextItem directly as a channel, but ReplaceList as a method whose sole purpose is to send something to a channel. I don't think it's a huge problem that we need to rush out and fix, but I wish I knew the rule of thumb for which is the better pattern.

Actions #14

Updated by Tom Clegg over 9 years ago

Actions #15

Updated by Radhika Chippada over 9 years ago

Tim,
I am reading through these changes as a learning exercise and had a few comments.

  • BlockWorkList: "BlockListManager" seems a better name for this class / file?
  • Is using camel case for variable names acceptable? For example, "newlist" at line 75? Also, elsewhere in the code you use names such as "current_list". Would it make sense to use either camel case or "_" consistently in our code and standardize on one of those formats? (I vote for camel case, as the go book seems to use it and more java'ish).
  • Usages such as "b *BlockWorkList": Can they be "blockList *BlockWorkList" or something like that for a clearer naming?
  • Should "handler_test.go" be "handlers_test.go? (missing s)
  • I know the following comment is too much, but I could not help noticing that all files in the keepstore directory are in package "main" (same story with keepproxy). I wish we could have sub-directories in here which would serve as actual packages such as "volume", "handler" etc and had only the main store implementation in main package.

Something as follows might have been nicer:

arvados/services/keep/store
arvados/services/keep/store/handler
arvados/services/keep/store/volume
arvados/services/keep/proxy/
arvados/services/keep/proxy/xxx

Thanks.

Actions #16

Updated by Tim Pierce over 9 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:4cfb296612f7b483b56c36f119ca175def706d2f.

Actions

Also available in: Atom PDF