Story #3705
[Keep] Generalize PullList module to BlockWorkList
100%
Description
- a current list
- a dedicated goroutine responsible for setting and getting the current list
- a block hash
- some additional data (sending/accepting the right type here is left up to the callers)
NewBlockWorkList()
- Returns a new BlockWorkList. Starts a new goroutine.
ReplaceList
- Writing a list to this channel replaces the current list, if any, with the supplied list. (The remainder of the old list is discarded. Any work still in progress on old list items is unaffected.)
NextItem
- Reading from this channel yields the next item from the list. (The returned item is also removed from the list.)
Close()
- Terminates the goroutine, closes the channels.
Pseudocode for a trash collector:
// Make a work list: trashList := NewBlockWorkList() // In an HTTP handler for "PUT /trash": { trashList.SetList(...) } // Start a worker: go func(list BlockWorkList) { for deleteRequest := range list.NextItem { // Check timestamp and delete the block. } }(trashList)
Subtasks
Associated revisions
3705: rename BlockWorkList -> WorkQueue
Per discussion at
https://github.com/curoverse/arvados/pull/8#discussion_r17637500
refs #3705.
Merge branch '3705-async-work-queue'
Refs #3705.
History
#1
Updated by Tom Clegg over 7 years ago
- Target version set to Arvados Future Sprints
#2
Updated by Ward Vandewege over 7 years ago
- Target version changed from Arvados Future Sprints to 2014-09-17 sprint
#3
Updated by Tom Clegg over 7 years ago
- Target version changed from 2014-09-17 sprint to Arvados Future Sprints
#4
Updated by Tom Clegg over 7 years ago
- Target version changed from Arvados Future Sprints to 2014-09-17 sprint
#5
Updated by Tim Pierce over 7 years ago
- Assigned To set to Tim Pierce
#6
Updated by Tim Pierce over 7 years ago
- Description updated (diff)
#7
Updated by Tim Pierce over 7 years ago
- Subject changed from [Keep] Generalize PullList module to BlockList - manage an ordered list of {blockhash,anything} tuples so it can be used by both "pull" and "trash" threads. to [Keep] Generalize PullList module to BlockManager
#8
Updated by Tim Pierce over 7 years ago
- Category set to Keep
#9
Updated by Tom Clegg over 7 years ago
- Subject changed from [Keep] Generalize PullList module to BlockManager to [Keep] Generalize PullList module to BlockWorkList
- Description updated (diff)
#10
Updated by Tom Clegg over 7 years ago
- Description updated (diff)
(Updated spec to communicate with channels instead of Get and Set methods)
#11
Updated by Tim Pierce over 7 years ago
- Story points changed from 1.0 to 2.0
#12
Updated by Tim Pierce over 7 years ago
- Status changed from New to In Progress
#13
Updated by Tom Clegg over 7 years ago
Comment in block_work_list.go:
- "Close() Shuts down the manager and the worker cleanly."
- I think the phrase "and the worker" should be removed?
- "The list manager continuously writes items to this channel."
- Perhaps better: "Reading from this channel yields the next item from the list. (The returned item is also removed from the list.) If the list is empty, the channel blocks until new items are available."
In tests:
- I think it would be worthwhile to test "Start worker; ReplaceList(1,2,3); wait until worker has done 3; ReplaceList(4,5,6); wait until worker has done 6" just to confirm that a worker can process the first list, block on the channel, and then resume work when the next list arrives. (Unless of course one of the existing tests does this and I missed it...)
(As discussed in chat) It seems a bit mysterious that we expose NextItem directly as a channel, but ReplaceList as a method whose sole purpose is to send something to a channel. I don't think it's a huge problem that we need to rush out and fix, but I wish I knew the rule of thumb for which is the better pattern.
#14
Updated by Tom Clegg over 7 years ago
Review discussion continues at https://github.com/curoverse/arvados/pull/8/files
#15
Updated by Radhika Chippada over 7 years ago
Tim,
I am reading through these changes as a learning exercise and had a few comments.
- BlockWorkList: "BlockListManager" seems a better name for this class / file?
- Is using camel case for variable names acceptable? For example, "newlist" at line 75? Also, elsewhere in the code you use names such as "current_list". Would it make sense to use either camel case or "_" consistently in our code and standardize on one of those formats? (I vote for camel case, as the go book seems to use it and more java'ish).
- Usages such as "b *BlockWorkList": Can they be "blockList *BlockWorkList" or something like that for a clearer naming?
- Should "handler_test.go" be "handlers_test.go? (missing s)
- I know the following comment is too much, but I could not help noticing that all files in the keepstore directory are in package "main" (same story with keepproxy). I wish we could have sub-directories in here which would serve as actual packages such as "volume", "handler" etc and had only the main store implementation in main package.
Something as follows might have been nicer:
arvados/services/keep/store
arvados/services/keep/store/handler
arvados/services/keep/store/volume
arvados/services/keep/proxy/
arvados/services/keep/proxy/xxx
Thanks.
#16
Updated by Tim Pierce over 7 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:4cfb296612f7b483b56c36f119ca175def706d2f.
Merge branch '3705-keep-blockworklist'
Closes #3705.