Story #3705

[Keep] Generalize PullList module to BlockWorkList

Added by Tom Clegg over 6 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Tim Pierce
Category:
Keep
Target version:
Start date:
08/21/2014
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
2.0

Description

The lists of "blocks to delete" and "blocks to pull from other keepstores" should be maintained in a BlockWorkList type. Just like the existing PullList, a BlockWorkList has
  • a current list
  • a dedicated goroutine responsible for setting and getting the current list
Each item in the list has
  • a block hash
  • some additional data (sending/accepting the right type here is left up to the callers)
The BlockWorkList interface provides these interfaces:
  • NewBlockWorkList()
    • Returns a new BlockWorkList. Starts a new goroutine.
  • ReplaceList
    • Writing a list to this channel replaces the current list, if any, with the supplied list. (The remainder of the old list is discarded. Any work still in progress on old list items is unaffected.)
  • NextItem
    • Reading from this channel yields the next item from the list. (The returned item is also removed from the list.)
  • Close()
    • Terminates the goroutine, closes the channels.

Pseudocode for a trash collector:

// Make a work list:
trashList := NewBlockWorkList()

// In an HTTP handler for "PUT /trash":
{
    trashList.SetList(...)
}

// Start a worker:
go func(list BlockWorkList) {
    for deleteRequest := range list.NextItem {
        // Check timestamp and delete the block.
    }
}(trashList)


Subtasks

Task #3858: Review 3705-keep-blockworklistResolvedTim Pierce

Task #3652: Rewrite PullList as a BlockManager implementationResolvedTim Pierce

Task #3752: Define BlockManager interfaces and typesResolvedTim Pierce

Associated revisions

Revision 4cfb2966
Added by Tim Pierce over 6 years ago

Merge branch '3705-keep-blockworklist'

Closes #3705.

Revision 5f4f8509 (diff)
Added by Tim Pierce over 6 years ago

3705: rename BlockWorkList -> WorkQueue

Per discussion at
https://github.com/curoverse/arvados/pull/8#discussion_r17637500

refs #3705.

Revision bd0d76fa
Added by Tim Pierce over 6 years ago

Merge branch '3705-async-work-queue'

Refs #3705.

History

#1 Updated by Tom Clegg over 6 years ago

  • Target version set to Arvados Future Sprints

#2 Updated by Ward Vandewege over 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint

#3 Updated by Tom Clegg over 6 years ago

  • Target version changed from 2014-09-17 sprint to Arvados Future Sprints

#4 Updated by Tom Clegg over 6 years ago

  • Target version changed from Arvados Future Sprints to 2014-09-17 sprint

#5 Updated by Tim Pierce over 6 years ago

  • Assigned To set to Tim Pierce

#6 Updated by Tim Pierce over 6 years ago

  • Description updated (diff)

#7 Updated by Tim Pierce over 6 years ago

  • Subject changed from [Keep] Generalize PullList module to BlockList - manage an ordered list of {blockhash,anything} tuples so it can be used by both "pull" and "trash" threads. to [Keep] Generalize PullList module to BlockManager

#8 Updated by Tim Pierce over 6 years ago

  • Category set to Keep

#9 Updated by Tom Clegg over 6 years ago

  • Subject changed from [Keep] Generalize PullList module to BlockManager to [Keep] Generalize PullList module to BlockWorkList
  • Description updated (diff)

#10 Updated by Tom Clegg over 6 years ago

  • Description updated (diff)

(Updated spec to communicate with channels instead of Get and Set methods)

#11 Updated by Tim Pierce over 6 years ago

  • Story points changed from 1.0 to 2.0

#12 Updated by Tim Pierce over 6 years ago

  • Status changed from New to In Progress

#13 Updated by Tom Clegg over 6 years ago

Comment in block_work_list.go:

  • "Close() Shuts down the manager and the worker cleanly."
    • I think the phrase "and the worker" should be removed?
  • "The list manager continuously writes items to this channel."
    • Perhaps better: "Reading from this channel yields the next item from the list. (The returned item is also removed from the list.) If the list is empty, the channel blocks until new items are available."

In tests:

  • I think it would be worthwhile to test "Start worker; ReplaceList(1,2,3); wait until worker has done 3; ReplaceList(4,5,6); wait until worker has done 6" just to confirm that a worker can process the first list, block on the channel, and then resume work when the next list arrives. (Unless of course one of the existing tests does this and I missed it...)

(As discussed in chat) It seems a bit mysterious that we expose NextItem directly as a channel, but ReplaceList as a method whose sole purpose is to send something to a channel. I don't think it's a huge problem that we need to rush out and fix, but I wish I knew the rule of thumb for which is the better pattern.

#14 Updated by Tom Clegg over 6 years ago

#15 Updated by Radhika Chippada over 6 years ago

Tim,
I am reading through these changes as a learning exercise and had a few comments.

  • BlockWorkList: "BlockListManager" seems a better name for this class / file?
  • Is using camel case for variable names acceptable? For example, "newlist" at line 75? Also, elsewhere in the code you use names such as "current_list". Would it make sense to use either camel case or "_" consistently in our code and standardize on one of those formats? (I vote for camel case, as the go book seems to use it and more java'ish).
  • Usages such as "b *BlockWorkList": Can they be "blockList *BlockWorkList" or something like that for a clearer naming?
  • Should "handler_test.go" be "handlers_test.go? (missing s)
  • I know the following comment is too much, but I could not help noticing that all files in the keepstore directory are in package "main" (same story with keepproxy). I wish we could have sub-directories in here which would serve as actual packages such as "volume", "handler" etc and had only the main store implementation in main package.

Something as follows might have been nicer:

arvados/services/keep/store
arvados/services/keep/store/handler
arvados/services/keep/store/volume
arvados/services/keep/proxy/
arvados/services/keep/proxy/xxx

Thanks.

#16 Updated by Tim Pierce over 6 years ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|commit:4cfb296612f7b483b56c36f119ca175def706d2f.

Also available in: Atom PDF