Project

General

Profile

Actions

Task #2507

closed

Bug #2449: New Keep server can write

Review 2449-keep-write-blocks

Added by Tim Pierce about 10 years ago. Updated about 10 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
Tim Pierce

Actions #1

Updated by Ward Vandewege about 10 years ago

  • Assigned To set to Tim Pierce
Actions #2

Updated by Tim Pierce about 10 years ago

  • Assigned To changed from Tim Pierce to Ward Vandewege
  • Remaining (hours) changed from 2.0 to 0.5

PTAL: Committed 92f572b to address the code review: PutBlock now stores a block only on the first available volume, and unit tests have been updated to reflect that.

Actions #3

Updated by Tim Pierce about 10 years ago

  • Status changed from New to In Progress
Actions #4

Updated by Ward Vandewege about 10 years ago

  • Assigned To changed from Ward Vandewege to Tim Pierce
Actions #5

Updated by Tim Pierce about 10 years ago

  • Assigned To changed from Tim Pierce to Ward Vandewege

PTAL. This code now detects full volumes and MD5 collisions/corruption.

Actions #6

Updated by Ward Vandewege about 10 years ago

  • Assigned To changed from Ward Vandewege to Tom Clegg
Actions #7

Updated by Tom Clegg about 10 years ago

  • Assigned To changed from Tom Clegg to Tim Pierce
Actions #8

Updated by Tim Pierce about 10 years ago

  • Status changed from In Progress to Closed
  • Remaining (hours) changed from 0.5 to 0.0
Actions #9

Updated by Tim Pierce about 10 years ago

  • Status changed from Closed to In Progress
  • Parent task changed from #2449 to #2282
Actions #10

Updated by Tim Pierce about 10 years ago

  • Parent task changed from #2282 to #2449
Actions #11

Updated by Tim Pierce about 10 years ago

  • Assigned To changed from Tim Pierce to Misha Zatsman
  • Remaining (hours) changed from 0.0 to 0.5

One last thing... a fix for the network buffer issue I mentioned this morning.

New method ReadAtMost(reader, limit) to read up to limit bytes from a reader, or until EOF, whichever comes first. If there are more than limit bytes available, this returns an error. This method will read all of the data from large requests even if the network is buffering them.

I think it's desirable to signal an error if requests exceed 64MB in order to protect Keep against being overrun by DoS attacks.

Actions #12

Updated by Misha Zatsman about 10 years ago

  • Assigned To changed from Misha Zatsman to Tim Pierce
if buf, err := ReadAtMost(req.Body, BLOCKSIZE); err == nil {
  • Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
  • // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.

// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.

  • What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."

func ReadAtMost(r io.Reader, limit int) ([]byte, error) {

  • Can we use more descriptive names like reader, byte_limit, and limit_reader?

    if len(buf) > limit {
    return buf[:limit], errors.New("Request Too Large")
    }

  • I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?
  • The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.
  • Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.
Actions #13

Updated by Tim Pierce about 10 years ago

Misha Zatsman wrote:

if buf, err := ReadAtMost(req.Body, BLOCKSIZE); err == nil {
  • Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
  • // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.

Good idea, I've added a clarifying comment.

// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.

  • What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."

Yeah, that's better. I expanded it a little more.

func ReadAtMost(r io.Reader, limit int) ([]byte, error) {

  • Can we use more descriptive names like reader, byte_limit, and limit_reader?

Is it actually confusing? When it's a short, well-defined code block, and a variable has a self-documenting type like io.Reader, I actually kind of prefer terse variables to verbose ones. The names don't clutter up the code as much. Maybe it's just because I've been reading a lot of Go code in the last few weeks, where there's a strong tendency to use metasyntactic variables like f for os.File objects, b for byte slices, s for strings and so on.

Does that make sense? I won't fight too hard about it but it's sort of refreshing. If this was more than about a dozen lines of code I'd definitely want to make the variables more descriptive.

if len(buf) > limit {
return buf[:limit], errors.New("Request Too Large")
}
  • I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?

Sure, I don't object to returning a nil slice in event of an error. I can see arguments either way, but we're really only using this method in one place, so it's not like it makes a huge difference. :-)

Note that ignoring the error response, while technically possible, is very strongly discouraged in Go. The compiler insists that you assign the error value to something, so you can't really forget that it's there, and the style guides are very firm about checking error status whenever it's present. If you post a snippet to go-nuts and you've ignored the returned error by assigning it to _, bradfitz will make fun of you.

  • The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.

Good idea, how does "Too long" sound as a generic error here?

  • Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.

OK. The error handling has been a little simplistic and I want to define richer and clearer error types, but let me know how you like this for now.

Actions #14

Updated by Tim Pierce about 10 years ago

  • Assigned To changed from Tim Pierce to Misha Zatsman
Actions #15

Updated by Misha Zatsman about 10 years ago

  • Assigned To changed from Misha Zatsman to Tim Pierce

Tim Pierce wrote:

Misha Zatsman wrote:

if buf, err := ReadAtMost(req.Body, BLOCKSIZE); err == nil {
  • Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
  • // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.

Good idea, I've added a clarifying comment.

Thanks

// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.

  • What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."

Yeah, that's better. I expanded it a little more.

Thanks

func ReadAtMost(r io.Reader, limit int) ([]byte, error) {

  • Can we use more descriptive names like reader, byte_limit, and limit_reader?

Is it actually confusing? When it's a short, well-defined code block, and a variable has a self-documenting type like io.Reader, I actually kind of prefer terse variables to verbose ones. The names don't clutter up the code as much. Maybe it's just because I've been reading a lot of Go code in the last few weeks, where there's a strong tendency to use metasyntactic variables like f for os.File objects, b for byte slices, s for strings and so on.

Does that make sense? I won't fight too hard about it but it's sort of refreshing. If this was more than about a dozen lines of code I'd definitely want to make the variables more descriptive.

Fair enough, that does sound nice if the method is simple enough, which this is.

if len(buf) > limit {
return buf[:limit], errors.New("Request Too Large")
}
  • I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?

Sure, I don't object to returning a nil slice in event of an error. I can see arguments either way, but we're really only using this method in one place, so it's not like it makes a huge difference. :-)

Note that ignoring the error response, while technically possible, is very strongly discouraged in Go. The compiler insists that you assign the error value to something, so you can't really forget that it's there, and the style guides are very firm about checking error status whenever it's present. If you post a snippet to go-nuts and you've ignored the returned error by assigning it to _, bradfitz will make fun of you.

Cool, that's good to know.

  • The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.

Good idea, how does "Too long" sound as a generic error here?

Sounds great

  • Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.

OK. The error handling has been a little simplistic and I want to define richer and clearer error types, but let me know how you like this for now.

Looks good for now. I agree that the error handling seems simplistic (comparing an object for equality in particular seems bad, but maybe that's just because I'm new to go), but I don't know how to do it better.

Looks good to me.

Actions #16

Updated by Tim Pierce about 10 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100
  • Remaining (hours) changed from 0.5 to 0.0

Applied in changeset arvados|commit:5df09e707c313fd88c32ad40f2d99030ecc2e639.

Actions #17

Updated by Tim Pierce about 10 years ago

  • Status changed from Resolved to Closed
Actions

Also available in: Atom PDF