Task #2507
closedBug #2449: New Keep server can write
Review 2449-keep-write-blocks
Added by Tim Pierce almost 11 years ago. Updated over 10 years ago.
Description
Code review for 2449-keep-write-blocks: https://arvados.org/projects/arvados/repository/diff?utf8=%E2%9C%93&rev=f6b24b30e9f3794943f628c22ee4dabcadca5a1a&rev_to=ad0944aa970bf52bcc2387ac57e12cb69ae54d66
Updated by Tim Pierce over 10 years ago
- Assigned To changed from Tim Pierce to Ward Vandewege
- Remaining (hours) changed from 2.0 to 0.5
PTAL: Committed 92f572b to address the code review: PutBlock now stores a block only on the first available volume, and unit tests have been updated to reflect that.
- All differences between this branch and master:
- Differences since last code review:
Updated by Ward Vandewege over 10 years ago
- Assigned To changed from Ward Vandewege to Tim Pierce
Updated by Tim Pierce over 10 years ago
- Assigned To changed from Tim Pierce to Ward Vandewege
PTAL. This code now detects full volumes and MD5 collisions/corruption.
Updated by Ward Vandewege over 10 years ago
- Assigned To changed from Ward Vandewege to Tom Clegg
Updated by Tom Clegg over 10 years ago
- Assigned To changed from Tom Clegg to Tim Pierce
Updated by Tim Pierce over 10 years ago
- Status changed from In Progress to Closed
- Remaining (hours) changed from 0.5 to 0.0
Updated by Tim Pierce over 10 years ago
- Status changed from Closed to In Progress
- Parent task changed from #2449 to #2282
Updated by Tim Pierce over 10 years ago
- Parent task changed from #2282 to #2449
Updated by Tim Pierce over 10 years ago
- Assigned To changed from Tim Pierce to Misha Zatsman
- Remaining (hours) changed from 0.0 to 0.5
One last thing... a fix for the network buffer issue I mentioned this morning.
New method ReadAtMost(reader, limit)
to read up to limit
bytes from a reader, or until EOF, whichever comes first. If there are more than limit
bytes available, this returns an error. This method will read all of the data from large requests even if the network is buffering them.
I think it's desirable to signal an error if requests exceed 64MB in order to protect Keep against being overrun by DoS attacks.
Updated by Misha Zatsman over 10 years ago
- Assigned To changed from Misha Zatsman to Tim Pierce
- Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
- // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.
// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.
- What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."
func ReadAtMost(r io.Reader, limit int) ([]byte, error) {
- Can we use more descriptive names like reader, byte_limit, and limit_reader?
if len(buf) > limit {
return buf[:limit], errors.New("Request Too Large")
} - I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?
- The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.
- Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.
Updated by Tim Pierce over 10 years ago
Misha Zatsman wrote:
if buf, err := ReadAtMost(req.Body, BLOCKSIZE); err == nil {
- Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
- // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.
Good idea, I've added a clarifying comment.
// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.
- What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."
Yeah, that's better. I expanded it a little more.
func ReadAtMost(r io.Reader, limit int) ([]byte, error) {
- Can we use more descriptive names like reader, byte_limit, and limit_reader?
Is it actually confusing? When it's a short, well-defined code block, and a variable has a self-documenting type like io.Reader
, I actually kind of prefer terse variables to verbose ones. The names don't clutter up the code as much. Maybe it's just because I've been reading a lot of Go code in the last few weeks, where there's a strong tendency to use metasyntactic variables like f
for os.File
objects, b
for byte slices, s
for strings and so on.
Does that make sense? I won't fight too hard about it but it's sort of refreshing. If this was more than about a dozen lines of code I'd definitely want to make the variables more descriptive.
if len(buf) > limit {
return buf[:limit], errors.New("Request Too Large")
}
- I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?
Sure, I don't object to returning a nil slice in event of an error. I can see arguments either way, but we're really only using this method in one place, so it's not like it makes a huge difference. :-)
Note that ignoring the error response, while technically possible, is very strongly discouraged in Go. The compiler insists that you assign the error value to something, so you can't really forget that it's there, and the style guides are very firm about checking error status whenever it's present. If you post a snippet to go-nuts and you've ignored the returned error by assigning it to _, bradfitz will make fun of you.
- The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.
Good idea, how does "Too long" sound as a generic error here?
- Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.
OK. The error handling has been a little simplistic and I want to define richer and clearer error types, but let me know how you like this for now.
Updated by Tim Pierce over 10 years ago
- Assigned To changed from Tim Pierce to Misha Zatsman
Updated by Misha Zatsman over 10 years ago
- Assigned To changed from Misha Zatsman to Tim Pierce
Tim Pierce wrote:
Misha Zatsman wrote:
if buf, err := ReadAtMost(req.Body, BLOCKSIZE); err == nil {
- Can we add a comment above this for people who'll ask why you can't just call req.Body.read(). Maybe:
- // Use a buffered reader to issue multiple calls to req.Body.Read if necessary and enforce our size limits.
Good idea, I've added a clarifying comment.
Thanks
// ReadAtMost
// Returns a byte slice containing at most N bytes read
// from the specified io.Reader.
- What about "Returns a byte slice containing all of the bytes returned by reader. If reader contains more than byte_limit bytes then an empty slice is returned and an error is returned instead."
Yeah, that's better. I expanded it a little more.
Thanks
func ReadAtMost(r io.Reader, limit int) ([]byte, error) {
- Can we use more descriptive names like reader, byte_limit, and limit_reader?
Is it actually confusing? When it's a short, well-defined code block, and a variable has a self-documenting type like
io.Reader
, I actually kind of prefer terse variables to verbose ones. The names don't clutter up the code as much. Maybe it's just because I've been reading a lot of Go code in the last few weeks, where there's a strong tendency to use metasyntactic variables likef
foros.File
objects,b
for byte slices,s
for strings and so on.Does that make sense? I won't fight too hard about it but it's sort of refreshing. If this was more than about a dozen lines of code I'd definitely want to make the variables more descriptive.
Fair enough, that does sound nice if the method is simple enough, which this is.
if len(buf) > limit {
return buf[:limit], errors.New("Request Too Large")
}
- I'm a little worried about callers shooting themselves in the foot by ignoring the error response. What do you think about returning an empty slice in case of an error?
Sure, I don't object to returning a nil slice in event of an error. I can see arguments either way, but we're really only using this method in one place, so it's not like it makes a huge difference. :-)
Note that ignoring the error response, while technically possible, is very strongly discouraged in Go. The compiler insists that you assign the error value to something, so you can't really forget that it's there, and the style guides are very firm about checking error status whenever it's present. If you post a snippet to go-nuts and you've ignored the returned error by assigning it to _, bradfitz will make fun of you.
Cool, that's good to know.
- The above error message needs context outside of this method to make sense. Could you either change the name of the method to make the error message make sense (e.g. func ReadAtMostFromRequest(request_body io.Reader,... or change the error message/type and write this more specific message in the calling block.
Good idea, how does "Too long" sound as a generic error here?
Sounds great
- Also since we know the max allowed BLOCKSIZE, we might as well tell it to the client by putting it in to the http error message they see.
OK. The error handling has been a little simplistic and I want to define richer and clearer error types, but let me know how you like this for now.
Looks good for now. I agree that the error handling seems simplistic (comparing an object for equality in particular seems bad, but maybe that's just because I'm new to go), but I don't know how to do it better.
Looks good to me.
Updated by Tim Pierce over 10 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
- Remaining (hours) changed from 0.5 to 0.0
Applied in changeset arvados|commit:5df09e707c313fd88c32ad40f2d99030ecc2e639.