Bug #2865
Reduce Keep server memory use
Status: Closed
Description
A job failed this morning. The Keep server had several block read errors at the time because it was out of memory. When the Python SDK couldn't get the requested blocks from any Keep server, it translated that into a "Block not found" exception. See the job log output. The input is fee29077095fed2e695100c299f11dc5+2727. Errors look like this:
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr Traceback (most recent call last):
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/tmp/crunch-job/src/crunch_scripts/test/para/grep", line 18, in <module>
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     for line in input_file.readlines():
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 183, in readlines
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     for newdata in datasource:
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 155, in readall
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     data = self.read(size)
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 139, in read
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     data += self._stream.readfrom(locator+segmentoffset, segmentsize)
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 265, in readfrom
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     data += self._keep.get(locator)[segmentoffset:segmentoffset+segmentsize]
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr   File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 305, in get
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     raise arvados.errors.NotFoundError("Block not found: %s" % expect_hash)
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr arvados.errors.NotFoundError: Block not found: 43161251a3347a55e4a826daa730977f
2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr srun: error: compute34: task 0: Exited with exit code 1
Updated by Brett Smith over 10 years ago
I have been unable to reproduce this so far. Running on compute34.qr1hi, using the same API token as the original job, I can run this in a Python interpreter, which I believe is functionally equivalent to what the grep script does:
import arvados

cr = arvados.CollectionReader('fee29077095fed2e695100c299f11dc5+2727')
for s in cr.all_streams():
    for f in s.all_files():
        cr = arvados.CollectionReader(f.as_manifest())
        for crfile in cr.all_files():
            for line in crfile.readlines():
                pass
This runs to completion without raising an exception.
Updated by Brett Smith over 10 years ago
One weird discrepancy I can't figure out: the exception is actually raised on line 303 of keep.py, which doesn't match the line number in the backtrace (305):
compute34.qr1hi:~$ grep -nFC3 "Block not found" /usr/local/lib/python2.7/dist-packages/arvados/keep.py
300-
301-        slot.set(None)
302-        self.cap_cache()
303:        raise arvados.errors.NotFoundError("Block not found: %s" % expect_hash)
304-
305-    def get_url(self, url, headers, expect_hash):
306-        h = httplib2.Http()
Updated by Brett Smith over 10 years ago
- Subject changed from "grep job on qr1hi fails to find Keep blocks" to "Reduce Keep server memory use"
- Description updated (diff)
Reviewing the logs indicates that the Keep server had several block read errors at this time because it was out of memory. When the Python SDK couldn't get requested blocks from any Keep server, it translated that into the Block not found exception.
Abram mentioned in IRC that he had garbage collection issues working with Go in the past, and found it helpful to make explicit collection requests when appropriate. Ward adds:
cure: imo keep should aggressively free memory after each block has been served
cure: because the kernel will cache it anyway in the disk cache, making subsequent requests fast
cure: (maybe it already does, I haven't looked)
Updated by Tim Pierce over 10 years ago
- Status changed from New to In Progress
- Assigned To set to Tim Pierce
- Target version set to 2014-05-28 Pipeline Factory
Updated by Tim Pierce over 10 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:2b2adb421b9b82b75fe8a635442dfe8e1fab775a.
Updated by Ward Vandewege over 10 years ago
- Status changed from Resolved to In Progress
Updated by Ward Vandewege over 10 years ago
- Target version changed from 2014-05-28 Pipeline Factory to 2014-06-17 Curating and Crunch
Updated by Tim Pierce over 10 years ago
To bring this up to date, here is what we have found. Testing consists of storing ~3GB of data in Keep once, and then running repeated GET requests on random blocks, with one or more concurrent clients on the same machine.
- Base memory footprint for Keep when idle is ~5MB.
- After issuing the necessary PUT requests to Keep, the server's memory footprint is 260MB (256MB higher than base).
- With one client running serial GET requests, server memory footprint rises to 516MB (256MB higher than post-PUT).
- Each additional concurrent client adds roughly 128MB to the server footprint.
- After clients terminate, memory returns to baseline levels after 5-10 minutes.
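For reference, the concurrent-GET side of this test can be driven by something as small as the sketch below. It is only illustrative: the server URL, port, flags, and the idea of passing previously stored locators on the command line are assumptions, not the actual harness used to produce these numbers.

package main

// Illustrative load driver: hit a Keep server with concurrent GETs on
// random, previously stored blocks. Port and flags are assumptions.

import (
    "flag"
    "io"
    "io/ioutil"
    "log"
    "math/rand"
    "net/http"
    "sync"
)

func main() {
    server := flag.String("server", "http://localhost:25107", "Keep server base URL (assumed port)")
    clients := flag.Int("clients", 2, "number of concurrent GET clients")
    requests := flag.Int("requests", 1000, "GET requests per client")
    flag.Parse()

    // Locators of blocks stored earlier with PUT, passed as arguments.
    locators := flag.Args()
    if len(locators) == 0 {
        log.Fatal("usage: loadtest [flags] locator [locator ...]")
    }

    var wg sync.WaitGroup
    for i := 0; i < *clients; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < *requests; j++ {
                resp, err := http.Get(*server + "/" + locators[rand.Intn(len(locators))])
                if err != nil {
                    log.Print(err)
                    continue
                }
                // Drain and close the body so the connection can be reused.
                io.Copy(ioutil.Discard, resp.Body)
                resp.Body.Close()
            }
        }()
    }
    wg.Wait()
}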
Curiously, Go's own profiling tools do not account for the RSS reported by the kernel.
Peter has observed that ioutil.ReadAll grows its buffer exponentially, so when reading 64MB of data over the network, it will allocate a 128MB buffer. That appears likely to be partly responsible for bloating memory usage.
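Since a Keep block has a known maximum size (64MB), one way around that doubling behavior is to allocate a buffer of exactly that size up front and fill it with io.ReadFull. A minimal sketch under that assumption; BlockSize and readBlockBody are illustrative names, not the actual Keep source:

package keepsketch

import (
    "io"
    "net/http"
)

// BlockSize is Keep's maximum block size (64 MiB).
const BlockSize = 64 * 1024 * 1024

// readBlockBody reads a request body of at most BlockSize bytes into a
// buffer allocated once, instead of letting ioutil.ReadAll repeatedly
// double a growing buffer (which peaks near 128MB for a 64MB body).
func readBlockBody(req *http.Request) ([]byte, error) {
    buf := make([]byte, BlockSize)
    n, err := io.ReadFull(req.Body, buf)
    if err == io.EOF || err == io.ErrUnexpectedEOF {
        // The body was simply shorter than BlockSize.
        err = nil
    }
    return buf[:n], err
}

The tradeoff is that every request still pays for a full-size buffer; streaming to a file, as proposed below, avoids holding the block in memory at all.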
Next steps:
- Rewrite to avoid ioutil.ReadAll:
  - Rewrite main.PutBlock to stream data from the network directly to a temporary file instead of an in-memory buffer, checksumming as it goes (a sketch follows this list).
  - Possibly: rewrite main.GetBlock to return a block even if the checksum does not match, and simply log an error when the checksum fails. Doing so will allow us to handle GET with a relatively small buffer. It is reasonable to expect the client to perform its own checksum anyway, so it is not necessarily unwise for Keep to return the stored data as-is.
- (lower priority) Understand the discrepancy between "go tool pprof" and the RSS reported by top. My assumption is that this is an artifact of Go's allocator hanging on to this memory so that the next call to make will not require it to go back to malloc(3), but …
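To make the PutBlock idea concrete, here is a rough sketch of streaming a request body to a temporary file while computing the MD5 with io.TeeReader. The function name, on-disk layout, and error handling are assumptions for illustration, not the change that was eventually merged:

package keepsketch

import (
    "crypto/md5"
    "fmt"
    "io"
    "io/ioutil"
    "os"
    "path/filepath"
)

// putBlockStreaming copies body into a temporary file in volumeDir,
// checksumming as it goes, and renames the file into place only if the
// computed MD5 matches the expected locator. The block is never held in
// memory; only io.Copy's small transfer buffer is allocated.
// Sketch only; not the actual Keep implementation.
func putBlockStreaming(volumeDir, locator string, body io.Reader) error {
    tmp, err := ioutil.TempFile(volumeDir, "tmp")
    if err != nil {
        return err
    }
    defer os.Remove(tmp.Name()) // harmless if the rename below succeeded
    defer tmp.Close()

    hash := md5.New()
    // TeeReader feeds every byte read from body into the hasher.
    if _, err := io.Copy(tmp, io.TeeReader(body, hash)); err != nil {
        return err
    }
    if sum := fmt.Sprintf("%x", hash.Sum(nil)); sum != locator {
        return fmt.Errorf("checksum mismatch: computed %s, expected %s", sum, locator)
    }
    return os.Rename(tmp.Name(), filepath.Join(volumeDir, locator))
}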
Updated by Ward Vandewege over 10 years ago
I did some testing, and this branch looks good to me. Please merge.
For future reference, I also experimented with replacing the defer runtime.GC() lines with defer debug.FreeOSMemory(), which seems to lead to (much) lower but much more variable memory use. With 2 GET and 2 PUT clients accessing 2 different blocks with watch -n0.5, I saw a fairly stable RSS of 400M with this branch. When using FreeOSMemory instead, memory use was about half that on average, but much more variable (between 64M and 400M). CPU load seemed roughly 10% higher.
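For context, the two variants being compared amount to the following. The handlers are stand-ins, not the actual Keep code:

package main

import (
    "net/http"
    "runtime"
    "runtime/debug"
)

// Variant used in this branch: force a garbage collection as the handler
// returns, so large per-request buffers don't accumulate between requests.
func getBlockGC(w http.ResponseWriter, req *http.Request) {
    defer runtime.GC()
    // ... locate and serve the block ...
}

// Alternative discussed above: also return freed pages to the OS right
// away. Lower but more variable RSS, at some extra CPU cost.
func getBlockFreeOS(w http.ResponseWriter, req *http.Request) {
    defer debug.FreeOSMemory()
    // ... locate and serve the block ...
}

func main() {
    http.HandleFunc("/gc/", getBlockGC)
    http.HandleFunc("/freeos/", getBlockFreeOS)
    http.ListenAndServe(":25107", nil)
}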
Updated by Tim Pierce over 10 years ago
- Status changed from In Progress to Resolved