Bug #2865

Updated by Brett Smith almost 10 years ago

A job failed this morning because the Keep server was out of memory at the time and returned several block read errors. When the Python SDK couldn't get the requested blocks from any Keep server, it translated that failure into a "block not found" exception. See "the job log output":https://workbench.qr1hi.arvadosapi.com/collections/6a6d2a9287031e55321913c87b6afd2c+85/qr1hi-8i9sb-yf63mvltprdjwz7.log.txt?disposition=inline&size=25193. The input is fee29077095fed2e695100c299f11dc5+2727. Errors look like this:

 <pre> 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr Traceback (most recent call last): 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/tmp/crunch-job/src/crunch_scripts/test/para/grep", line 18, in <module> 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       for line in input_file.readlines(): 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 183, in readlines 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       for newdata in datasource: 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 155, in readall 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       data = self.read(size) 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 139, in read 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       data += self._stream.readfrom(locator+segmentoffset, segmentsize) 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/usr/local/lib/python2.7/dist-packages/arvados/stream.py", line 265, in readfrom 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       data += self._keep.get(locator)[segmentoffset:segmentoffset+segmentsize] 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr     File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 305, in get 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr       raise arvados.errors.NotFoundError("Block not found: %s" % expect_hash) 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr arvados.errors.NotFoundError: Block not found: 43161251a3347a55e4a826daa730977f 
 2014-05-26_15:31:38 qr1hi-8i9sb-yf63mvltprdjwz7 20503 6 stderr srun: error: compute34: task 0: Exited with exit code 1 
 </pre> 
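
One way a client could avoid conflating the two cases is sketched below. This is not the Arvados SDK's code. The names @fetch_block@, @BlockNotFound@ and @KeepReadError@ are hypothetical, and each server is modeled as a plain callable that returns an HTTP-style @(status, body)@ pair. The idea is to report "not found" only when every server answered 404, and to raise a separate read error if any server failed for another reason.

```python
class KeepError(Exception):
    """Base class for the hypothetical errors in this sketch."""


class BlockNotFound(KeepError):
    """Every server answered, and each one said the block does not exist."""


class KeepReadError(KeepError):
    """At least one server failed without a definitive answer."""


def fetch_block(locator, servers):
    """Return the block body from the first server that has it.

    `servers` is a list of callables taking a locator and returning
    (status, body). This stands in for real HTTP requests (assumption).
    """
    failures = []
    for server in servers:
        try:
            status, body = server(locator)
        except OSError as exc:
            # Connection-level failure: the server never answered.
            failures.append(str(exc))
            continue
        if status == 200:
            return body
        if status != 404:
            # e.g. a 5xx from a server that is out of memory
            failures.append("HTTP %d" % status)
    if failures:
        # Some server could not answer, so "not found" would be misleading.
        raise KeepReadError(
            "Could not read block %s: %s" % (locator, "; ".join(failures)))
    raise BlockNotFound("Block not found: %s" % locator)
```

With this split, a caller can choose to retry on @KeepReadError@ while treating @BlockNotFound@ as final.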
