Bug #3147

Updated by Tom Clegg over 9 years ago

Client libraries to address: 
 * Perl (arguably most important, because crunch-job uses it) 
 * Python (second most important because most crunch scripts use it) 
 * Ruby 
 * arv-run-pipeline-instance (assuming it's still not using the Ruby SDK) 
 * Workbench (assuming it's still not using the Ruby SDK) 
 * arv (assuming it's still not using the Ruby SDK) 
 * Java 
 * Go 

 Desired behavior: 
 * Transactions that time out or produce 5xx errors should be reattempted after a delay 
 * Transactions that produce 4xx errors should not be reattempted 
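
 The two rules above could be sketched roughly as follows. This is a minimal illustration in Python 3, not code from any Arvados SDK: @is_retryable@, @call_with_retries@, @TransientError@, and the default try count and delays are all hypothetical names and values chosen for the example. It assumes the request is wrapped in a zero-argument callable that returns a @(status, body)@ pair.

```python
import socket
import time


class TransientError(Exception):
    """Hypothetical marker for a 5xx response that exhausted all retries."""


def is_retryable(status):
    """Return True for 5xx statuses; 4xx (and anything else) is final."""
    return 500 <= status <= 599


def call_with_retries(request, max_tries=4, initial_delay=1.0,
                      backoff=2.0, sleep=time.sleep):
    """Call request() until it gives a non-retryable result or tries run out.

    request is a hypothetical zero-argument callable returning (status, body).
    Timeouts and connection failures (e.g. "Connection refused") are treated
    the same way as 5xx responses: wait, then try again.
    """
    delay = initial_delay
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            status, body = request()
        except (socket.timeout, TimeoutError, ConnectionError) as error:
            last_error = error
        else:
            if not is_retryable(status):
                # Success or a 4xx error: hand it back without retrying.
                return status, body
            last_error = TransientError("HTTP %d" % status)
        if attempt < max_tries:
            sleep(delay)
            delay *= backoff  # wait longer before each further attempt
    raise last_error
```

 With these defaults a request that keeps failing is attempted four times, with waits of 1, 2, and 4 seconds in between. The @sleep@ parameter exists only so the delay can be replaced in tests.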

 Background/example: 

 The Keep services were restarted during an upload, which made them temporarily unavailable. In this case arv-put crashes with an exception instead of retrying for a while.

 <pre> 
 Exception in thread Thread-72: 
 Traceback (most recent call last): 
   File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner 
     self.run() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 213, in run 
     body=self.args['data']) 
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request 
     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey) 
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request 
     (response, content) = self._conn_request(conn, request_uri, method, body, headers) 
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1300, in _conn_request 
     conn.connect() 
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 913, in connect 
     raise socket.error, msg 
 error: [Errno 111] Connection refused 

 Traceback (most recent call last): 
   File "/usr/local/bin/arv-put", line 4, in <module> 
     main() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 376, in main 
     path, max_manifest_depth=args.max_manifest_depth) 
   File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 292, in write_directory_tree 
     path, stream_name, max_manifest_depth) 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 270, in write_directory_tree 
     self.do_queued_work() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 197, in do_queued_work 
     self._work_file() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 210, in _work_file 
     self.write(buf) 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 494, in write 
     return super(ResumableCollectionWriter, self).write(data) 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 281, in write 
     self.flush_data() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 268, in flush_data 
     super(ArvPutCollectionWriter, self).flush_data() 
   File "/usr/local/lib/python2.7/dist-packages/arvados/collection.py", line 286, in flush_data 
     self._current_stream_locators += [Keep.put(data_buffer[0:self.KEEP_BLOCK_SIZE])] 
   File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 119, in put 
     return Keep.global_client_object().put(data, **kwargs) 
   File "/usr/local/lib/python2.7/dist-packages/arvados/keep.py", line 485, in put 
     (data_hash, want_copies, have_copies)) 
 arvados.errors.KeepWriteError: Write fail for 6afcd3c55f8c02043815464f33e4d52a: wanted 2 but wrote 1 
 </pre>
