Bug #7572

[SDKs] arv-put crashes with Broken Pipe socket.error after uploading 60GB

Added by Peter Grandi over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: SDKs
Target version: -
Start date: 10/15/2015
Due date:
% Done: 100%
Estimated time:
Story points: -

Description

Testing Keepstore. Uploading midsize files (up to a few GB) with 'arv-put' works.

The first attempt to upload a 60GB file failed, with 'arv-put' crashing at the 100% mark. Further attempts to upload the same 60GB file crash at the same point. As expected, the second and later attempts show only read activity on the Keepstores, since all blobs have already been uploaded.

Context: recently installed, freshly updated setup. Not using SSO but direct token.

From API server log:

Started PUT "/arvados/v1/users/gcam1-tpzed-42l58gq9xqdzxkb" for 127.0.0.1 at 2015-10-13 14:52:32 +0000
Processing by Arvados::V1::UsersController#update as */*
  Parameters: {"api_token"=>"60l1om9jukg1y7qpu1a6uqeevd29zc9rruqe0yc3anbg3k6b7f", "reader_tokens"=>"[false]", "user"=>"{\"first_name\":\"Librarian\",\"prefs\":{\"getting_started_shown\":\"2015-09-09T12:05:04.170+00:00\"}}", "id"=>"gcam1-tpzed-42l58gq9xqdzxkb"}
WARNING: Can't verify CSRF token authenticity
  Rendered text template (0.0ms)
Completed 200 OK in 65.3ms (Views: 0.6ms | ActiveRecord: 9.3ms)

The crash report itself:

$ arv-put --replication 1 --no-resume --project-uuid gcam1-j7d0g-soxcyrmt3m87u2z --name dbNSFP2.5 dbNSFP2.5/
    60492M / 60492M 100.0% 
    Traceback (most recent call last):
      File "/usr/local/bin/arv-put", line 4, in <module>
        main()
      File "/usr/local/lib/python2.7/dist-packages/arvados/commands/put.py", line 517, in main
        ).execute(num_retries=args.retries)
      File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 142, in positional_wrapper
        return wrapped(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 722, in execute
        body=self.body, headers=self.headers)
      File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 54, in _intercept_http_request
        return self.orig_http_request(uri, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1609, in request
        (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
      File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1351, in _request
        (response, content) = self._conn_request(conn, request_uri, method, body, headers)
      File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1273, in _conn_request
        conn.request(method, request_uri, body, headers)
      File "/usr/lib/python2.7/httplib.py", line 979, in request
        self._send_request(method, url, body, headers)
      File "/usr/lib/python2.7/httplib.py", line 1013, in _send_request
        self.endheaders(body)
      File "/usr/lib/python2.7/httplib.py", line 975, in endheaders
        self._send_output(message_body)
      File "/usr/lib/python2.7/httplib.py", line 835, in _send_output
        self.send(msg)
      File "/usr/lib/python2.7/httplib.py", line 811, in send
        self.sock.sendall(data)
      File "/usr/lib/python2.7/ssl.py", line 329, in sendall
        v = self.send(data[count:])
      File "/usr/lib/python2.7/ssl.py", line 298, in send
        v = self._sslobj.write(data)
    socket.error: [Errno 32] Broken pipe

My current wild guess is some kind of timeout.
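
If that guess is right, the API connection sits idle while the blobs go to the Keepstores, something drops it, and the final request is written to a dead socket. A minimal sketch of one possible workaround follows: reconnect and retry when the send fails with a broken pipe or connection reset. The names `call_with_reconnect`, `make_request` and `reconnect` are hypothetical and not part of the Arvados SDK, and this is not necessarily the fix that went into #7587.

```python
import errno
import socket


def call_with_reconnect(make_request, reconnect, max_attempts=3):
    """Call make_request(); on a stale-connection error, reconnect and retry.

    make_request and reconnect are hypothetical callables supplied by the
    caller: the first sends the request and returns its result, the second
    discards the old connection and opens a fresh one.
    """
    retryable = (errno.EPIPE, errno.ECONNRESET)
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except socket.error as exc:
            # Re-raise errors that do not look like a dropped connection,
            # and give up once the last attempt has failed.
            if exc.errno not in retryable or attempt == max_attempts:
                raise
            reconnect()
```

Blindly retrying is only safe when the request can be repeated without side effects, so a real client would need to check that a retried create request cannot produce a duplicate object.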


Related issues

Related to Arvados - Bug #7587: [SDKs] Python Google API client raises Broken Pipe socket.error after sitting idle for some time | Resolved | 10/22/2015

History

#1 Updated by Brett Smith over 6 years ago

  • Subject changed from Keepstore: crash at end of uploading 60GB file to [SDKs] arv-put crashes with Broken Pipe socket.error after uploading 60GB
  • Category set to SDKs

Peter,

Thanks for reporting this. We've actually seen the same issue in other contexts, including long-running jobs. #7587 has become the umbrella issue where we're trying to track down the root cause. If you want to get updates about it, please add yourself as a watcher to that issue.

#2 Updated by Peter Grandi over 6 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

We have installed the Debian package python-arvados-python-client-0.1.20151024192127, which presumably contains the fix mentioned in https://dev.arvados.org/issues/7587#note-15, and uploads of 60GB files now succeed. Many thanks for looking into this (and the other issue).
