Bug #20611

closed

Creating api object hangs when inside crunch container

Added by Peter Amstutz 11 months ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: SDKs
Story points: -

Description

A simple Python script, run inside a crunch container.

Without the line "api = arvados.api()", it works fine.

With that line added, the script apparently hangs. Example:

https://workbench2.scale.arvadosapi.com/processes/scale-xvhdp-f9kur2tybompfb1

Also, this traceback showed up:

Traceback (most recent call last):
  File "/home/peter/work/arvados/sdk/cwl/arvados_cwl/arvcontainer.py", line 502, in done
    "%s (%s) error log:" % (label, record["uuid"]), maxlen=40, include_crunchrun=(rcode is None or rcode > 127))
UnboundLocalError: local variable 'rcode' referenced before assignment
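The UnboundLocalError above is the usual symptom of a local variable that is only assigned on some code paths and then read on all of them. The surrounding code in arvcontainer.py is not shown in this ticket, so the following is only a minimal sketch of the pattern and one common fix. The function name summarize_exit and the record fields "state" and "exit_code" are assumptions made for illustration.

```python
def summarize_exit(record):
    """Return (rcode, include_crunchrun) for a finished container record.

    Hypothetical helper: the field names "state" and "exit_code" are
    assumptions, not taken from the ticket.
    """
    # Bind rcode before any branch so every later read is safe.
    rcode = None
    if record.get("state") == "Complete":
        rcode = record.get("exit_code")
    # Same test as in the traceback: no exit code, or a code above 127.
    include_crunchrun = rcode is None or rcode > 127
    return rcode, include_crunchrun
```

Without the initial `rcode = None`, any record that skips the `if` branch would hit the same "referenced before assignment" error when `rcode` is read.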

Here's the dumb little script I'm using:

import time
import sys

ticks = int(sys.argv[1])

print("ticking %s times" % ticks, flush=True)

import logging

logging.getLogger('googleapiclient').setLevel(logging.DEBUG)

import arvados
print("imported", flush=True)
api = arvados.api()
print("api'd", flush=True)

for i in range(1, ticks+1):
    time.sleep(10)
    print("tick %s / %s" % (i, ticks), flush=True)

print("done", flush=True)

The last thing it prints is "imported", so we know the import works but getting an API object does not.
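One thing worth noting about the script: setting a level on a logger does not attach a handler. If nothing else in the process configures logging, Python's last-resort handler only shows WARNING and above, so DEBUG records from googleapiclient would not be displayed. Whether the Arvados SDK or the container environment installs a handler is not shown in this ticket. The sketch below attaches one explicitly; the helper name enable_debug_logging is hypothetical.

```python
import logging
import sys


def enable_debug_logging(logger_name, stream=None):
    """Attach a DEBUG-level stream handler to the named logger.

    Hypothetical helper for illustration; stream defaults to stderr.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

In the script this would be called as `enable_debug_logging('googleapiclient')` before `arvados.api()`.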

Update: with num_retries=0 set, the call fails immediately instead of hanging, and the traceback shows that the DNS lookup is failing:

  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/safeapi.py", line 63, in localapi
  2023-06-05T22:37:54.507877037Z stderr     client = self.local.api
  2023-06-05T22:37:54.507877037Z stderr AttributeError: '_thread._local' object has no attribute 'api'
  2023-06-05T22:37:54.507877037Z stderr
  2023-06-05T22:37:54.507877037Z stderr During handling of the above exception, another exception occurred:
  2023-06-05T22:37:54.507877037Z stderr
  2023-06-05T22:37:54.507877037Z stderr Traceback (most recent call last):
  2023-06-05T22:37:54.507877037Z stderr   File "/keep/da8dcd92f2b8ab2860005f51fcabdf46+53/idle.py", line 19, in <module>
  2023-06-05T22:37:54.507877037Z stderr     api = arvados.api(num_retries=0)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/api.py", line 431, in api
  2023-06-05T22:37:54.507877037Z stderr     return ThreadSafeApiCache({}, {}, kwargs, version)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/safeapi.py", line 59, in __init__
  2023-06-05T22:37:54.507877037Z stderr     self.keep = keep.KeepClient(api_client=self, **keep_params)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/keep.py", line 913, in __init__
  2023-06-05T22:37:54.507877037Z stderr     self.insecure = api_client.insecure
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/safeapi.py", line 72, in __getattr__
  2023-06-05T22:37:54.507877037Z stderr     return getattr(self.localapi(), name)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/safeapi.py", line 65, in localapi
  2023-06-05T22:37:54.507877037Z stderr     client = api.api_client(**self._api_kwargs)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/api.py", line 261, in api_client
  2023-06-05T22:37:54.507877037Z stderr     **kwargs,
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
  2023-06-05T22:37:54.507877037Z stderr     return wrapped(*args, **kwargs)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/discovery.py", line 296, in build
  2023-06-05T22:37:54.507877037Z stderr     static_discovery=static_discovery,
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/discovery.py", line 422, in _retrieve_discovery_doc
  2023-06-05T22:37:54.507877037Z stderr     resp, content = req.execute(num_retries=num_retries)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
  2023-06-05T22:37:54.507877037Z stderr     return wrapped(*args, **kwargs)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/http.py", line 932, in execute
  2023-06-05T22:37:54.507877037Z stderr     headers=self.headers,
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/api.py", line 87, in _retry_request
  2023-06-05T22:37:54.507877037Z stderr     response, body = _orig_retry_request(http, num_retries, *args, **kwargs)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/http.py", line 222, in _retry_request
  2023-06-05T22:37:54.507877037Z stderr     raise exception
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/googleapiclient/http.py", line 191, in _retry_request
  2023-06-05T22:37:54.507877037Z stderr     resp, content = http.request(uri, method, *args, **kwargs)
  2023-06-05T22:37:54.507877037Z stderr   File "/usr/share/python3/dist/python3-arvados-cwl-runner/lib/python3.7/site-packages/arvados/api.py", line 132, in _intercept_http_request
  2023-06-05T22:37:54.507877037Z stderr     raise type(e)(*e.args)
  2023-06-05T22:37:54.507877037Z stderr httplib2.error.ServerNotFoundError: [req-89hr1neqwdmaksp8hzhj] Unable to find the server at scale.arvadosapi.com
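Since the failure is a name lookup, a quick check from inside the container can separate a DNS problem from an API client problem before any retries come into play. This is a minimal sketch using only the standard library; the helper name can_resolve is hypothetical.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves to at least one address."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # Name resolution failed (no such host, no resolver, etc.).
        return False
    return True
```

For the case in this ticket, `can_resolve("scale.arvadosapi.com")` run inside the container would presumably have returned False while the DNS issue was present.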


Related issues

Related to Arvados - Idea #20613: Reveal googleapiclient retry logs during client construction (Resolved, Brett Smith, 06/07/2023)
#1

Updated by Peter Amstutz 11 months ago

  • Status changed from New to In Progress
#2

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
#3

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
#4

Updated by Peter Amstutz 11 months ago

  • Description updated (diff)
#5

Updated by Brett Smith 11 months ago

httplib2.error.ServerNotFoundError: [req-89hr1neqwdmaksp8hzhj] Unable to find the server at scale.arvadosapi.com

Confirming we "knew" in advance this is one of the errors googleapiclient retries. See #12684#note-22 and the relevant source.

This is a specific example of the kind of problem I was worried about in #12684#note-29: the new retry strategy makes a regular old DNS problem seem much bigger.
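To get a feel for why a retried DNS failure can look like a hang, here is a rough sketch of how the total wait grows with the number of retries. It assumes jittered exponential backoff in which the wait before retry k is drawn uniformly from [0, base**k) seconds. The actual schedule and default retry count used by googleapiclient and the Arvados SDK are not stated in this ticket, so treat the numbers as illustrative only; backoff_bounds is a hypothetical helper.

```python
def backoff_bounds(num_retries, base=2.0):
    """Return (expected, worst_case) total sleep in seconds.

    Assumes the wait before retry k is uniform in [0, base**k),
    for k = 1..num_retries. This schedule is an assumption.
    """
    worst = sum(base ** k for k in range(1, num_retries + 1))
    # The mean of a uniform draw on [0, x) is x / 2.
    return worst / 2, worst
```

Under these assumptions, 3 retries add at most 14 seconds, while 10 retries add up to 2046 seconds (about 34 minutes) during which the script prints nothing.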

#6

Updated by Peter Amstutz 11 months ago

  • Related to Idea #20613: Reveal googleapiclient retry logs during client construction added
#7

Updated by Peter Amstutz 11 months ago

  • Target version changed from Development 2023-06-07 to Development 2023-06-21 sprint
#8

Updated by Peter Amstutz 11 months ago

The DNS issue was resolved by building a new AMI for that cluster.

#9

Updated by Peter Amstutz 11 months ago

  • Status changed from In Progress to Resolved