Project

General

Profile

Actions

Idea #20521

open

Python tools emit logs from googleapiclient

Added by Brett Smith 11 months ago. Updated 11 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
SDKs
Target version:
Story points:
-

Description

googleapiclient logs some useful information, like when requests are automatically retried. Any time we're writing out arvados logs, we should write googleapiclient logs to the same place.

In the main branch today, the Python SDK automatically enables logging as soon as it's imported. From sdk/python/arvados/__init__.py:

# Set up Arvados logging based on the user's configuration.
# All Arvados code should log under the arvados hierarchy.
log_format = '%(asctime)s %(name)s[%(process)d] %(levelname)s: %(message)s'
log_date_format = '%Y-%m-%d %H:%M:%S'
log_handler = logging.StreamHandler()
log_handler.setFormatter(logging.Formatter(log_format, log_date_format))
logger = logging.getLogger('arvados')
logger.addHandler(log_handler)
logger.setLevel(logging.DEBUG if config.get('ARVADOS_DEBUG')
                else logging.WARNING)

The simplest version would be to add a line to the end of this stanza:

_apiclient_logger = logging.getLogger('googleapiclient')
_apiclient_logger.addHandler(log_handler)
_apiclient_logger.setLevel(logger.getLevel())

However, a library messing with global state like this, without an explicit request, is kind of antisocial. This might be a good opportunity to add an explicit "setup logging" function to the PySDK. It can function similarly to logging.basicConfig: take a level and maybe a stream or handler as arguments, then set that up for arvados and other modules we want to hear from like googleapiclient. This would give us a good place to make other changes like this in the future, like adding other modules to the logging list, without messing with the global state of all code that says import arvados.


Related issues

Related to Arvados - Bug #20531: When getting logs from child job to print error, distinguish between job couldn't start and non-zero exitResolvedPeter Amstutz05/30/2023Actions
Related to Arvados - Bug #12684: Let user specify a retry strategy on the client object, used for all API callsResolvedBrett Smith05/09/2023Actions
Related to Arvados - Idea #20613: Reveal googleapiclient retry logs during client constructionResolvedBrett Smith06/07/2023Actions
Actions #1

Updated by Brett Smith 11 months ago

  • Description updated (diff)
Actions #3

Updated by Peter Amstutz 11 months ago

  • Related to Bug #20531: When getting logs from child job to print error, distinguish between job couldn't start and non-zero exit added
Actions #4

Updated by Brett Smith 11 months ago

  • Related to Bug #12684: Let user specify a retry strategy on the client object, used for all API calls added
Actions #5

Updated by Brett Smith 11 months ago

This story came out of #12684 when we increased the default retries for all of our Python stuff. You can imagine users might be bothered by their code/tool suddenly going quiet while API requests retry. Surfacing the googleapiclient logs, where these retries get logged, is the easiest way to address that.

#20531 specifically inhibits googleapiclient logs from a-c-r, except when debugging is specifically requested. And I think that's legitimate too. We have also seen cases where users see logs about retried operations, and they think it's a bigger issue than it really is.

I'm not totally set on this but just as a starting point for discussion, I think a good solution might be something like:

  • For all tools, if the user/admin requests debug logging, enable googleapiclient logs.
  • For noninteractive tools, do not enable googleapiclient logs by default. While the information is useful in principle, I think that's outweighed by the way they distract from the logs you "really" care about.
  • For interactive tools, what would be ideal is if we could detect a retry, and then emit a log saying something like, "Failed to do [task] in Arvados because of [tempfail reason]. I will keep trying, but this might take longer than usual." And probably throttle this message so it can only appear once per hour or something.

    This is probably doable by writing a custom log handler and installing it on googleapiclient. It could detect the retry message and then emit the message we want, and just ignore everything else. This is… much more work than just enabling googleapiclient logs, but not a silly amount of work.

The definition of "interactive" is probably sys.stdin.isatty() as a starting point. A lot of the Python tools get used in both interactive and noninteractive contexts, so I don't think it's a property of the tool itself, but the way that tool is being run.

Actions #6

Updated by Brett Smith 11 months ago

Brett Smith wrote in #note-5:

This is probably doable by writing a custom log handler and installing it on googleapiclient. It could detect the retry message and then emit the message we want, and just ignore everything else. This is… much more work than just enabling googleapiclient logs, but not a silly amount of work.

Correction: we want a filter object. For the logs it cares about, it can mutate those records and return True. For everything else, it can return False. It can implement the throttle logic too.

The filter API is not my favorite, but it's designed for exactly this, and the implementation would probably be simpler for us than a handler like I described above.

Actions #7

Updated by Peter Amstutz 11 months ago

  • Target version changed from Future to Development 2023-06-07
  • Assigned To set to Brett Smith
Actions #8

Updated by Brett Smith 11 months ago

  • Related to Idea #20613: Reveal googleapiclient retry logs during client construction added
Actions #9

Updated by Brett Smith 11 months ago

  • Target version changed from Development 2023-06-07 to Future
  • Assigned To deleted (Brett Smith)
Actions

Also available in: Atom PDF