Guide to writing custom reports with PySDK

Users often want to write reports like:

  • How many workflows were run in the past N days?
  • What was their final status? How many succeeded/failed/other?
  • What was their runtime, with basic stats like average/median/maximum?
  • What did they cost, with basic stats like average/median/maximum?
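
As a rough sketch of what such a report computes, the summary below works on plain dictionaries. The function name summarize and the keys status, runtime_seconds, and cost are hypothetical placeholders for illustration, not Arvados field names; a real report would derive them from the records the API returns.

```python
import collections
import statistics


def summarize(records):
    """Summarize workflow runs given as plain dicts.

    Each record is assumed to carry the hypothetical keys
    'status', 'runtime_seconds', and 'cost'.
    """
    def stats(values):
        # Return average/median/maximum, or None when there is no data.
        if not values:
            return None
        return {
            'average': statistics.mean(values),
            'median': statistics.median(values),
            'maximum': max(values),
        }

    return {
        'total': len(records),
        'by_status': dict(collections.Counter(r['status'] for r in records)),
        'runtime_seconds': stats([r['runtime_seconds'] for r in records]),
        'cost': stats([r['cost'] for r in records]),
    }


# Made-up sample data, only to show the shape of the input.
example_runs = [
    {'status': 'succeeded', 'runtime_seconds': 120, 'cost': 0.50},
    {'status': 'failed', 'runtime_seconds': 30, 'cost': 0.25},
    {'status': 'succeeded', 'runtime_seconds': 300, 'cost': 1.50},
]
summary = summarize(example_runs)
```

Keeping the summary separate from the API calls means the same function can be reused no matter which list method produced the records.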

Today, assuming you navigate the documentation as the authors hope, you work out how to do this as follows:

  • You start on the Python SDK "Arvados API client" page, which introduces writing scripts.
  • You need to know how to write filters, so you follow the link to the list method API reference and read the reference documentation for every possible filter.
  • Then you proceed to the cookbook and read about how to do specific subtasks you need like listing project contents, finding a container from its container request, etc.
  • Then you synthesize those recipes to write your report (that is, you draw the rest of the owl).

A dedicated guide would let us bridge some of the gaps here. It could provide more introductory-level documentation for:

  • The different list methods you might use (list, contents, shared) and when each one is appropriate
  • How to write filters, with examples of filters that are most common in reports
  • How to write select, with examples that select the fields most often used for reporting
  • Other tips like passing count='none'
  • How to report these out (e.g., show you can just feed Arvados objects to a csv.DictWriter)
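
To make the list above concrete, here is a minimal sketch of the kind of example such a guide might include. The helper names (recent_workflow_filters, write_csv, fetch_recent_workflows) and the REPORT_FIELDS selection are hypothetical. Treating container requests with no requesting_container_uuid as top-level workflow runs is an assumption, not something this ticket specifies. The fetch function returns a single page of results only; a real report would need to page through them.

```python
import csv
import datetime
import io

# Hypothetical choice of fields for a report; adjust to what you need.
REPORT_FIELDS = ['uuid', 'name', 'state', 'created_at', 'container_uuid']


def recent_workflow_filters(days, now=None):
    """Build list filters for container requests created in the past `days` days."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    since = now - datetime.timedelta(days=days)
    return [
        ['created_at', '>=', since.strftime('%Y-%m-%dT%H:%M:%SZ')],
        # Assumption: top-level workflow runs have no requesting container.
        ['requesting_container_uuid', '=', None],
    ]


def write_csv(records, out_file, fields=REPORT_FIELDS):
    """Write API records (plain dicts) as CSV, ignoring unselected keys."""
    writer = csv.DictWriter(out_file, fieldnames=fields, extrasaction='ignore')
    writer.writeheader()
    for record in records:
        writer.writerow(record)


def fetch_recent_workflows(days, limit=1000):
    """Fetch one page of matching container requests from Arvados."""
    # Imported here so the helpers above run without the SDK installed.
    import arvados
    client = arvados.api('v1')
    response = client.container_requests().list(
        filters=recent_workflow_filters(days),
        select=REPORT_FIELDS,
        count='none',  # skip computing the total number of matches
        limit=limit,
    ).execute()
    return response['items']


# Demonstration with a made-up record instead of a live API call.
demo_out = io.StringIO()
write_csv(
    [{'uuid': 'zzzzz-xvhdp-000000000000000', 'name': 'demo', 'state': 'Final',
      'kind': 'arvados#containerRequest'}],
    demo_out,
)
```

Because the list methods return plain dictionaries, the records can go straight into csv.DictWriter. Passing extrasaction='ignore' keeps the writer from raising an error on keys that are not in the field list.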

