Idea #21017

open

Convert cookbook container recipes into SDK methods

Added by Brett Smith 7 months ago. Updated 3 months ago.

Status: In Progress
Priority: Normal
Assigned To:
Category: SDKs
Target version:
Start date:
Due date:
Story points: 1.0

Description

A lot of code in the Python SDK cookbook that coordinates low-level API calls should just be methods that are defined in the SDK itself. Why force every user to copy-paste these code snippets when we can just give them the method directly?

Here are common general-purpose container/request functions that could be useful. I'm not sure what module they should go in. It might make sense to add them to a new arvados.container module.

def lookup(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
) -> Container | None: ...
# If container is a str, it's a UUID for one of the other types, fetch that.
# If it's a container request, return None if container_uuid is None,
# else fetch that.
# Return the resulting container.
# Note that lookup should return None *only* if it handles a container
# request with no container_uuid set.
# Other problems like malformed UUID, other object types, container request
# without container_uuid selected, etc. should all raise their corresponding
# exceptions.
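
The comments above pin down the behavior of lookup fairly precisely, so here is a minimal sketch of one way it could work. It is not the attached draft. It assumes that the middle segment of an Arvados UUID identifies the object type ("dz2f4" for containers, "xvhdp" for container requests), that records are plain mappings, and that the client exposes the usual containers().get(...).execute() call chain. The helper uuid_infix and the two constants are hypothetical names.

```python
from typing import Any, Mapping, Optional, Union

# Assumption: the middle segment of an Arvados UUID identifies the object
# type, "dz2f4" for containers and "xvhdp" for container requests.
CONTAINER_INFIX = 'dz2f4'
REQUEST_INFIX = 'xvhdp'


def uuid_infix(uuid: str) -> str:
    """Return the type segment of a UUID, or raise ValueError if malformed."""
    parts = uuid.split('-')
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"malformed UUID: {uuid!r}")
    return parts[1]


def lookup(
        client: Any,
        container: Union[str, Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Resolve a UUID, container, or container request to a container."""
    if isinstance(container, str):
        infix = uuid_infix(container)
        if infix == CONTAINER_INFIX:
            return client.containers().get(uuid=container).execute()
        if infix != REQUEST_INFIX:
            raise ValueError(f"not a container or request UUID: {container!r}")
        container = client.container_requests().get(uuid=container).execute()
    infix = uuid_infix(container['uuid'])
    if infix == CONTAINER_INFIX:
        return container
    if infix != REQUEST_INFIX:
        raise ValueError(f"unsupported object: {container['uuid']!r}")
    # A KeyError here means the request was fetched without container_uuid
    # selected. That is a caller error, so it propagates instead of
    # turning into None.
    container_uuid = container['container_uuid']
    if container_uuid is None:
        return None
    return client.containers().get(uuid=container_uuid).execute()
```

The only path that returns None is a container request whose container_uuid is explicitly None. Every other problem surfaces as ValueError or KeyError.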

def container_started(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
) -> bool:
    container_obj = lookup(client, container)
    if container_obj is None:
        return False
    else:
        return container_obj['state'] not in {'Queued', 'Locked'}

def container_finished(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
) -> bool:
    container_obj = lookup(client, container)
    if container_obj is None:
        return False
    else:
        return container_obj['state'] in {'Cancelled', 'Complete'}

def container_succeeded(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
        success: typing.Container[int]=frozenset([0]),
) -> bool:
    container_obj = lookup(client, container)
    return (
        container_obj is not None
        and container_obj['state'] == 'Complete'
        and container_obj['exit_code'] in success
    )

def child_requests(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
        filters: list[list[str]] | None=None,  # None means no extra filters
        select: list[str] | None=None,  # None means all fields
        depth: int | None=None,
) -> Iterator[ContainerRequest]: ...
# lookup container, yield from `keyset_list_all` calls, down to depth if given

def child_containers(
        client: ArvadosAPIClient,
        container: str | Container | ContainerRequest,
        request_filters: list[list[str]] | None=None,  # None means no extra filters
        container_filters: list[list[str]] | None=None,
        select: list[str] | None=None,  # None means all fields
        depth: int | None=None,
) -> Iterator[Container]: ...
# Same dance as above, call `child_requests` to get container_uuids, then
# yield from `keyset_list_all` calls with those UUIDs, down to depth if given.
# Note that `child_requests` can be called with a very narrow `select` for
# optimization, and can add the filter `container_uuid is not null`.
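
The traversal described for child_requests can be sketched as a level-by-level walk over the requesting_container_uuid field. This is an illustration under assumptions, not the attached draft. It takes a container UUID directly rather than calling lookup. It accepts a hypothetical list_all parameter with the same calling shape as arvados.util.keyset_list_all, and the default first_page stand-in only reads one page of results so that the example has no dependencies.

```python
from typing import Any, Callable, Iterator, Mapping, Optional, Sequence


def first_page(list_method: Callable[..., Any], **kwargs: Any) -> Iterator[Mapping[str, Any]]:
    """Stand-in pager (assumption): yield only the first page of results."""
    yield from list_method(**kwargs).execute()['items']


def child_requests(
        client: Any,
        container_uuid: str,
        filters: Sequence[Sequence[Any]] = (),
        select: Sequence[str] = (),
        depth: Optional[int] = None,
        list_all: Callable[..., Iterator[Mapping[str, Any]]] = first_page,
) -> Iterator[Mapping[str, Any]]:
    """Yield container requests submitted by a container and its descendants."""
    parents = [container_uuid]
    seen = {container_uuid}
    level = 0
    while parents and (depth is None or level < depth):
        kwargs: dict = {
            # Caller filters are applied at every level, so they also
            # prune which subtrees get visited.
            'filters': [['requesting_container_uuid', 'in', parents], *filters],
        }
        if select:
            # container_uuid is always needed to descend to the next level.
            kwargs['select'] = sorted({*select, 'container_uuid'})
        next_parents = []
        for request in list_all(client.container_requests().list, **kwargs):
            yield request
            child = request.get('container_uuid')
            # Several requests can share one container, so each container
            # is expanded only once.
            if child is not None and child not in seen:
                seen.add(child)
                next_parents.append(child)
        parents = next_parents
        level += 1
```

Batching each level into one `in` filter keeps the number of API calls proportional to the depth of the tree rather than to the number of containers.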

Files

container.py (8.73 KB): Draft implementation. Brett Smith, 11/28/2023 03:38 PM

Subtasks 1 (1 open, 0 closed)

Task #21247: Review (New, Peter Amstutz)

Related issues

Related to Arvados - Idea #21024: PySDK includes basic workflow reporting script (New)
#1

Updated by Brett Smith 7 months ago

  • Description updated (diff)
#2

Updated by Brett Smith 7 months ago

  • Description updated (diff)
#4

Updated by Brett Smith 5 months ago

Here's a draft implementation I did because I needed it for something else anyway. An actual branch would need:

  • Pre-discussion and agreement on the API, including names of everything
  • Tests
  • Docstrings formatted to our standards (hopefully the text is still useful at least)
  • The __main__ block removed (it's basically a cheap version of #21024)
#5

Updated by Brett Smith 5 months ago

  • Related to Idea #21024: PySDK includes basic workflow reporting script added
#6

Updated by Peter Amstutz 5 months ago

  • Target version changed from Future to Development 2024-01-03 sprint
#7

Updated by Brett Smith 5 months ago

  • Story points set to 1.0
  • Assigned To set to Brett Smith
  • Status changed from New to In Progress
#8

Updated by Peter Amstutz 5 months ago

Initial thoughts:

  • I would have a single container_status() method which returns an enum that lines up with the states presented to the user in Workbench 2.
  • In general, users only care about container_requests and the container object is an implementation detail that usually leads to confusion

Based on the cookbook, some additional methods to consider:

  • get_cwl_input() -- return mounts[cwl.input.json] and/or properties[cwl_input]
  • get_cwl_output() -- return output[cwl.output.json] and/or properties[cwl_output]
  • get_log_file() -- get a file from the log collection
  • list_containers() -- optional 'name', 'project_uuid' and 'status' which accepts the higher-level status enum used by the proposed container_status(), as well as other arbitrary filters
  • container_cancel() -- set priority to zero
  • copy_container() -- make a copy of the container request in "uncommitted" state
  • run_container() -- commit a container request
#9

Updated by Peter Amstutz 5 months ago

Takeaways from meeting Dec 6:

  • A higher level API should align conceptually with how things are presented to the user in Workbench 2
    • This is because the expected thought process of an end user using the Python SDK would be "I understand how to do a task in Workbench, how can I automate it?"
  • The proposed functionality should be defined as a class (perhaps with static methods) that flattens the ContainerRequest/Container distinction.
  • We need to choose our preferred terminology and use it consistently across WB2 and the SDK. We agree that "container request" is not meaningful to end users. Workbench 2 currently uses "Process/Subprocess" and "Workflow run/Workflow step" in various places.
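
As a rough illustration of the class-with-static-methods idea, the sketch below addresses everything by container request UUID and hides the container lookup from the caller. The class name Process and its method names are hypothetical, since the terminology is still undecided. The client is assumed to expose the usual get(...).execute() and update(...).execute() call chains.

```python
from typing import Any, Mapping, Optional


class Process:
    """Hypothetical facade: callers only ever pass a container request UUID."""

    @staticmethod
    def container_state(client: Any, request_uuid: str) -> Optional[str]:
        """Return the underlying container's state, or None if none exists."""
        request = client.container_requests().get(uuid=request_uuid).execute()
        container_uuid = request['container_uuid']
        if container_uuid is None:
            return None
        container = client.containers().get(uuid=container_uuid).execute()
        return container['state']

    @staticmethod
    def finished(client: Any, request_uuid: str) -> bool:
        """Report whether the underlying container has stopped for good."""
        state = Process.container_state(client, request_uuid)
        return state in {'Cancelled', 'Complete'}

    @staticmethod
    def cancel(client: Any, request_uuid: str) -> Mapping[str, Any]:
        """Cancel by setting the request's priority to zero."""
        return client.container_requests().update(
            uuid=request_uuid,
            body={'container_request': {'priority': 0}},
        ).execute()
```

Callers never see a container UUID here, which matches the goal of keeping the container object an implementation detail.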

Side discussion:

  • It would be neat if we made containers lightweight enough (both from a UI and dispatching point of view) that the obvious way to do operations like report generation is to launch a container and view the results from within Workbench.
#10

Updated by Peter Amstutz 4 months ago

  • Target version changed from Development 2024-01-03 sprint to Development 2024-01-17 sprint
#11

Updated by Peter Amstutz 4 months ago

  • Target version changed from Development 2024-01-17 sprint to Development 2024-01-31 sprint
#12

Updated by Peter Amstutz 3 months ago

  • Target version changed from Development 2024-01-31 sprint to Future