Idea #19791: Write Python SDK overview - Arvados

Idea #19791

closed

Write Python SDK overview

Added by Brett Smith over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assigned To: Brett Smith
Category: SDKs
Target version: 2022-12-07 Sprint
Start date: 11/25/2022
Due date:
Story points:
Release: Arvados 2.5.0
Release relationship: Auto

Description

Write a document that explains:

The Python SDK dynamically implements interfaces from the Arvados API.
Every resource type has a corresponding class.
API methods can be called as instance methods.
Arguments to methods are passed as keyword arguments.
Explain some of the abstractions like calling .execute().
Link to documentation for the Arvados API and google-api-python-client for further reading as appropriate.

The goal is for this to be a single document that helps people understand the patterns that the API client follows and how it relates to the Arvados API.

IMO you can implement this by building out the current "Examples" page of the documentation. The orientation should be high-level enough to stay useful even after we have full reference documentation from issue #18799.
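
The pattern this description asks the document to cover can be sketched roughly as follows. This is a minimal illustration rather than text from the eventual documentation: the function names `get_collection` and `rename_collection` are hypothetical, and `arv_client` is assumed to be a client object of the kind returned by `arvados.api('v1')`.

```python
def get_collection(arv_client, uuid):
    """Fetch one collection record by UUID."""
    # collections() returns the resource object for the "collections" type.
    # get(...) only builds a request from keyword arguments; nothing is sent yet.
    request = arv_client.collections().get(uuid=uuid)
    # execute() sends the request and returns the decoded response.
    return request.execute()


def rename_collection(arv_client, uuid, new_name):
    """Update one field of an existing collection."""
    # The same resource -> method -> keyword arguments -> execute() chain
    # applies to other API methods; only the names and arguments change.
    return arv_client.collections().update(
        uuid=uuid,
        body={'collection': {'name': new_name}},
    ).execute()
```

Because both functions take the client as a parameter, they can be exercised with any object that follows the same call chain, which also makes the chain itself easy to see.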

Updated by Peter Amstutz over 2 years ago

Target version set to 2022-12-07 Sprint

Updated by Peter Amstutz over 2 years ago

Assigned To set to Brett Smith

Updated by Brett Smith over 2 years ago

Status changed from New to In Progress

Updated by Peter Amstutz over 2 years ago

In the get section, I would split the first and second examples into two separate code blocks.

In the list section, it would be better to direct people to use arvados.util.keyset_list_all to handle paging instead of doing it manually.
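
A rough sketch of what that recommendation could look like in practice is below. The function name `iter_projects_owned_by` and the particular filters are hypothetical choices for illustration; `arv_client` is assumed to be a client from `arvados.api('v1')`.

```python
def iter_projects_owned_by(arv_client, owner_uuid):
    """Yield every project owned by owner_uuid, across all pages."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where the SDK is not installed.
    import arvados.util

    # keyset_list_all takes the list method itself (not a call to it),
    # plus the keyword arguments to pass along, and yields each item.
    yield from arvados.util.keyset_list_all(
        arv_client.groups().list,
        filters=[
            ['owner_uuid', '=', owner_uuid],
            ['group_class', '=', 'project'],
        ],
    )
```

The caller iterates over the results without tracking pages at all, which is the main argument for pointing readers at the helper.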

I think an additional section that describes using groups().contents() to get project contents would be a good idea.
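
For reference, a call of that kind might look like the sketch below. The function name `list_project_contents` is hypothetical, and `arv_client` is again assumed to come from `arvados.api('v1')`.

```python
def list_project_contents(arv_client, project_uuid, limit=100):
    """Return one page of the items stored directly in a project."""
    # contents is a method on the groups resource; like the common
    # methods, it takes keyword arguments and needs execute() to run.
    response = arv_client.groups().contents(
        uuid=project_uuid,
        limit=limit,
    ).execute()
    # The response is a list-style result; the records are under 'items'.
    return response['items']
```

This returns only a single page, so it inherits the same paging caveats as any other list call.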

For create, it's better to nest the object's contents under a field named for the object type. So instead of this:

project = arv_client.groups().create(
    body={
        'name': 'Python SDK Test Project',
        'group_class': 'project',
    },
    ensure_unique_name=True,
).execute()

Do this:

project = arv_client.groups().create(
    body={
        'group': {
            'name': 'Python SDK Test Project',
            'group_class': 'project'
        }
    },
    ensure_unique_name=True,
).execute()

(Background: the second form matches the API definition more closely. The first one works because the Rails API does a fixup that moves fields from the top level into the object, but it runs a small risk of confusion between top-level parameters like 'limit' and 'select' and the object's contents.)

"To modify an existing Arvados object, call the delete method for that resource type."

I think you meant "remove an existing Arvados object."

I don't think you want to go into the weeds of setting trash_at here. It would probably be simpler to just explain that some object types go into the trash and can be recovered, and some are deleted immediately. The ones that can be removed from the trash have an untrash method.
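
A short sketch of that simpler explanation, using collections as the example of a type that goes to the trash. The function names `trash_collection` and `untrash_collection` are hypothetical, and `arv_client` is assumed to be a client from `arvados.api('v1')`.

```python
def trash_collection(arv_client, uuid):
    """Remove a collection; for this type, delete moves it to the trash."""
    return arv_client.collections().delete(uuid=uuid).execute()


def untrash_collection(arv_client, uuid):
    """Recover a collection that is still in the trash."""
    # Only resource types that support the trash have an untrash method.
    return arv_client.collections().untrash(uuid=uuid).execute()
```

Neither function touches trash_at directly, which keeps the example at the level of detail suggested above.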

Updated by Brett Smith over 2 years ago

All addressed @ 7abef669c except:

Peter Amstutz wrote in #note-4:

I think an additional section that describes using groups().contents() to get project contents would be a good idea.

I think it would provide the best organization to leave this for the planned cookbook update. Right now this page covers just the API's common resource methods, and then mentions multiple times that you can use the same patterns to call other API methods. If we start covering those other methods, I don't know how we draw the line of which are worth including and which aren't. The cookbook already covers that subjective space of "stuff people ask about a lot," so let's just cover it there. (We might consider adding users/current too: that was covered in the old "Examples" page and I took it out for the same reasons.)

Updated by Peter Amstutz over 2 years ago

Brett Smith wrote in #note-5:

All addressed @ 7abef669c except:

Peter Amstutz wrote in #note-4:

I think an additional section that describes using groups().contents() to get project contents would be a good idea.

I think it would provide the best organization to leave this for the planned cookbook update. Right now this page covers just the API's common resource methods, and then mentions multiple times that you can use the same patterns to call other API methods. If we start covering those other methods, I don't know how we draw the line of which are worth including and which aren't. The cookbook already covers that subjective space of "stuff people ask about a lot," so let's just cover it there. (We might consider adding users/current too: that was covered in the old "Examples" page and I took it out for the same reasons.)

Sounds good.

If you need to retrieve all of the results for a list, you need to call the list method multiple times with the same search criteria and increasing offset arguments until no more items are returned.

This is slightly wrong: keyset_list_all doesn't use offset. Instead it uses > on order_key, which gives a more stable result if records change mid-paging.

It's probably also worth mentioning that while limit sets the maximum page size, and the default is 100, the value of limit is clamped to 1000. So if you expect there could possibly be more than 1000 items, you definitely need to use keyset_list_all.

Rest LGTM
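
To make the two points above concrete, here is a simplified, self-contained simulation of keyset paging against an in-memory list. It is not the code of arvados.util.keyset_list_all: `list_page` and `keyset_list_all_sketch` are hypothetical stand-ins, the sketch assumes every record has a distinct order_key value, and a real implementation also has to deal with ties.

```python
MAX_PAGE_SIZE = 1000  # stands in for the server-side clamp on `limit`


def list_page(records, order_key, after=None, limit=100):
    """Stand-in for one list API call: return at most one page."""
    limit = min(limit, MAX_PAGE_SIZE)  # larger requests are clamped
    matching = [r for r in records
                if after is None or r[order_key] > after]
    matching.sort(key=lambda r: r[order_key])
    return matching[:limit]


def keyset_list_all_sketch(records, order_key, limit=MAX_PAGE_SIZE):
    """Yield every record by asking for order_key > last value seen."""
    after = None
    while True:
        page = list_page(records, order_key, after=after, limit=limit)
        if not page:
            return
        yield from page
        # The next page starts after the last key seen, not at an offset,
        # so records added or removed earlier in the order cannot shift it.
        after = page[-1][order_key]


records = [{'uuid': f'item-{i:04d}', 'created_at': i} for i in range(2500)]
single_page = list_page(records, 'created_at', limit=5000)
everything = list(keyset_list_all_sketch(records, 'created_at'))
print(len(single_page), len(everything))  # prints "1000 2500"
```

A single call cannot return more than the clamp no matter what limit is requested, while the keyset loop reaches all 2500 records.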

Updated by Brett Smith over 2 years ago

Peter Amstutz wrote in #note-6:

Sounds good.

Cool, I added a note to this effect in a comment to try to help keep future writing in line with this scope.

If you need to retrieve all of the results for a list, you need to call the list method multiple times with the same search criteria and increasing offset arguments until no more items are returned.

This is slightly wrong: keyset_list_all doesn't use offset. Instead it uses > on order_key, which gives a more stable result if records change mid-paging.

I just took this out. I feel like this rationale is left over from the previous draft and not really helpful given the recommendation to just use keyset_list_all.

It's probably also worth mentioning that while limit sets the maximum page size, and the default is 100, the value of limit is clamped to 1000. So if you expect there could possibly be more than 1000 items, you definitely need to use keyset_list_all.

Done @ 9e44b1581.

Updated by Peter Amstutz over 2 years ago

Brett Smith wrote in #note-7:

Peter Amstutz wrote in #note-6:

Sounds good.

Cool, I added a note to this effect in a comment to try to help keep future writing in line with this scope.

If you need to retrieve all of the results for a list, you need to call the list method multiple times with the same search criteria and increasing offset arguments until no more items are returned.

This is slightly wrong: keyset_list_all doesn't use offset. Instead it uses > on order_key, which gives a more stable result if records change mid-paging.

I just took this out. I feel like this rationale is left over from the previous draft and not really helpful given the recommendation to just use keyset_list_all.

It's probably also worth mentioning that while limit sets the maximum page size, and the default is 100, the value of limit is clamped to 1000. So if you expect there could possibly be more than 1000 items, you definitely need to use keyset_list_all.

Done @ 9e44b1581.