Feature #17074
Updated by Peter Amstutz about 1 year ago
Calculating "items_available" in list requests is expensive. For example, on a user cluster, they have a project with over 2 million collections and it takes around 10 seconds to return 50 items. On another user cluster, they have 600,000 projects and the "shared with me" groups list takes over 5 seconds to return 50 items. I will have to make some manual API calls to double check but I believe if we use "count=none" then it will skip having to count all rows and results will be returned much faster (a few hundred ms).

Using "offset" for paging is also expensive, because it throws away results. Without knowing the total count, offset paging cannot provide a "navigate to last page" option, as it does not know how many pages there are. Offset paging also produces unexpected results if list order changes during navigation.

We want to make the following changes:

# all list requests use count=none unless there's some special reason it needs it (for example, displaying the total number of search results)
# paging is adjusted to use keyset paging. Results are ordered by [sort column, uuid] and the search query uses something like [sort column >= last seen, uuid > last seen].
# User can navigate to first page, last page, next page, previous page. User cannot navigate to an arbitrary page.
# The last seen sort column and uuid should be included in the query part of the URL in the URL bar so that someone can still copy and paste a logical link to the page.

h2. NOTE

The "contents" endpoint doesn't completely support keyset paging; it works by concatenating tables (manually in ruby code instead of a UNION query) and the order that tables are queried and returned isn't lexically ordered.
<pre>
klasses = [Group, Job, PipelineInstance, PipelineTemplate, ContainerRequest, Workflow, Collection, Human, Specimen, Trait]
</pre>

If we remove the deprecated stuff:

<pre>
klasses = [Group, ContainerRequest, Workflow, Collection]
</pre>

We get the object ids, in the same order:

* j7d0g
* xvhdp
* 7fd4e
* 4zz18

Which means Workflow (7fd4e) and Collection (4zz18) are out of order: both sort lexically before xvhdp. So getting contents ordered by uuid won't be totally ordered. There's a workaround where the client passes back last_object_class, so we might be able to get away with that.
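For reference, here is a rough sketch of the keyset paging idea from the proposed changes, run over an in-memory array rather than the real API or database. Everything in it is an assumption made for illustration: @Row@, @keyset_page@ and @all_pages@ are hypothetical names, and "name" stands in for whatever the sort column is. It compares the [sort column, uuid] pair as a whole against the last seen pair, which is the strict form of the "something like" filter above, so rows that share a sort value with the last seen row are neither skipped nor repeated.

```ruby
# Hypothetical row type standing in for a listed object.
Row = Struct.new(:name, :uuid)

# Return up to `limit` rows ordered by [name, uuid] that come strictly
# after the (last_name, last_uuid) cursor. A nil cursor means first page.
def keyset_page(rows, limit:, last_name: nil, last_uuid: nil)
  sorted = rows.sort_by { |r| [r.name, r.uuid] }
  unless last_name.nil?
    cursor = [last_name, last_uuid]
    # Array#<=> compares element by element, so this is equivalent to
    # (name > last_name) OR (name == last_name AND uuid > last_uuid).
    sorted = sorted.select { |r| ([r.name, r.uuid] <=> cursor) > 0 }
  end
  sorted.first(limit)
end

# Walk every page by feeding the last row of each page back in as the
# cursor. No offset and no total count are needed.
def all_pages(rows, limit:)
  pages = []
  last = nil
  loop do
    page = keyset_page(rows, limit: limit,
                       last_name: last&.name, last_uuid: last&.uuid)
    break if page.empty?
    pages << page
    last = page.last
  end
  pages
end
```

The cursor is just the last seen sort value and uuid, which are the same two values the proposal wants to put in the query part of the URL. In a real implementation the sort and the comparison would happen in the database query rather than in Ruby.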