https://dev.arvados.org/https://dev.arvados.org/favicon.ico?15576888422018-06-13T17:03:58ZArvadosArvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=634632018-06-13T17:03:58ZTom Morristfmorris@veritasgenetics.com
<ul><li><strong>Target version</strong> set to <i>To Be Groomed</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=660302018-08-21T19:42:12ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Subject</strong> changed from <i>[Controller] Federated listing API</i> to <i>[Controller] Federated multi object retrieval</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/66030/diff?detail_id=62949">diff</a>)</li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=660372018-08-21T21:06:43ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/66037/diff?detail_id=62959">diff</a>)</li><li><strong>Story points</strong> set to <i>3.0</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=660382018-08-21T21:06:51ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Target version</strong> changed from <i>To Be Groomed</i> to <i>Arvados Future Sprints</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=666502018-09-11T21:07:41ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-3 priority-4 priority-default closed parent" href="/issues/14197">Feature #14197</a>: [controller] Federated container requests</i> added</li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=669802018-09-19T15:28:10ZTom Morristfmorris@veritasgenetics.com
<ul><li><strong>Target version</strong> changed from <i>Arvados Future Sprints</i> to <i>2018-10-03 Sprint</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=669822018-09-19T15:29:42ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Assigned To</strong> set to <i>Peter Amstutz</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=672202018-09-27T21:29:56ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>13619-fed-object-list @ <a class="changeset" title="13619: Test that "select" is passed through multi-object query Arvados-DCO-1.1-Signed-off-by: Pe..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/395e42cc2eb11b52ab5e36b29421edcdea3a3dbf">395e42cc2eb11b52ab5e36b29421edcdea3a3dbf</a></p>
<p><a class="external" href="https://ci.curoverse.com/view/Developer/job/developer-run-tests/907/">https://ci.curoverse.com/view/Developer/job/developer-run-tests/907/</a></p>
<p>Mostly done, but on re-reading the description, there are some details with paging that I haven't addressed.</p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=672392018-09-28T20:14:24ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>13619-fed-object-list @ <a class="changeset" title="13619: Document availability of federation features Arvados-DCO-1.1-Signed-off-by: Peter Amstutz..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/8ba7223586eabdbc2bcc2cf8cc143ce624286e50">8ba7223586eabdbc2bcc2cf8cc143ce624286e50</a></p>
<ul>
<li>Read in filters ["uuid", "in", [...]] and ["uuid", "=", "..."] and split them up based on cluster prefix</li>
<li>If more than one cluster prefix is listed, engages the multi-cluster query code</li>
<li>Check if filter size exceeds max page size (requires new config.yml configuration parameter) (with test)</li>
<li>Error if limit, offset, or order is provided (with tests)</li>
<li>Fails the whole query on error (with tests)</li>
<li>Loop and request missing uuids if previous query returned results and not all results have been returned</li>
<li>Implemented generically, supports containers, container requests and workflows.</li>
<li>Updated documentation</li>
</ul>
<p><a class="external" href="https://ci.curoverse.com/view/Developer/job/developer-run-tests/908/">https://ci.curoverse.com/view/Developer/job/developer-run-tests/908/</a></p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=672532018-10-01T13:20:24ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=672622018-10-01T15:20:16ZTom Cleggtom@curii.com
<ul></ul><ul>
<li>In docs, remove some extra "must" in the "list method must:" items (and I think that should be "request must", not "method must"?</li>
<li>(*multiClusterQueryResponseCollector)collectResponse() should un-pyramid the non-error case, do "if err { return }" instead</li>
<li>(*multiClusterQueryResponseCollector)collectResponse() does way too much type-casting. Should check response status first, then decode into an appropriate struct type. E.g., if an element of items doesn't decode to a map[string]interface{} then we might as well fail at the decoding stage instead of ignoring this later in remoteQueryUUIDs().</li>
<li>(*genericFederatedRequestHandler)remoteQueryUUIDs() will not terminate in various cases (remote keeps returning the same uuids, or entries without a "uuid" key) -- could fix by stopping the loop when found doesn't grow, rather than when the remote returns nothing</li>
<li>ParallelRemoteRequests should be Concurrent... and I wonder if this should be a server-wide limit, or renamed to indicate it's a per-incoming-request limit?</li>
<li>Instead of url.ParseQuery(req.URL.RawQuery) and loadParamsFromForm, could we use stdlib (*http.Request).ParseForm()?</li>
<li>loadParamsFromJson appears to be unused</li>
<li>Prefer params.Get("foo") != "" over <code>if len(params["foo"])==1 { ... params["foo"][0] }</code> unless something like "?foo" is supposed to trigger the non-empty case
<ul>
<li>"if len(params["select"]) == 1" treats <code>?select=foo&select=bar</code> as if no select were given at all -- intentional?</li>
</ul>
</li>
<li>Prefer remoteParams.Set("_method", "GET") over remoteParams["_method"] = []string{"GET"}
<ul>
<li>...or when initializing, might as well use remoteParams := url.Values{"_method": {"GET"}, "count": {"none"}}</li>
</ul>
</li>
<li>In (*genericFederatedRequestHandler) handleMultiClusterQuery(), r.(string) == "uuid" crashes if r isn't a string; maybe <code>selects</code> should be <code>[]string</code> instead of <code>[]interface{}</code>?</li>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), "for _, f1 := range filters" -- what's the "1" referring to in "f1"? Could it be called "filter"?</li>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), it looks like filters=<a class="wiki-page new" href="https://dev.arvados.org/projects/arvados/wiki/%22uuid%22%22in%22%5B1">"uuid","in",[1</a> would crash</li>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), should un-pyramid the <code>if ok && lhs == "uuid"</code> block -- just needs "if col, ok := f1<sup><a href="#fn0">0</a></sup>.(string); !ok || col != "uuid" { return false }" </li>
<li>Default req concurrency should be 4 if not given in config file, not 0 (deadlock?)</li>
<li>Default page size should be 1000 if not given in config file, not 0</li>
</ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=672702018-10-01T19:50:32ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>Tom Clegg wrote:</p>
<blockquote>
<ul>
<li>In docs, remove some extra "must" in the "list method must:" items (and I think that should be "request must", not "method must"?</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>(*multiClusterQueryResponseCollector)collectResponse() should un-pyramid the non-error case, do "if err { return }" instead</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>(*multiClusterQueryResponseCollector)collectResponse() does way too much type-casting. Should check response status first, then decode into an appropriate struct type. E.g., if an element of items doesn't decode to a map[string]interface{} then we might as well fail at the decoding stage instead of ignoring this later in remoteQueryUUIDs().</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>(*genericFederatedRequestHandler)remoteQueryUUIDs() will not terminate in various cases (remote keeps returning the same uuids, or entries without a "uuid" key) -- could fix by stopping the loop when found doesn't grow, rather than when the remote returns nothing</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>ParallelRemoteRequests should be Concurrent... and I wonder if this should be a server-wide limit, or renamed to indicate it's a per-incoming-request limit?</li>
</ul>
</blockquote>
<p>Renamed to FederatedRequestConcurrency</p>
<blockquote>
<ul>
<li>Instead of url.ParseQuery(req.URL.RawQuery) and loadParamsFromForm, could we use stdlib (*http.Request).ParseForm()?</li>
</ul>
</blockquote>
<p>Done, the tricky bit is buffering the body for the cases where you are going to proxy the request and need to read the body a second time.</p>
<blockquote>
<ul>
<li>loadParamsFromJson appears to be unused</li>
</ul>
</blockquote>
<p>In a previous branch I had it looking in the json payload for cluster_id, but on further consideration I'm fairly certain none of our SDKs put in there (it goes in the URL query portion) so I removed the unnecessary complexity.</p>
<blockquote>
<ul>
<li>Prefer params.Get("foo") != "" over <code>if len(params["foo"])==1 { ... params["foo"][0] }</code> unless something like "?foo" is supposed to trigger the non-empty case</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>"if len(params["select"]) == 1" treats <code>?select=foo&select=bar</code> as if no select were given at all -- intentional?</li>
</ul>
</blockquote>
<p>Does the API server do the right thing if you supply that? eg does rails turn that into a list?</p>
<blockquote>
<ul>
<li>Prefer remoteParams.Set("_method", "GET") over remoteParams["_method"] = []string{"GET"}
<ul>
<li>...or when initializing, might as well use remoteParams := url.Values{"_method": {"GET"}, "count": {"none"}}</li>
</ul></li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>In (*genericFederatedRequestHandler) handleMultiClusterQuery(), r.(string) == "uuid" crashes if r isn't a string; maybe <code>selects</code> should be <code>[]string</code> instead of <code>[]interface{}</code>?</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), "for _, f1 := range filters" -- what's the "1" referring to in "f1"? Could it be called "filter"?</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), it looks like filters=<a class="wiki-page new" href="https://dev.arvados.org/projects/arvados/wiki/%22uuid%22%22in%22%5B1">"uuid","in",[1</a> would crash</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>In (*genericFederatedRequestHandler)handleMultiClusterQuery(), should un-pyramid the <code>if ok && lhs == "uuid"</code> block -- just needs "if col, ok := f1<sup><a href="#fn0">0</a></sup>.(string); !ok || col != "uuid" { return false }"</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>Default req concurrency should be 4 if not given in config file, not 0 (deadlock?)</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>Default page size should be 1000 if not given in config file, not 0</li>
</ul>
</blockquote>
<p>Fixed.</p>
<p>13619-fed-object-list @ <a class="changeset" title="13619: Code cleanups Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <pamstutz@veritasgenetics.com>" href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/cdaca3963cf2d07e81eb71ac33e4c0966dec9b93">cdaca3963cf2d07e81eb71ac33e4c0966dec9b93</a></p>
<p><a class="external" href="https://ci.curoverse.com/view/Developer/job/developer-run-tests/910">https://ci.curoverse.com/view/Developer/job/developer-run-tests/910</a></p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=673082018-10-02T13:56:47ZTom Cleggtom@curii.com
<ul></ul><p>Peter Amstutz wrote:</p>
<blockquote>
<p>Tom Clegg wrote:</p>
<blockquote>
<ul>
<li>ParallelRemoteRequests should be Concurrent... and I wonder if this should be a server-wide limit, or renamed to indicate it's a per-incoming-request limit?</li>
</ul>
</blockquote>
<p>Renamed to FederatedRequestConcurrency</p>
</blockquote>
<p>I wonder if Federated → MultiCluster would help suggest that it applies to an individual request?</p>
<p>Can we move these to a Limits section like <a class="wiki-page" href="https://dev.arvados.org/projects/arvados/wiki/Cluster_configuration">Cluster configuration</a>? I'm not sure "Limits" is the perfect category name but putting them at the top level doesn't seem like a great way to start. RequestLimits?</p>
<blockquote><blockquote>
<ul>
<li>"if len(params["select"]) == 1" treats <code>?select=foo&select=bar</code> as if no select were given at all -- intentional?</li>
</ul>
</blockquote>
<p>Does the API server do the right thing if you supply that? eg does rails turn that into a list?</p>
</blockquote>
<p>It seems to take the last one. (url.Values)Get() takes the first.</p>
<p>The way it is is fine for now. But longer term, I think we should be parsing the incoming request using the same (hopefully stdlib-based) approach across the board, then build new outgoing request(s) if needed. That way we would be sending the same "select" param value we're basing our response assumptions on, for example.</p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=673102018-10-02T15:41:05ZTom Cleggtom@curii.com
<ul></ul><p>Peter Amstutz wrote:</p>
<blockquote>
<p>Tom Clegg wrote:</p>
<blockquote>
<ul>
<li>Default req concurrency should be 4 if not given in config file, not 0 (deadlock?)</li>
</ul>
</blockquote>
<p>Fixed.</p>
<blockquote>
<ul>
<li>Default page size should be 1000 if not given in config file, not 0</li>
</ul>
</blockquote>
<p>Fixed.</p>
</blockquote>
<p>Are these exported on purpose? (If we're going to export defaults, I think it would be better to do it through the Cluster config object itself, not controller.Handler.)</p>
<pre><code>func (h *Handler) FederatedRequestConcurrency() int<br />func (h *Handler) MaxItemsPerResponse() int</code></pre>
<p>In doc/api/methods/containers.html.textile.liquid I don't think we should claim to support federated create/update -- even if we do forward the API calls, there isn't really a use case.</p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=673272018-10-02T20:53:28ZPeter Amstutzpeter.amstutz@curii.com
<ul></ul><p>Tom Clegg wrote:</p>
<blockquote>
<p>I wonder if Federated → MultiCluster would help suggest that it applies to an individual request?</p>
<p>Can we move these to a Limits section like <a class="wiki-page" href="https://dev.arvados.org/projects/arvados/wiki/Cluster_configuration">Cluster configuration</a>? I'm not sure "Limits" is the perfect category name but putting them at the top level doesn't seem like a great way to start. RequestLimits?</p>
</blockquote>
<p>Done</p>
<blockquote>
<p>Are these exported on purpose? (If we're going to export defaults, I think it would be better to do it through the Cluster config object itself, not controller.Handler.)</p>
</blockquote>
<p>Added methods to arvados.Cluster.RequestLimits</p>
<blockquote>
<p>In doc/api/methods/containers.html.textile.liquid I don't think we should claim to support federated create/update -- even if we do forward the API calls, there isn't really a use case.</p>
</blockquote>
<p>Right, fixed.</p>
<p>13619-fed-object-list @ <a class="changeset" title="13619: Move configuration options to arvados.Cluster.RequestLimits Arvados-DCO-1.1-Signed-off-by..." href="https://dev.arvados.org/projects/arvados/repository/arvados/revisions/cd52bc7ce67cb315e9d6175df4c750e190b9207e">cd52bc7ce67cb315e9d6175df4c750e190b9207e</a></p>
<p><a class="external" href="https://ci.curoverse.com/view/Developer/job/developer-run-tests/914/">https://ci.curoverse.com/view/Developer/job/developer-run-tests/914/</a></p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=673832018-10-03T13:22:01ZTom Cleggtom@curii.com
<ul></ul><p>LGTM, thanks</p> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=674012018-10-03T15:23:09ZPeter Amstutzpeter.amstutz@curii.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul> Arvados - Feature #13619: [Controller] Federated multi object retrievalhttps://dev.arvados.org/issues/13619?journal_id=686422018-11-13T20:49:35ZTom Morristfmorris@veritasgenetics.com
<ul><li><strong>Release</strong> set to <i>14</i></li></ul>