<h1>Arvados: Issues</h1>
<p><a href="https://dev.arvados.org/">https://dev.arvados.org/</a>, updated 2024-03-27T16:15:58Z</p>
<h2><a href="https://dev.arvados.org/issues/21633">Arvados - Task #21633 (New): Review</a> (2024-03-27T16:15:58Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<h2><a href="https://dev.arvados.org/issues/21619">Arvados - Task #21619 (In Progress): Review 21617-fed-content</a> (2024-03-26T14:10:39Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<h2><a href="https://dev.arvados.org/issues/21618">Arvados - Bug #21618 (New): cloudtest should give up if test instance disappears from listing bef...</a> (2024-03-25T16:52:07Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<p>Currently, if an instance/image has a problem that causes it to shut down before responding to a boot probe, cloudtest keeps probing after it disappears, which is clearly futile.</p>
<h2><a href="https://dev.arvados.org/issues/21617">Arvados - Bug #21617 (In Progress): Timeout error reading content from collection on a remote clu...</a> (2024-03-25T14:43:50Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<p>In a 3-way federation with login cluster z1111:</p>
<ul>
<li>a collection stored on z1111 can be read from z2222 (e.g., workbench.z2222/collections/z1111-4zz18-...)</li>
<li>a collection stored on z2222 cannot be read from z1111 (timeout)</li>
<li>a collection stored on z2222 cannot be read from z3333 (timeout)</li>
</ul>
<p>It looks like the intermediate cluster's keepstore process cannot retrieve the list of keep services from the cluster where the data is stored ("failed to validate remote token") -- this auto-retries in the background for a while, then eventually blockReadRemote gives up.</p>
<p>Manual testing, with jutro/tordo/pirca playing the roles of z1111/z2222/z3333, indicates the same problem existed before and after <a class="issue tracker-2 status-2 priority-4 priority-default parent" title="Feature: Keepstore can stream GET and PUT requests using keep-gateway API (In Progress)" href="https://dev.arvados.org/issues/2960">#2960</a> was merged and deployed to tordo.</p>
<h2><a href="https://dev.arvados.org/issues/21606">Arvados - Feature #21606 (In Progress): configurable keep-web output buffer to reduce delay betwe...</a> (2024-03-19T03:59:41Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<p>According to <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: Go FileSystem / FUSE mount supports block prefetch (Closed)" href="https://dev.arvados.org/issues/18961">#18961</a>, now that <a class="issue tracker-2 status-2 priority-4 priority-default parent" title="Feature: Keepstore can stream GET and PUT requests using keep-gateway API (In Progress)" href="https://dev.arvados.org/issues/2960">#2960</a> has reduced the TTFB for fetching a block, predicting and pre-fetching the next block appears to be more complex than it's worth.</p>
<p>Instead, in a typical scenario where the backend (keepstore→keep-web) bandwidth is faster than the frontend (keep-web→client), keep-web can reduce or eliminate the between-block delay by writing to an asynchronous output buffer. While keep-web is waiting a few milliseconds for the next block to start arriving from the backend, the client continues to receive the data that has accumulated in the output buffer.</p>
<p>The size of the output buffer should be configurable.</p>
<h2><a href="https://dev.arvados.org/issues/21599">Arvados - Feature #21599 (New): _inspect/requests endpoint should reveal whether each request is ...</a> (2024-03-15T18:45:20Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<p>This is a little inconvenient because the queue decision happens lower in the handler stack than the inspector (and we don't want to change that).</p>
<p>We can do something similar to responseLogFieldsContextKey in <a class="source" href="https://dev.arvados.org/projects/arvados/repository/arvados/entry/sdk/go/httpserver/logger.go">source:sdk/go/httpserver/logger.go</a> -- attach an atomic.Value to the request context as it passes through the Inspect handler, then have RequestLimiter Store() queue status there (queue label, time the request was released for processing), and Load() when generating the _inspect/requests report.</p>
<h2><a href="https://dev.arvados.org/issues/21598">Arvados - Bug #21598 (In Progress): Local keepstore invoked by crunch-run should never do EmptyTr...</a> (2024-03-15T18:32:48Z, Tom Clegg &lt;tom@curii.com&gt;)</h2>
<p>We don't want N compute nodes periodically checking expiry times on all of the trashed blocks on all backend volumes.</p>
<h2><a href="https://dev.arvados.org/issues/21567">Arvados - Task #21567 (In Progress): Manually verify all features (read, write, index, pull, tras...</a> (2024-02-29T15:19:49Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<h2><a href="https://dev.arvados.org/issues/21554">Arvados - Task #21554 (New): Review</a> (2024-02-28T17:03:25Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<h2><a href="https://dev.arvados.org/issues/21295">Arvados - Feature #21295 (New): Determine keep performance benchmark targets</a> (2023-12-13T16:56:55Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<p>Consult with end users on the size and shape of collections that they want better read performance (throughput and latency).</p>
<p>Come up with different test cases of reading files to measure performance metrics.</p>
<h2><a href="https://dev.arvados.org/issues/20693">Arvados - Idea #20693 (New): Design for server side coordination of multiple writers to a collection</a> (2023-06-28T18:57:06Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<p>Background:</p>
<p>Multiple Arvados services (multiple instances of keep-web, arvados-client mount, arv-mount, etc) are trying to write files to the same collection at the same time.</p>
<p>Assume they are adding/removing/changing multiple files but not making changes that directly conflict/contradict one another.</p>
<p>Requirements:</p>
<ul>
<li>If a file is created or modified it won't disappear as a result of an update from another service that didn't know about that file</li>
<li>If there is a single writer, performance impact should be minimal</li>
<li>If there are multiple writers, it is acceptable that one of them may have to wait to avoid conflicts</li>
<li>Can use pessimistic locking to ensure only one client can have a write lock at a time, attempting to open a file for writing that is locked by another should return an error on open</li>
<li>Support WebDAV lock protocol</li>
</ul>
<h2><a href="https://dev.arvados.org/issues/20444">Arvados - Task #20444 (New): Review</a> (2023-04-26T16:09:26Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<h2><a href="https://dev.arvados.org/issues/12917">Arvados - Feature #12917 (New): Support ?include=container_uuid for container request lists and g...</a> (2018-01-05T14:57:37Z, Bryan Cosca &lt;bcosca@curii.com&gt;)</h2>
<p>Users should be able to see the exit code or state=[“Queued”, “Locked”, “Running”, “Cancelled” and “Complete”] from the container_request api method, rather than having to go another layer deeper to the container api method.</p>
<a name="Proposed-solution"></a>
<h2>Proposed solution<a href="#Proposed-solution" class="wiki-anchor">¶</a></h2>
<p>Add a general "include" option to the index (list) method.</p>
<p><code>include=container</code> includes the contents of the record associated with <code>container_uuid</code>.</p>
<p>Can only be used on fields ending in <code>_uuid</code>.</p>
<p>A query like this would return:</p>
<p>?include=container</p>
<pre>
{
"items": [
{
"kind": "arvados#container_request",
"uuid": "abc-123",
"container_uuid": "xyz-123"
...
}
],
"includes": [
{
"uuid": "xyz-123",
"state": "Completed",
"exit_code": 0,
...
}
]
}
</pre>
<p>Initially we will adopt this syntax but limit the implementation to container requests only. This should be implemented on the controller side.</p>
<p>If used for anything other than <code>include=container</code> on container requests, it should return an API error.</p>
<p>Discussion points for future:</p>
<ul>
<li>selecting fields in the includes</li>
<li>Permission enforcement. Permission to read a container is based on permission to read the container_request, but this is not true generally. For example, permission to read a container request doesn't grant permission to read output_uuid.</li>
<li>Straightforward to set up a join when the _uuid field points to exactly one record type (only collections, only containers, etc) but more complex when it can point to multiple (owner_uuid, head/tail_uuid on links)</li>
</ul>
<h2><a href="https://dev.arvados.org/issues/12430">Arvados - Feature #12430 (New): Crunch2 limit output collection to glob patterns</a> (2017-10-11T13:21:49Z, Peter Amstutz &lt;peter.amstutz@curii.com&gt;)</h2>
<p>The current behavior for crunch-run is to upload all files in the output directory. This sometimes results in temporary files being uploaded that are not intended to be part of the output. Propose adding an "output_glob" field which is an array of filenames or glob patterns specifying which files and directories should be uploaded.</p>
<p>Specifically:</p>
<ul>
<li><code>output_glob</code> takes an array of strings.</li>
<li>If empty, fall back to default behavior (capture entire output).</li>
<li>Only basic Unix globs, with <code>?</code> and <code>*</code> wildcards.</li>
<li>The output only includes paths that match at least one pattern in <code>output_glob</code>.</li>
<li>Patterns match both files and directories.</li>
<li>Directory match means capture the directory and everything inside it.</li>
<li>A pattern can include slashes to capture items in subdirectories. This means parent directories in the path are included in the output, but they should contain only pattern-matched items.</li>
<li>Items are captured in place, this feature does not include rearranging files.</li>
<li><code>output_glob</code> affects container reuse: the requested <code>output_glob</code> must match for a container to be reused. Although, if we wanted to be clever, we could reuse containers whose <code>output_glob</code> is a superset of the one we are asking for (perhaps a simple version: empty <code>[]</code> for default behavior, or match-all <code>["*"]</code>).</li>
</ul>
<p>This feature should work for local output directory (by controlling which files are uploaded) and for the temporary collection directory (by controlling which files are propagated to the final collection). The output_glob should also apply when deciding whether to include items pre-populated in the output directory that are specified in 'mounts'.</p>
<p>I'm pretty sure we don't support updating an existing collection in "mounts" so we don't have to worry about that. Crunch always creates a new collection as output. We should confirm/test for that.</p>
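The matching rule described above (a path is captured when a pattern matches the path itself or any of its parent directories) can be sketched with Go's <code>path.Match</code>. This is an illustration under that assumption, not the actual crunch-run code:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// captured reports whether a file path should be included in the
// output: it is captured when any glob pattern matches the path
// itself or any of its ancestor directories (a directory match
// captures everything inside the directory).
func captured(globs []string, file string) bool {
	for _, glob := range globs {
		p := file
		for {
			if ok, _ := path.Match(glob, p); ok {
				return true
			}
			i := strings.LastIndexByte(p, '/')
			if i < 0 {
				break
			}
			p = p[:i] // try the parent directory next
		}
	}
	return false
}

func main() {
	// Mirrors the examples below: "baz/*" matches ancestor "baz/parent1".
	fmt.Println(captured([]string{"baz/*"}, "baz/parent1/item1")) // true
	// "quux" matches only a bare basename, not "baz/quux" or "baz".
	fmt.Println(captured([]string{"quux"}, "baz/quux")) // false
}
```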
<p>Examples:</p>
<p>Directory listing:</p>
<p>foo<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["foo"]<br />Captures:<br />foo</p>
<p>output_glob: ["f*"]<br />Captures:<br />foo</p>
<p>output_glob: ["f*", "b*"]<br />Captures:<br />foo<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["ba?"]<br />Captures:<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["ba*"]<br />Captures:<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz"]<br />Captures:<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz/*"]<br />Captures:<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz/parent1"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["baz/p*"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["baz/parent1/item1"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["quux"]<br />Captures:</p>
<p>output_glob: ["*/quux"]<br />Captures:<br />baz/quux</p>
<h2><a href="https://dev.arvados.org/issues/2960">Arvados - Feature #2960 (In Progress): Keepstore can stream GET and PUT requests using keep-gatew...</a> (2014-06-04T10:48:26Z, Tim Pierce &lt;twp@curoverse.com&gt;)</h2>