Arvados: Issues | https://dev.arvados.org/ | 2024-03-04T20:31:38Z | Redmine
Arvados - Feature #21572 (New): Reorganize doc/user/getting_started/setup-cli.html to prioritize ... | https://dev.arvados.org/issues/21572 | 2024-03-04T20:31:38Z | Brett Smith, brett.smith@curii.com
<p>User requested that this page just have a block of commands they can copy and paste to set everything up. I don't know if we can get 100% there right now, but we could probably get a lot closer, and good enough, with reasonable effort. Right now the page expects you to read a separate install page for each tool, which is a lot of hoop-jumping. Planned sections:</p>
<ol>
<li>Install with Debian/Ubuntu packages: Set up the apt repository, update, install all the client tool packages</li>
<li>Install with RHEL family packages: Set up the yum repository, update, install all the client tool packages</li>
<li>Install without privileges: Probably break this up into subsections ordered from what most people will find easiest to what they will find hardest. Explain what you get at each step and let users know they can jump off whenever.
<ol>
<li>The Python tools: run <code>python3 -m venv</code>, <code>pip install arvados-cwl-runner arvados_fuse crunchstat_summary</code></li>
<li>If <a class="issue tracker-6 status-1 priority-4 priority-default" title="Idea: Publish standalone binaries for arvados-client, other Go client tools (New)" href="https://dev.arvados.org/issues/20727">#20727</a> is done: Download <code>arvados-client</code> to the right place</li>
<li>The Ruby tools: <code>gem install arvados-cli</code>—I'm less sure what's an appropriate incantation for this</li>
</ol></li>
</ol>
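<p>As a rough illustration of the Python tools step above, here is a minimal sketch built on the standard-library <code>venv</code> module. The environment directory <code>~/arvados-venv</code> and the helper names are assumptions made for this example, not part of the planned page. Building <code>arvados_fuse</code> may also need system libraries that this sketch does not install.</p>

```python
import subprocess
import venv
from pathlib import Path

# Package names as listed in the issue text.
PACKAGES = ["arvados-cwl-runner", "arvados_fuse", "crunchstat_summary"]


def pip_install_command(env_dir):
    """Build the pip command that installs the client tools into env_dir."""
    # Assumes a POSIX venv layout (bin/python); Windows uses Scripts/python.exe.
    python = Path(env_dir) / "bin" / "python"
    return [str(python), "-m", "pip", "install", *PACKAGES]


def install_python_tools(env_dir):
    """Create the virtual environment, then install the packages into it."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir), check=True)


if __name__ == "__main__":
    # Hypothetical location; any user-writable directory works.
    install_python_tools(Path.home() / "arvados-venv")
```

<p>Because everything lives under one user-owned directory, no root privileges are needed, and deleting the directory removes the install.</p>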
<p>The page can then link to separate pages for individual tools, for more information or for people who want to be more selective for whatever reason.</p>
Arvados - Bug #21571 (New): Documentation should call it "arv-mount" rather than "FUSE Driver" | https://dev.arvados.org/issues/21571 | 2024-03-04T17:09:03Z | Brett Smith, brett.smith@curii.com
<p>"FUSE Driver" is a meaningless name to people who don't know what "FUSE" is, which is most people. The documentation should refer to the tool as "arv-mount" as much as possible, since that's a distinctive tool name and more people understand what a "mount" is generally (not a ton more, but still). If necessary the documentation can explain that arv-mount is implemented using FUSE, but that shouldn't be an identifier.</p>
Arvados - Bug #21547 (New): return certain database errors as 500 so they can be retried | https://dev.arvados.org/issues/21547 | 2024-02-27T19:19:14Z | Peter Amstutz, peter.amstutz@curii.com
<p>Certain database errors represent transient errors. We should tell the client to retry the request by returning a 500 internal server error instead of 422 (which is the default behavior).</p>
<p><code>#&lt;ActiveRecord::Deadlocked: PG::TRDeadlockDetected: ERROR: deadlock detected&gt;</code></p>
<p>Rationale: The observed deadlocks in Arvados are conflicts between two statements (a lock ordering issue), so unwinding and retrying is a reasonable solution.</p>
<p><code>#&lt;ActiveRecord::StatementInvalid: PG::UnableToSend&gt;</code></p>
<p>Rationale: It seems this gets thrown when the API server can't connect to the database.</p>
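<p>The proposed behavior can be sketched as a small classification step. This is a language-neutral illustration in Python, not the API server's actual Rails code. The function name and the substring matching on the error description are assumptions made for the example, and a real implementation would more likely check exception classes.</p>

```python
# Markers taken from the two errors quoted in this issue; extend as needed.
TRANSIENT_MARKERS = (
    "ActiveRecord::Deadlocked",
    "PG::TRDeadlockDetected",
    "PG::UnableToSend",
)


def status_for_database_error(description):
    """Return 500 for errors worth retrying, else the default 422."""
    if any(marker in description for marker in TRANSIENT_MARKERS):
        return 500  # transient: tell the client to retry
    return 422  # default behavior for other database errors


if __name__ == "__main__":
    print(status_for_database_error(
        "#<ActiveRecord::StatementInvalid: PG::UnableToSend>"))
```

<p>Keeping the set of transient errors in one place makes it easy to add further candidates once it is clear which ones are safe to retry.</p>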
<p>Here's the list of postgres errors known to the PG gem:</p>
<p><a class="external" href="https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/errorcodes.def#L598">https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/errorcodes.def#L598</a></p>
<p><a class="external" href="https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/pg_errors.c#L88">https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/pg_errors.c#L88</a></p>
<p>Some other possible Exceptions to retry:</p>
<p>ConnectionBad<br />ConnectionException<br />ConnectionDoesNotExist<br />ConnectionFailure<br />TooManyConnections<br />CannotConnectNow<br />IdleSessionTimeout<br />ObjectInUse<br />LockNotAvailable<br />AdminShutdown<br />CrashShutdown</p>
<p>(There are a lot of connection-related errors and I don't know the difference between them, but I included them all because those seem very likely to be errors that occur through no fault of the client.)</p>
Arvados - Bug #21524 (New): test-provision-ubuntu2004 intermittently times out waiting for the co... | https://dev.arvados.org/issues/21524 | 2024-02-15T14:26:08Z | Brett Smith, brett.smith@curii.com
<p>This has happened repeatedly.</p>
<pre> Name: arvados-controller - Function: pkg.installed - Result: Changed Started: - 22:53:27.606297 Duration: 10607.743 ms
Name: arvados-controller - Function: service.running - Result: Changed Started: - 22:53:38.277567 Duration: 1113.901 ms
----------
ID: arvados-controller-service-running-service-ready-cmd-run
Function: cmd.run
Name: while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
Result: False
Comment: Command "while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
" run
Started: 22:53:39.397300
Duration: 120828.503 ms
Changes:
----------
pid:
37097
retcode:
1
stderr:
stdout:
while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
: Timed out after 120 seconds
Name: nginx - Function: service.mod_watch - Result: Changed Started: - 22:55:40.233381 Duration: 63.787 ms
Summary for local
--------------
Succeeded: 144 (changed=106)
Failed: 1
--------------
</pre>
Arvados - Bug #21521 (New): Uploading debian12 packages fails intermittently | https://dev.arvados.org/issues/21521 | 2024-02-14T18:31:47Z | Brett Smith, brett.smith@curii.com
<p>Because Jenkins hates me personally, occasionally build-packages-debian12 fails on the upload step:</p>
<pre>======= Start upload packages
/usr/local/arvados-dev/jenkins/run_upload_packages.py --repo dev -H jenkinsapt@apt.arvados.org -o Port=2222 --workspace /tmp/workspace/build-packages-debian12 debian12
Unable to open database, sleeping 8.753268149s, attempts left 10...
Unable to open database, sleeping 9.298699287s, attempts left 9...
Unable to open database, sleeping 8.465308774s, attempts left 8...
Unable to open database, sleeping 9.709491264s, attempts left 7...
Unable to open database, sleeping 8.100403568s, attempts left 6...
Unable to open database, sleeping 11.044444422s, attempts left 5...
Unable to open database, sleeping 11.153889278s, attempts left 4...
Unable to open database, sleeping 8.410966832s, attempts left 3...
Unable to open database, sleeping 10.525453171s, attempts left 2...
Unable to open database, sleeping 11.82935613s, attempts left 1...
ERROR: unable to reopen the DB, maximum number of retries reached
Traceback (most recent call last):
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 362, in <module>
main(sys.argv[1:])
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 358, in main
build_suite_and_upload(target, last_upload_ts, args)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 348, in build_suite_and_upload
suite.update_packages(since_timestamp)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 115, in update_packages
self.post_uploads(upload_paths)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 237, in post_uploads
self._run_script(self.APT_SCRIPT, self.REMOTE_DEST_DIR + '/' + self.target,
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 193, in _run_script
subprocess.check_call(self._build_cmd(
File "/usr/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', '-oPort=2222', '-q', 'jenkinsapt@apt.arvados.org', 'bash', '-ec', '\'\ncd "$1"; shift\nDISTNAME=$1; shift\nfor package in "$@"; do\n set +e\n aptly repo search "$DISTNAME" "${package%.deb}" >/dev/null 2>&1\n RET=$?\n set -e\n if [[ $RET -eq 0 ]]; then\n echo "Not adding $package, it is already present in repo $DISTNAME"\n rm "$package"\n else\n aptly repo add -remove-files "$DISTNAME" "$package"\n fi\ndone\naptly publish update "$DISTNAME" filesystem:"${DISTNAME%-*}":\n\'', 'DebianPackageSuite', 'tmp/debian12', 'bookworm-dev', "'arvados-sync-users_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-slurm_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-crunchstat-summary_2.8.0~dev20240213172635-1_amd64.deb'", "'keep-web_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-sync-groups_2.8.0~dev20240214163440-1_arm64.deb'", "'keepproxy_2.8.0~dev20240214163440-1_arm64.deb'", "'crunch-run_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-exercise_2.8.0~dev20240214163440-1_arm64.deb'", "'python3-arvados-user-activity_2.8.0~dev20240213172635-1_amd64.deb'", "'crunch-dispatch-local_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-ws_2.8.0~dev20240214163440-1_arm64.deb'", "'keepstore_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-sync-users_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-git-httpd_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-local_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-dispatch-lsf_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-controller_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-client_2.8.0~dev20240214163440-1_amd64.deb'", "'libpam-arvados-go_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-block-check_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-client_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-ws_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-server_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-health_2.8.0~dev20240214163440-1_arm64.deb'", 
"'arvados-src_2.8.0~dev20240214163440-1_all.deb'", "'arvados-docker-cleaner_2.8.0~dev20240207214436-1_amd64.deb'", "'keep-balance_2.8.0~dev20240214163440-1_arm64.deb'", "'keepstore_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-rsync_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-sync-groups_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-dispatch-lsf_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-block-check_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-dispatch-cloud_2.8.0~dev20240214163440-1_arm64.deb'", "'crunch-run_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-git-httpd_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-balance_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-health_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-rsync_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-fuse_2.8.0~dev20240213172635-1_amd64.deb'", "'arvados-workbench2_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-exercise_2.8.0~dev20240214163440-1_amd64.deb'", 'python3-cwltest_2.3.20230108193615-1_amd64.deb', "'arvados-dispatch-cloud_2.8.0~dev20240214163440-1_amd64.deb'", "'keepproxy_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-python-client_2.8.0~dev20240213172635-1_amd64.deb'", "'arvados-api-server_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-slurm_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-controller_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-server_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-cwl-runner_2.8.0~dev20240213172635-1_amd64.deb'", "'keep-web_2.8.0~dev20240214163440-1_arm64.deb'"]' returned non-zero exit status 1.
======= upload packages -- FAILED
======= End of upload packages (200s)
</pre>
<p>Those "Unable to open database" errors are from aptly. I don't understand what is special about debian12 that makes it more prone to these database timeouts, but it definitely happens too often to be just bad luck.</p>
<p>The silver lining here is at least all this is infrastructure under our control so we can presumably fix it once we figure out what causes the issue.</p>
Arvados - Bug #21434 (New): Follow up fix to schema salad | https://dev.arvados.org/issues/21434 | 2024-01-31T16:46:46Z | Peter Amstutz, peter.amstutz@curii.com
<p><a class="external" href="https://github.com/common-workflow-language/schema_salad/issues/766">https://github.com/common-workflow-language/schema_salad/issues/766</a></p>
Arvados - Feature #21295 (New): Determine keep performance benchmark targets | https://dev.arvados.org/issues/21295 | 2023-12-13T16:56:55Z | Peter Amstutz, peter.amstutz@curii.com
<p>Consult with end users on the size and shape of collections for which they want better read performance (throughput and latency).</p>
<p>Come up with different test cases of reading files to measure performance metrics.</p>
Arvados - Idea #20693 (New): Design for server side coordination of multiple writers to a collection | https://dev.arvados.org/issues/20693 | 2023-06-28T18:57:06Z | Peter Amstutz, peter.amstutz@curii.com
<p>Background:</p>
<p>Multiple Arvados services (multiple instances of keep-web, arvados-client mount, arv-mount, etc) are trying to write files to the same collection at the same time.</p>
<p>Assume they are adding/removing/changing multiple files but not making changes that directly conflict/contradict one another.</p>
<p>Requirements:</p>
<ul>
<li>If a file is created or modified it won't disappear as a result of an update from another service that didn't know about that file</li>
<li>If there is a single writer, performance impact should be minimal</li>
<li>If there are multiple writers, it is acceptable that one of them may have to wait to avoid conflicts</li>
<li>Can use pessimistic locking to ensure only one client can have a write lock at a time; attempting to open a file for writing that is locked by another client should return an error on open</li>
<li>Support WebDAV lock protocol</li>
</ul>
Arvados - Feature #12430 (New): Crunch2 limit output collection to glob patterns | https://dev.arvados.org/issues/12430 | 2017-10-11T13:21:49Z | Peter Amstutz, peter.amstutz@curii.com
<p>The current behavior for crunch-run is to upload all files in the output directory. This sometimes results in temporary files being uploaded that are not intended to be part of the output. Propose adding an "output_glob" field which is an array of filenames or glob patterns specifying which files and directories should be uploaded.</p>
<p>Specifically:</p>
<ul>
<li><code>output_glob</code> takes an array of strings.</li>
<li>If empty, fall back to default behavior (capture entire output).</li>
<li>Only basic Unix globs with <code>?</code> and <code>*</code> wildcards are supported.</li>
<li>The output only includes paths that match at least one pattern in <code>output_glob</code>.</li>
<li>Patterns match both files and directories.</li>
<li>Directory match means capture the directory and everything inside it.</li>
<li>Pattern can include slashes to capture items in subdirectories. This means parent directories in the path are included in the output but should only contain pattern-matched items.</li>
<li>Items are captured in place; this feature does not include rearranging files.</li>
<li><code>output_glob</code> affects container reuse: <code>output_glob</code> must match for a container to be reused. If we wanted to be clever, we could also reuse containers whose <code>output_glob</code> is a superset of the one we are asking for (maybe a simple version like empty <code>[]</code> for default behavior, or matches all <code>["*"]</code>).</li>
</ul>
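<p>The matching rules above can be sketched as follows. This is an illustrative Python model, not crunch-run's implementation. The function names are hypothetical, and it assumes that <code>*</code> and <code>?</code> never match a slash, which is an interpretation inferred from the examples in this issue (for instance, <code>["quux"]</code> capturing nothing).</p>

```python
import re


def glob_to_regex(pattern):
    """Translate a glob using only ? and * into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")  # any run of characters within one path component
        elif ch == "?":
            parts.append("[^/]")  # exactly one character, never a slash
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def captured_paths(file_paths, output_glob):
    """Return the files kept in the output for the given output_glob."""
    if not output_glob:
        return list(file_paths)  # empty list: capture the entire output
    regexes = [glob_to_regex(p) for p in output_glob]
    kept = []
    for path in file_paths:
        parts = path.split("/")
        # The file itself plus every ancestor directory; matching a directory
        # captures everything inside it.
        candidates = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
        if any(r.fullmatch(c) for r in regexes for c in candidates):
            kept.append(path)
    return kept


if __name__ == "__main__":
    listing = ["foo", "bar", "baz/quux", "baz/parent1/item1"]
    print(captured_paths(listing, ["baz/p*"]))
```

<p>Testing each file against its own path and every ancestor directory is what makes a directory match pull in the directory's whole contents without any extra bookkeeping.</p>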
<p>This feature should work for local output directory (by controlling which files are uploaded) and for the temporary collection directory (by controlling which files are propagated to the final collection). The output_glob should also apply when deciding whether to include items pre-populated in the output directory that are specified in 'mounts'.</p>
<p>I'm pretty sure we don't support updating an existing collection in "mounts" so we don't have to worry about that. Crunch always creates a new collection as output. We should confirm/test for that.</p>
<p>Examples:</p>
<p>Directory listing:</p>
<p>foo<br />bar <br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["foo"]<br />Captures:<br />foo</p>
<p>output_glob: ["f*"]<br />Captures:<br />foo</p>
<p>output_glob: ["f*", "b*"]<br />Captures:<br />foo<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["ba?"]<br />Captures:<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["ba*"]<br />Captures:<br />bar<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz"]<br />Captures:<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz/*"]<br />Captures:<br />baz/quux<br />baz/parent1/item1</p>
<p>output_glob: ["baz/parent1"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["baz/p*"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["baz/parent1/item1"]<br />Captures:<br />baz/parent1/item1</p>
<p>output_glob: ["quux"]<br />Captures:<br />(nothing)</p>
<p>output_glob: ["*/quux"]<br />Captures:<br />baz/quux</p>
GET-Evidence - Bug #5883 (New): migrate to updated oauth | https://dev.arvados.org/issues/5883 | 2015-05-01T18:31:33Z | Abram Connelly, abram.connelly@gmail.com
<p>There is a warning that says OAuth2 is going away. We need to migrate the authentication on GET-Evidence to make sure login is still possible.</p>
GET-Evidence - Feature #5873 (New): GET-Evidence pipeline code should be public | https://dev.arvados.org/issues/5873 | 2015-04-30T22:55:20Z | Abram Connelly, abram.connelly@gmail.com
<p>The GET-Evidence pipeline should be put under a public git repository.</p>
<p>Currently the GET-Evidence repository is under my (Abram Connelly) private Arvados git repository. This should probably remain but some hooks should be added somewhere so that the repository is pushed to a public Curoverse repository.</p>
Arvados - Idea #3304 (New): [Workbench] Add checkboxes to multi-type chooser modal to filter by o... | https://dev.arvados.org/issues/3304 | 2014-07-18T21:04:07Z | Tom Clegg, tom@curii.com
Tapestry - Idea #2850 (New): Comprehensive "Sectioning" of Tapestry | https://dev.arvados.org/issues/2850 | 2014-05-22T13:24:56Z | Phil Hodgson, phil@curoverse.com
<p>Continuing the work done as proof of concept in Story <a class="issue tracker-6 status-3 priority-4 priority-default closed parent" title="Idea: Make it possible to disable/enable &quot;sections&quot; (Resolved)" href="https://dev.arvados.org/issues/2503">#2503</a>, a comprehensive sectioning off of Tapestry will force us to assess what parts of Tapestry are out of use and how the pieces fit together conceptually. The ideal will be a small set of "sections", where each one is optional.</p>
<p>The methodology is:</p>
<ol>
<li>Identify "modules" and assign config.yml keys</li>
<li>Reference configuration when rendering partials and cells</li>
<li>Reference configuration when authorizing in controllers</li>
</ol>
<p>At each point we risk encountering questions of architecture, because it will be evident that reorganizing some of the code while adding these "sections" would make sense. Therefore this is a non-trivial but important story overall.</p>
Tapestry - Idea #2552 (New): Configurable mini-consent form and validations | https://dev.arvados.org/issues/2552 | 2014-04-08T12:13:43Z | Phil Hodgson, phil@curoverse.com
Tapestry - Idea #2544 (New): The "site override" feature is tested | https://dev.arvados.org/issues/2544 | 2014-04-07T13:02:53Z | Phil Hodgson, phil@curoverse.com