Arvados: Issueshttps://dev.arvados.org/https://dev.arvados.org/favicon.ico?15576888422024-03-04T17:09:03ZArvados
Redmine Arvados - Bug #21571 (New): Documentation should call it "arv-mount" rather than "FUSE Driver"https://dev.arvados.org/issues/215712024-03-04T17:09:03ZBrett Smithbrett.smith@curii.com
<p>"FUSE Driver" is a meaningless name to people who don't know what "FUSE" is, which is most people. The documentation should refer to the tool as "arv-mount" as much as possible, since that's a distinctive tool name and more people understand what a "mount" is generally (not a ton more, but still). If necessary the documentation can explain that arv-mount is implemented using FUSE, but that shouldn't be an identifier.</p> Arvados - Bug #21547 (New): return certain database errors as 500 so they can be retriedhttps://dev.arvados.org/issues/215472024-02-27T19:19:14ZPeter Amstutzpeter.amstutz@curii.com
<p>Certain database errors represent transient errors. We should tell the client to retry the request by returning a 500 internal server error instead of 422 (which is the default behavior).</p>
<p>#<ActiveRecord::Deadlocked: PG::TRDeadlockDetected: ERROR: deadlock detected></p>
<p>Rationale: The observed deadlocks in Arvados are conflicts between two statements (a lock ordering issue), so unwinding and retrying is a reasonable solution.</p>
<p>#<ActiveRecord::StatementInvalid: PG::UnableToSend></p>
<p>Rationale: It seems this gets thrown when the API server can't connect to the database.</p>
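<p>A rough sketch of the idea, written in Python for illustration only (the API server itself is a Rails application, and every name below is hypothetical): transient database errors map to 500, everything else keeps the current 422 default, and a client retries only on 500.</p>

```python
import time

# Hypothetical set of error names treated as transient; only the two
# errors discussed in this ticket are listed here.
TRANSIENT_DB_ERRORS = {
    "PG::TRDeadlockDetected",
    "PG::UnableToSend",
}


def status_for_db_error(error_name):
    """Return 500 for errors worth retrying, 422 (the current default) otherwise."""
    return 500 if error_name in TRANSIENT_DB_ERRORS else 422


def call_with_retry(request, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call request() until it stops returning 500 or attempts run out.

    request is any callable returning a (status, body) tuple. The delay
    doubles after each failed attempt (an assumption, not Arvados behavior).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = request()
        if status != 500 or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
```

<p>With this split, a 422 still reaches the caller immediately, so genuine client errors are never retried.</p>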
<p>Here's the list of postgres errors known to the PG gem:</p>
<p><a class="external" href="https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/errorcodes.def#L598">https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/errorcodes.def#L598</a></p>
<p><a class="external" href="https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/pg_errors.c#L88">https://github.com/ged/ruby-pg/blob/daec80f91b9519509ca1694a231f11a75cb43f7f/ext/pg_errors.c#L88</a></p>
<p>Some other possible Exceptions to retry:</p>
<p>ConnectionBad<br />ConnectionException<br />ConnectionDoesNotExist<br />ConnectionFailure<br />TooManyConnections<br />CannotConnectNow<br />IdleSessionTimeout<br />ObjectInUse<br />LockNotAvailable<br />AdminShutdown<br />CrashShutdown</p>
<p>(There are a lot of connection-related errors and I don't know the difference between them, but I included them all because those seem very likely to be errors that occur through no fault of the client.)</p> Arvados - Bug #21524 (New): test-provision-ubuntu2004 intermittently times out waiting for the co...https://dev.arvados.org/issues/215242024-02-15T14:26:08ZBrett Smithbrett.smith@curii.com
<p>This has happened repeatedly.</p>
<pre> Name: arvados-controller - Function: pkg.installed - Result: Changed Started: - 22:53:27.606297 Duration: 10607.743 ms
Name: arvados-controller - Function: service.running - Result: Changed Started: - 22:53:38.277567 Duration: 1113.901 ms
----------
ID: arvados-controller-service-running-service-ready-cmd-run
Function: cmd.run
Name: while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
Result: False
Comment: Command "while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
" run
Started: 22:53:39.397300
Duration: 120828.503 ms
Changes:
----------
pid:
37097
retcode:
1
stderr:
stdout:
while ! (curl -k -s https://ubu20.local:8800 | \
grep -qE "req-[a-z0-9]{20}.{5}error_token") do
echo 'waiting for API to be ready...'
sleep 1
done
: Timed out after 120 seconds
Name: nginx - Function: service.mod_watch - Result: Changed Started: - 22:55:40.233381 Duration: 63.787 ms
Summary for local
--------------
Succeeded: 144 (changed=106)
Failed: 1
--------------
</pre> Arvados - Bug #21521 (New): Uploading debian12 packages fails intermittentlyhttps://dev.arvados.org/issues/215212024-02-14T18:31:47ZBrett Smithbrett.smith@curii.com
<p>Because Jenkins hates me personally, occasionally build-packages-debian12 fails on the upload step:</p>
<pre>======= Start upload packages
/usr/local/arvados-dev/jenkins/run_upload_packages.py --repo dev -H jenkinsapt@apt.arvados.org -o Port=2222 --workspace /tmp/workspace/build-packages-debian12 debian12
Unable to open database, sleeping 8.753268149s, attempts left 10...
Unable to open database, sleeping 9.298699287s, attempts left 9...
Unable to open database, sleeping 8.465308774s, attempts left 8...
Unable to open database, sleeping 9.709491264s, attempts left 7...
Unable to open database, sleeping 8.100403568s, attempts left 6...
Unable to open database, sleeping 11.044444422s, attempts left 5...
Unable to open database, sleeping 11.153889278s, attempts left 4...
Unable to open database, sleeping 8.410966832s, attempts left 3...
Unable to open database, sleeping 10.525453171s, attempts left 2...
Unable to open database, sleeping 11.82935613s, attempts left 1...
ERROR: unable to reopen the DB, maximum number of retries reached
Traceback (most recent call last):
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 362, in <module>
main(sys.argv[1:])
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 358, in main
build_suite_and_upload(target, last_upload_ts, args)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 348, in build_suite_and_upload
suite.update_packages(since_timestamp)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 115, in update_packages
self.post_uploads(upload_paths)
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 237, in post_uploads
self._run_script(self.APT_SCRIPT, self.REMOTE_DEST_DIR + '/' + self.target,
File "/usr/local/arvados-dev/jenkins/run_upload_packages.py", line 193, in _run_script
subprocess.check_call(self._build_cmd(
File "/usr/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', '-oPort=2222', '-q', 'jenkinsapt@apt.arvados.org', 'bash', '-ec', '\'\ncd "$1"; shift\nDISTNAME=$1; shift\nfor package in "$@"; do\n set +e\n aptly repo search "$DISTNAME" "${package%.deb}" >/dev/null 2>&1\n RET=$?\n set -e\n if [[ $RET -eq 0 ]]; then\n echo "Not adding $package, it is already present in repo $DISTNAME"\n rm "$package"\n else\n aptly repo add -remove-files "$DISTNAME" "$package"\n fi\ndone\naptly publish update "$DISTNAME" filesystem:"${DISTNAME%-*}":\n\'', 'DebianPackageSuite', 'tmp/debian12', 'bookworm-dev', "'arvados-sync-users_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-slurm_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-crunchstat-summary_2.8.0~dev20240213172635-1_amd64.deb'", "'keep-web_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-sync-groups_2.8.0~dev20240214163440-1_arm64.deb'", "'keepproxy_2.8.0~dev20240214163440-1_arm64.deb'", "'crunch-run_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-exercise_2.8.0~dev20240214163440-1_arm64.deb'", "'python3-arvados-user-activity_2.8.0~dev20240213172635-1_amd64.deb'", "'crunch-dispatch-local_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-ws_2.8.0~dev20240214163440-1_arm64.deb'", "'keepstore_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-sync-users_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-git-httpd_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-local_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-dispatch-lsf_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-controller_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-client_2.8.0~dev20240214163440-1_amd64.deb'", "'libpam-arvados-go_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-block-check_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-client_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-ws_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-server_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-health_2.8.0~dev20240214163440-1_arm64.deb'", 
"'arvados-src_2.8.0~dev20240214163440-1_all.deb'", "'arvados-docker-cleaner_2.8.0~dev20240207214436-1_amd64.deb'", "'keep-balance_2.8.0~dev20240214163440-1_arm64.deb'", "'keepstore_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-rsync_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-sync-groups_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-dispatch-lsf_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-block-check_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-dispatch-cloud_2.8.0~dev20240214163440-1_arm64.deb'", "'crunch-run_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-git-httpd_2.8.0~dev20240214163440-1_arm64.deb'", "'keep-balance_2.8.0~dev20240214163440-1_amd64.deb'", "'arvados-health_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-rsync_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-fuse_2.8.0~dev20240213172635-1_amd64.deb'", "'arvados-workbench2_2.8.0~dev20240214163440-1_amd64.deb'", "'keep-exercise_2.8.0~dev20240214163440-1_amd64.deb'", 'python3-cwltest_2.3.20230108193615-1_amd64.deb', "'arvados-dispatch-cloud_2.8.0~dev20240214163440-1_amd64.deb'", "'keepproxy_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-python-client_2.8.0~dev20240213172635-1_amd64.deb'", "'arvados-api-server_2.8.0~dev20240214163440-1_amd64.deb'", "'crunch-dispatch-slurm_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-controller_2.8.0~dev20240214163440-1_arm64.deb'", "'arvados-server_2.8.0~dev20240214163440-1_amd64.deb'", "'python3-arvados-cwl-runner_2.8.0~dev20240213172635-1_amd64.deb'", "'keep-web_2.8.0~dev20240214163440-1_arm64.deb'"]' returned non-zero exit status 1.
======= upload packages -- FAILED
======= End of upload packages (200s)
</pre>
<p>Those "Unable to open database" errors are from aptly. I don't understand what is special about debian12 that makes it more prone to these database timeouts, but it definitely happens too often to be just bad luck.</p>
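<p>As a small diagnostic sketch (the helper name and regular expression are assumptions, not part of the Jenkins scripts), the retry lines from the log above can be parsed to see how much of the upload step was spent waiting on the database:</p>

```python
import re

# Retry lines copied from the failed build log above.
LOG = """\
Unable to open database, sleeping 8.753268149s, attempts left 10...
Unable to open database, sleeping 9.298699287s, attempts left 9...
Unable to open database, sleeping 8.465308774s, attempts left 8...
Unable to open database, sleeping 9.709491264s, attempts left 7...
Unable to open database, sleeping 8.100403568s, attempts left 6...
Unable to open database, sleeping 11.044444422s, attempts left 5...
Unable to open database, sleeping 11.153889278s, attempts left 4...
Unable to open database, sleeping 8.410966832s, attempts left 3...
Unable to open database, sleeping 10.525453171s, attempts left 2...
Unable to open database, sleeping 11.82935613s, attempts left 1...
"""

SLEEP_RE = re.compile(r"Unable to open database, sleeping ([0-9.]+)s, attempts left \d+")


def sleep_durations(log_text):
    """Return the sleep time in seconds of every retry line in log_text."""
    return [float(m.group(1)) for m in SLEEP_RE.finditer(log_text)]


sleeps = sleep_durations(LOG)
print(f"{len(sleeps)} waits, {sum(sleeps):.1f}s total")  # 10 waits, 97.3s total
```

<p>For this run, roughly 97 of the 200 seconds reported for the upload step were spent sleeping between attempts to open the database.</p>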
<p>The silver lining here is at least all this is infrastructure under our control so we can presumably fix it once we figure out what causes the issue.</p> Arvados - Bug #21434 (New): Follow up fix to schema saladhttps://dev.arvados.org/issues/214342024-01-31T16:46:46ZPeter Amstutzpeter.amstutz@curii.com
<p><a class="external" href="https://github.com/common-workflow-language/schema_salad/issues/766">https://github.com/common-workflow-language/schema_salad/issues/766</a></p> Arvados Workbench 2 - Bug #20898 (New): Fix narrowing issue in withDialog declarationhttps://dev.arvados.org/issues/208982023-08-25T02:08:16ZLisa Knox
<p>In src/store/dialog/with-dialog.ts, there is a 5-year-old `TODO: fix this` comment on line 23. The fix is to pass `component as never` instead of just `component` into the connect function on line 26. The cause is a known issue with narrowing in TypeScript that is slated to be fixed in TypeScript 5.3.</p> Arvados - Bug #19781 (New): GUI crashes on a certain workflowhttps://dev.arvados.org/issues/197812022-11-18T14:59:18ZPeter Amstutzpeter.amstutz@curii.com
<p>Reported on Matrix by "Pluriscient":</p>
<blockquote>
<p>Using the GUI seems to crash for this workflow: getting a white screen and react error when I click on the file input<br />Think it's the file input in particular, I can change the rest of the inputs just fine<br />Workflow I'm trying is this one: <a class="external" href="https://gitlab.com/m-unlock/cwl/-/blob/master/cwl/workflows/workflow_illumina_quality.cwl">https://gitlab.com/m-unlock/cwl/-/blob/master/cwl/workflows/workflow_illumina_quality.cwl</a></p>
</blockquote> Arvados - Bug #18990 (New): should reflect the value of TLS/Insecure in the "Get API Token" dialoghttps://dev.arvados.org/issues/189902022-04-11T20:24:42ZWard Vandewegeward@curii.com
<p>When <code>TLS/Insecure</code> is set to <code>true</code>, the "Get API Token" dialog should say</p>
<pre><code>export ARVADOS_API_HOST_INSECURE=true</code></pre>
<p>and otherwise, it should say</p>
<pre><code>unset ARVADOS_API_HOST_INSECURE</code></pre>
<p>Currently, workbench2 always does the latter.</p> Arvados - Bug #16682 (New): Missing nodejs should not be a fatal error for submitting a workflowhttps://dev.arvados.org/issues/166822020-08-11T16:17:17ZPeter Amstutzpeter.amstutz@curii.com
<p>nodejs is used to run jshint, but if nodejs is missing it becomes a fatal error. If we are only submitting a workflow, a missing nodejs should be non-fatal when that is the only thing nodejs is needed for.</p> Arvados - Bug #16329 (New): Inconsistent upload behavior between wb1 and wb2https://dev.arvados.org/issues/163292020-04-14T14:07:28ZPeter Amstutzpeter.amstutz@curii.com
<p>A colleague asked for guidance on how to replace files in collections. She reported that in WB1, when uploading a file with the same name as an already existing file, the file is not replaced; instead, a new file with the extension " (1).extension" is created (see screenshot).</p>
<p>I tested how the behaviour is in WB2 and there the file is replaced without asking for confirmation.</p>
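<p>To make the difference concrete, here is a minimal sketch of the two behaviours as a single function. It is not taken from either workbench; the function name, the <code>policy</code> values, and the assumption that the counter keeps increasing past 1 are all hypothetical.</p>

```python
import os


def resolve_upload_name(filename, existing, policy):
    """Return the name an uploaded file would be stored under.

    existing is the set of file names already in the collection.
    policy "replace" mimics the WB2 behaviour described above (same name,
    old file overwritten); policy "rename" mimics the WB1 behaviour
    (" (1)" inserted before the extension).
    """
    if policy == "replace" or filename not in existing:
        return filename
    if policy != "rename":
        raise ValueError(f"unknown policy: {policy!r}")
    stem, ext = os.path.splitext(filename)
    n = 1
    # Assumption: keep counting up until an unused name is found.
    while f"{stem} ({n}){ext}" in existing:
        n += 1
    return f"{stem} ({n}){ext}"
```

<p>For example, <code>resolve_upload_name("reads.fastq", {"reads.fastq"}, "rename")</code> gives <code>reads (1).fastq</code>, while the <code>"replace"</code> policy returns <code>reads.fastq</code> unchanged.</p>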
<p>I think a good solution would be to leave the decision to the user whether the file should be replaced or saved with an extension. But at least the two workbenches should do it in the same way.</p> GET-Evidence - Bug #14622 (New): Annotate l7g CWL pipelinehttps://dev.arvados.org/issues/146222018-12-17T14:13:38ZBen Carr
<p>Old Ticket:<br />Tiling workflow documentation is out of date. It can be auto-generated partly directly from the cwl workflows.</p>
<p>----<br />SWZ:</p>
<p>CWL annotated:<br />[...]</p>
<p>CWL: Not annotated yet, and not experimental<br />[...]</p>
<p>I would like you to annotate the ones in the following folders as noted by the main workflows (and those cwl they call) called in the AD overview documents:</p>
<pre><code class="text syntaxhl">/cwl-version/filter/cwl
/cwl-version/clean/cwl
/cwl-version/convert2fastj/gvcf_version/cwl
/cwl-version/tilelib/cwl
/cwl-version/checks/check-sglf/cwl
/cwl-version/cgf3/cwl
/cwl-version/checks/check-cgf/gvcf/cwl/
/cwl-version/npy/cwl
</code></pre>
<p>Also please spellcheck and check for existing typos.<br />----<br />BHCC:</p>
<p>As of 20181212 the following is annotated:</p>
<pre><code class="text syntaxhl">./cgf3/cwl/createcgf.cwl
./cgf3/cwl/getdirs20.cwl
./cgf3/cwl/tiling_convert2cgf.cwl
./cgf3/cwl/getdirs.cwl
./checks/check-cgf/gvcf/cwl/validate-conversion-gvcf-cgf-chrom_workflow.cwl
./checks/check-cgf/gvcf/cwl/gather_validate-conversion-gvcf-cgf.cwl
./checks/check-cgf/gvcf/cwl/validate-conversion-gvcf-cgf-chrom.cwl
./checks/check-sglf/cwl/sglf-sanity-check.cwl
./tagset/l7g-tagset.cwl
./tagset/tagset.cwl
./convert2fastj/gvcf_version/cwl/tiling_convert2fastj_gvcf.cwl
./convert2fastj/gvcf_version/cwl/convertgvcf.cwl
./convert2fastj/gvcf_version/cwl/getdirs.cwl
./clean/cwl/tiling_clean_gvcf.cwl
./clean/cwl/getdirs.cwl
./clean/cwl/cleangvcf.cwl
./filter/cwl/filter.cwl
./filter/cwl/getCollections.cwl
./filter/cwl/tiling_filtergvcf.cwl
./tilelib/cwl/getpaths_chunk.cwl
./tilelib/cwl/tiling_createsglf_chunk-scatter_v2.cwl
./tilelib/cwl/merge-tilelib.cwl
./tilelib/cwl/createsglf_chunkv2.cwl
./npy/cwl/tiling_npy-wf.cwl
./npy/cwl/cwl_steps/tiling_create-npy.cwl
./npy/cwl/cwl_steps/tiling_consol-npy.cwl
</code></pre>
<p>BHCC:<br />The branch we are working on is</p>
<p>14386-cwl-docum<br />I have reviewed the list for typographical errors, including grep'ing out all labels and running them through spellcheck. The only place where camel case was used is where it is used by convention for clarity (NumPy) of the package or object.</p>
<p>Things that need to be addressed:<br />Check spelling one more time<br />Make sure every label starts with a capital<br />Remove any unnecessary Camel Case<br />Use the YML files as a guide, as variable names do not always reflect what is going on<br />Remove any Doc blocks that do not add anything beyond what the label covers.<br />Shorten any labels as much as possible, the more concise the better<br />Don't include the type of code (C++ / bash) unless absolutely necessary<br />Will likely need a new ticket for the automated generation of figures and may need to run view.cwl locally to make that work. Will test automated generation once the labels are merged to master, as view.cwl currently only works off GitHub-master.<br />We should consider renaming the "gff" in the variable names of the following scripts:</p>
<pre><code class="text syntaxhl">./filter/cwl/filter.cwl
./filter/cwl/tiling_filtergvcf.cwl
./convert2fastj/gvcf_version/cwl/tiling_convert2fastj_gvcf_named.cwl
./convert2fastj/gvcf_version/cwl/tiling_convert2fastj_gvcf.cwl
./convert2fastj/gvcf_version/cwl/convertgvcf.cwl
</code></pre><br />to reflect gVCF status Arvados - Bug #10545 (New): arvbox loops forever when something goes wronghttps://dev.arvados.org/issues/105452016-11-16T16:45:47ZJoshua Randalljr17@sanger.ac.uk
<p>When something goes wrong installing and starting services within arvbox (for example after running `arvbox reboot dev` or `arvbox reboot test`), arvbox gives no indication that there is a problem; it just keeps retrying the service forever.</p>
<p>The result for the user is that it is very hard to tell the difference between arvbox taking a long time and arvbox being hopelessly stuck.</p>
<p>I suggest some retry counting on services, such that they can only fail consecutively n times (perhaps 3) before arvbox gives up and stops trying altogether.</p> Arvados - Bug #10535 (New): test suite hangs (in arvbox) in sdk/python test_callbackhttps://dev.arvados.org/issues/105352016-11-15T18:48:49ZJoshua Randalljr17@sanger.ac.uk
<pre>
********** Running sdk/python tests **********
running test
running egg_info
writing requirements to arvados_python_client.egg-info/requires.txt
writing arvados_python_client.egg-info/PKG-INFO
writing top-level names to arvados_python_client.egg-info/top_level.txt
writing dependency_links to arvados_python_client.egg-info/dependency_links.txt
writing pbr to arvados_python_client.egg-info/pbr.json
reading manifest file 'arvados_python_client.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'arvados_python_client.egg-info/SOURCES.txt'
running build_ext
test_callback (tests.test_events.PollClientTestCase) …
</pre>
<p>This test gets stuck for an apparently infinite amount of time on my system, also with slowly increasing memory usage. Seems to be a race condition involving PollClient. Adding a time.sleep(0.1) before `self.logs.add(test_log.copy())` (<a class="external" href="https://github.com/curoverse/arvados/blob/0b5d04beb288175a285c36a38f255399dfe7d0d7/sdk/python/tests/test_events.py#L338-L339">https://github.com/curoverse/arvados/blob/0b5d04beb288175a285c36a38f255399dfe7d0d7/sdk/python/tests/test_events.py#L338-L339</a>) avoids the issue on my system.</p> GET-Evidence - Bug #6143 (New): Add 'nofollow' to download links for reporthttps://dev.arvados.org/issues/61432015-05-22T21:29:54ZAbram Connellyabram.connelly@gmail.com
<p>Web crawlers hammer the keep servers asking for the data files from links that appear in the reports. Add 'nofollow' on relevant links from the report page.</p> GET-Evidence - Bug #5883 (New): migrate to updated oauthhttps://dev.arvados.org/issues/58832015-05-01T18:31:33ZAbram Connellyabram.connelly@gmail.com
<p>There is a warning that says OAuth2 is going away. We need to migrate the authentication on GET-Evidence to make sure login is still possible.</p>