Bug #15358
Closed
[cwl] CWL conformance test formattest2 fails with C locale
Description
When LANG=C is set instead of a UTF-8 locale such as en_US.UTF-8, the CWL conformance test v1.0/formattest2.cwl fails with an encoding error while reading EDAM.owl, which contains UTF-8 characters but has no XML encoding declaration in its prolog.
Test 65 failed: /home/ci/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmp7oGCB5 --quiet v1.0/formattest2.cwl v1.0/formattest2-job.json
Test format checking against ontology using subclassOf.
Returned non-zero
URI prefix 'edam' of 'edam:format_1929' not recognized, are you missing a $namespaces section?
Could not load extension schema keep:29dc87213e125b67355699e8953d3820+62/EDAM.owl: 'ascii' codec can't decode byte 0xc3 in position 3352: ordinal not in range(128)
ERROR Workflow execution failed:
Expected value of 'input' to have format http://edamontology.org/format_2330 but
 File has an incompatible format: {
    "format": "http://edamontology.org/format_1929",
    "basename": "ref.fasta",
    "nameroot": "ref",
    "nameext": ".fasta",
    "location": "keep:23b1d68b203d6c75f314fe9804f50c0e+59/ref.fasta",
    "class": "File",
    "size": 12010
}
ERROR Workflow error, try again with --debug for more information:
Workflow did not return a result.
The offending XML snippet is:
<dc:creator>Matúš Kalaš</dc:creator>
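For illustration, the failure mode can be reproduced in isolation. This is a minimal sketch, not the runner's actual code path: it assumes the file content reaches Python as UTF-8 bytes and is then decoded with the 'ascii' codec, which is what the error message suggests. The variable names are made up for the example.

```python
# Illustrative only: encode the offending snippet as UTF-8 and try to
# decode it as ASCII, as a C-locale default decoder would.
snippet = "<dc:creator>Matúš Kalaš</dc:creator>"
raw = snippet.encode("utf-8")

try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    # The first non-ASCII character is "ú", whose UTF-8 encoding starts with 0xc3.
    error = exc

# Decoding with the correct codec round-trips cleanly.
decoded = raw.decode("utf-8")
```

The first byte that trips the decoder is 0xc3, the lead byte of "ú" in UTF-8, which matches the byte reported in the test output.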
Updated by Tom Morris over 5 years ago
- Subject changed from [cwl] CWL conformance test fails with C locale to [cwl] CWL conformance test formattest2 fails with C locale
Updated by Peter Amstutz over 5 years ago
This might just be an upstream conformance test fix to the XML file to declare the correct encoding (assuming the Python XML loader handles it).
Updated by Tom Morris over 5 years ago
Peter Amstutz wrote:
This might just be an upstream conformance test fix to the XML file to declare the correct encoding (assuming the Python XML loader handles it).
That may mask the bug, but since the default encoding for XML files is supposed to be UTF-8, it should work without an explicit declaration.
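As a quick check of that point, a standard XML parser given raw bytes with no encoding declaration treats them as UTF-8. The sketch below uses Python's xml.etree.ElementTree purely as a stand-in; whether the loader used by the CWL runner behaves the same way is an assumption, and the element name is simplified to avoid needing a namespace declaration.

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes with no <?xml ... encoding="..."?> prolog.
raw = "<creator>Matúš Kalaš</creator>".encode("utf-8")

# Parsing from bytes lets the XML parser pick the encoding; without a
# declaration or byte order mark, the XML default of UTF-8 applies.
element = ET.fromstring(raw)
name = element.text
```

This suggests the problem lies in decoding the bytes to text before the parser sees them, rather than in the XML file itself.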
Updated by Peter Amstutz over 5 years ago
This might be a default encoding problem when reading from keep, not sure.
Updated by Peter Amstutz over 5 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz over 5 years ago
- Target version changed from 2019-07-03 Sprint to 2019-07-17 Sprint
Updated by Peter Amstutz over 5 years ago
To reproduce:
$ export LANG=C
$ arvados-cwl-runner formattest2.cwl formattest2-job.json
INFO /home/peter/work/scripts/venv/bin/arvados-cwl-runner 1.4.0.20190627185953, arvados-python-client 1.4.0.20190627173408, cwltool 1.0.20190607183319
INFO Resolved 'formattest2.cwl' to 'file:///home/peter/work/common-workflow-language/v1.0/v1.0/formattest2.cwl'
INFO Upload local files: "ref.fasta"
INFO Using collection 23b1d68b203d6c75f314fe9804f50c0e+59 (4xphq-4zz18-gdz4bibpfb0e5ko)
INFO Upload local files: "EDAM.owl"
INFO Using collection 29dc87213e125b67355699e8953d3820+62 (4xphq-4zz18-qweb7yf0dbmqbir)
Could not load extension schema keep:29dc87213e125b67355699e8953d3820+62/EDAM.owl: 'ascii' codec can't decode byte 0xc3 in position 3352: ordinal not in range(128)
...
Updated by Peter Amstutz over 5 years ago
15358-fetch-text-encoding @ 927d62b545e90676bf4729b6c1ebee56d51eacbe
Add encoding option to CollectionFsAccess.open() and use in fetch_text()
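The general shape of such a change might look like the sketch below. It is not the code from the branch: open_text and this fetch_text are hypothetical stand-ins, and the real CollectionFsAccess.open() signature is not reproduced here. The idea is only that the text decoder is given an explicit encoding instead of inheriting the locale default.

```python
import io


def open_text(binary_stream, encoding="utf-8"):
    """Hypothetical helper: wrap a binary stream in a text decoder.

    Passing the encoding explicitly means the result does not depend on
    LANG or the locale's preferred encoding.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding)


def fetch_text(binary_stream):
    """Hypothetical fetch_text: always read the stream as UTF-8 text."""
    with open_text(binary_stream, encoding="utf-8") as f:
        return f.read()


# Example: bytes as they might come back from storage.
payload = io.BytesIO("<dc:creator>Matúš Kalaš</dc:creator>".encode("utf-8"))
text = fetch_text(payload)
```

With the encoding pinned, the same bytes decode identically under LANG=C and under a UTF-8 locale.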
https://ci.curoverse.com/view/Developer/job/developer-run-tests/1366/
Updated by Peter Amstutz over 5 years ago
Running CWL tests here:
https://ci.curoverse.com/view/CWL/job/arvados-cwl-conformance-tests/177/
Updated by Peter Amstutz over 5 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|88af3b04a254dbf32224c5fd90b7abe9be693501.
Updated by Peter Amstutz about 5 years ago
- Related to Bug #15655: [CWL] encoding error when printing error log tail added