Bug #15358

closed

[cwl] CWL conformance test formattest2 fails with C locale

Added by Tom Morris almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: -
Release relationship: Auto

Description

When LANG=C is set instead of a UTF-8 locale such as en_US.UTF-8, the CWL conformance test v1.0/formattest2.cwl fails with an encoding error while reading EDAM.owl. The file contains UTF-8 characters but has no XML encoding declaration in its prolog.

Test 65 failed: /home/ci/arvados-cwl-runner-with-checksum.sh --outdir=/tmp/tmp7oGCB5 --quiet v1.0/formattest2.cwl v1.0/formattest2-job.json
Test format checking against ontology using subclassOf.
Returned non-zero
URI prefix 'edam' of 'edam:format_1929' not recognized, are you missing a $namespaces section?
Could not load extension schema keep:29dc87213e125b67355699e8953d3820+62/EDAM.owl: 'ascii' codec can't decode byte 0xc3 in position 3352: ordinal not in range(128)
ERROR Workflow execution failed:
Expected value of 'input' to have format http://edamontology.org/format_2330 but
  File has an incompatible format: {
    "format": "http://edamontology.org/format_1929", 
    "basename": "ref.fasta", 
    "nameroot": "ref", 
    "nameext": ".fasta", 
    "location": "keep:23b1d68b203d6c75f314fe9804f50c0e+59/ref.fasta", 
    "class": "File", 
    "size": 12010
}
ERROR Workflow error, try again with --debug for more information:
Workflow did not return a result.

The offending XML snippet is:

<dc:creator>Matúš Kalaš</dc:creator>
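
The decoding failure in the log can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and using only the snippet above; the helper name `try_decode` is hypothetical and not part of arvados-cwl-runner or cwltool. The first non-ASCII character, "ú", is stored in UTF-8 as the bytes 0xc3 0xba, which is why the ASCII codec reports byte 0xc3.

```python
# The offending element, encoded as UTF-8 (the encoding EDAM.owl uses).
snippet = "<dc:creator>Matúš Kalaš</dc:creator>".encode("utf-8")


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, str(err)


# Decoding with the ASCII codec mimics what happens under the C locale.
ascii_text, ascii_error = try_decode(snippet, "ascii")
# Decoding with UTF-8 succeeds regardless of the locale.
utf8_text, utf8_error = try_decode(snippet, "utf-8")

print(ascii_error)
print(utf8_text)
```

The byte position in the printed message differs from the one in the log because only the single element is decoded here, not the whole file.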

Subtasks 1 (0 open, 1 closed)

Task #15393: Review 15358-fetch-text-encoding (Resolved, Eric Biagiotti, 07/03/2019)

Related issues

Related to Arvados - Bug #15655: [CWL] encoding error when printing error log tail (Resolved, Eric Biagiotti, 10/09/2019)
Actions #1

Updated by Tom Morris almost 5 years ago

  • Subject changed from [cwl] CWL conformance test fails with C locale to [cwl] CWL conformance test formattest2 fails with C locale
Actions #3

Updated by Peter Amstutz almost 5 years ago

This might just be an upstream conformance test fix to the XML file to declare the correct encoding (assuming the Python XML loader handles it).

Actions #4

Updated by Tom Morris almost 5 years ago

  • Target version set to 2019-07-03 Sprint
Actions #5

Updated by Tom Morris almost 5 years ago

Peter Amstutz wrote:

This might just be an upstream conformance test fix to the XML file to declare the correct encoding (assuming the Python XML loader handles it).

That may mask the bug, but since the default encoding for XML files is supposed to be UTF-8, it should work without an explicit declaration.
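
The point about the XML default can be checked with a small sketch. It assumes Python 3 and the standard library's `xml.etree.ElementTree`, which is not necessarily the loader used by cwltool; the element name `creator` is a simplified stand-in for `dc:creator` so that no namespace declaration is needed. When the parser receives raw bytes with no encoding declaration, it applies the UTF-8 default itself, independent of the locale.

```python
import xml.etree.ElementTree as ET

# No <?xml ... encoding="..."?> declaration, as in the EDAM.owl prolog.
raw = "<creator>Matúš Kalaš</creator>".encode("utf-8")

# Passing bytes (not a locale-decoded string) lets the parser apply
# the XML default encoding, UTF-8.
element = ET.fromstring(raw)

print(element.text)
```

This suggests the failure happens before the XML parser sees the data, at the point where the bytes are decoded into text.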

Actions #6

Updated by Peter Amstutz almost 5 years ago

  • Assigned To set to Peter Amstutz
Actions #7

Updated by Peter Amstutz almost 5 years ago

This might be a default encoding problem when reading from keep, not sure.

Actions #8

Updated by Peter Amstutz almost 5 years ago

  • Status changed from New to In Progress
Actions #9

Updated by Peter Amstutz almost 5 years ago

  • Target version changed from 2019-07-03 Sprint to 2019-07-17 Sprint
Actions #10

Updated by Peter Amstutz almost 5 years ago

To reproduce:

$ export LANG=C
$ arvados-cwl-runner formattest2.cwl formattest2-job.json 
INFO /home/peter/work/scripts/venv/bin/arvados-cwl-runner 1.4.0.20190627185953, arvados-python-client 1.4.0.20190627173408, cwltool 1.0.20190607183319
INFO Resolved 'formattest2.cwl' to 'file:///home/peter/work/common-workflow-language/v1.0/v1.0/formattest2.cwl'
INFO Upload local files: "ref.fasta" 
INFO Using collection 23b1d68b203d6c75f314fe9804f50c0e+59 (4xphq-4zz18-gdz4bibpfb0e5ko)
INFO Upload local files: "EDAM.owl" 
INFO Using collection 29dc87213e125b67355699e8953d3820+62 (4xphq-4zz18-qweb7yf0dbmqbir)
Could not load extension schema keep:29dc87213e125b67355699e8953d3820+62/EDAM.owl: 'ascii' codec can't decode byte 0xc3 in position 3352: ordinal not in range(128)
...
Actions #11

Updated by Peter Amstutz almost 5 years ago

15358-fetch-text-encoding @ 927d62b545e90676bf4729b6c1ebee56d51eacbe

Add encoding option to CollectionFsAccess.open() and use in fetch_text()

https://ci.curoverse.com/view/Developer/job/developer-run-tests/1366/
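
The shape of the change described above can be sketched as follows. This is not the code from the branch: `FsAccessSketch` is a hypothetical stand-in that reads local files rather than Keep collections, and the signatures are assumptions made for illustration. The idea is only that the open call accepts an encoding and that `fetch_text()` passes UTF-8 explicitly instead of relying on the locale default.

```python
import io
import os
import tempfile


class FsAccessSketch:
    """Hypothetical stand-in for a file access layer; not the Arvados class."""

    def open(self, path, mode="r", encoding=None):
        # Binary mode returns bytes and takes no encoding.
        if "b" in mode:
            return io.open(path, mode)
        # Text mode decodes with the encoding the caller supplies
        # (None falls back to the locale default).
        return io.open(path, mode, encoding=encoding)


def fetch_text(fs_access, path):
    """Read a document as text, always decoding it as UTF-8."""
    with fs_access.open(path, "r", encoding="utf-8") as handle:
        return handle.read()


# Usage: write UTF-8 bytes to a temporary file and read them back.
with tempfile.NamedTemporaryFile(suffix=".owl", delete=False) as tmp:
    tmp.write("<dc:creator>Matúš Kalaš</dc:creator>".encode("utf-8"))
    tmp_path = tmp.name

try:
    text = fetch_text(FsAccessSketch(), tmp_path)
finally:
    os.remove(tmp_path)

print(text)
```

Because the encoding is passed explicitly, the result does not depend on whether LANG is C or a UTF-8 locale.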

Actions #13

Updated by Peter Amstutz almost 5 years ago

  • Status changed from In Progress to Resolved
Actions #14

Updated by Peter Amstutz over 4 years ago

  • Related to Bug #15655: [CWL] encoding error when printing error log tail added
Actions #15

Updated by Peter Amstutz over 4 years ago

  • Release set to 22