Bug #17267

Can't run when $schema links are broken

Added by Peter Amstutz 9 months ago. Updated 8 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: CWL
Target version:
Start date: 01/18/2021
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

Arvados uploads URLs in $schemas and converts them into Keep references.

When it goes through and downloads the URLs, a broken link is a fatal error. This is a problem because these $schemas URLs are nonessential to running the workflow, and schema_salad can ignore broken URLs.

Arvados-cwl-runner needs to be able to ignore broken URLs in $schemas.


Subtasks

Task #17268: Review 17267-broken-schema-links (Resolved, Peter Amstutz)


Related issues

Related to Arvados - Bug #16477: arvados-cwl-runner is missing --skip-schemas (Resolved)

Related to Arvados - Bug #11257: [CWL] Fails with $schemas referencing remote resources. (Resolved)

Associated revisions

Revision 56816829
Added by Peter Amstutz 9 months ago

Merge branch '17267-broken-schema-links' refs #17267

Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <>

History

#1 Updated by Peter Amstutz 9 months ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 9 months ago

  • Related to Bug #16477: arvados-cwl-runner is missing --skip-schemas added

#3 Updated by Peter Amstutz 9 months ago

  • Related to Bug #11257: [CWL] Fails with $schemas referencing remote resources. added

#4 Updated by Peter Amstutz 9 months ago

  • Status changed from New to In Progress

#5 Updated by Peter Amstutz 9 months ago

  • Assigned To set to Peter Amstutz

#7 Updated by Lucas Di Pentima 9 months ago

  • Given that this issue is IMO pretty critical, I think it would be convenient to have it tested.
  • The changes to pathmapper.py's visit() seem to apply to callers other than the one processing the $schemas; maybe it needs an additional argument so that the caller can ask to ignore broken links in certain cases?
  • Related Q: Would a broken schema link make validation impossible, or do the schemas have other uses?

#8 Updated by Peter Amstutz 9 months ago

Lucas Di Pentima wrote:

  • Given that this issue is IMO pretty critical, I think it would be convenient to have it tested.

Yes, I wasn't quite sure if a unit test would make sense, but I could add an integration test.

  • The changes to pathmapper.py's visit() seem to apply to callers other than the one processing the $schemas; maybe it needs an additional argument so that the caller can ask to ignore broken links in certain cases?

So, this only applies to things that are referenced in the CWL document with HTTP links and that need to be downloaded on the fly into Arvados.

The prior behavior was that if an HTTP download failed, it would log an error and throw a fatal exception.

The new behavior is that if an HTTP download fails, it logs an error and does not add an entry to the "mapper". If something tries to look up the mapping without checking whether it exists first, it will throw a fatal exception.

So I think the behavior is functionally the same.
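
To make the prior/new behavior concrete, here is a rough sketch. It is not the actual pathmapper.py code: `fetch`, `build_mapper`, `DownloadError`, and the example URLs are all hypothetical stand-ins, and the "mapper" is modeled as a plain dict.

```python
import logging

logger = logging.getLogger("sketch")


class DownloadError(Exception):
    """Hypothetical stand-in for whatever error a failed HTTP fetch raises."""


def fetch(url):
    # Hypothetical downloader: pretend any URL containing "broken" fails.
    if "broken" in url:
        raise DownloadError("could not fetch %s" % url)
    return "keep:fake-locator/" + url.rsplit("/", 1)[-1]


def build_mapper(urls):
    """New behavior: log the failure and leave the URL out of the mapping."""
    mapper = {}
    for url in urls:
        try:
            mapper[url] = fetch(url)
        except DownloadError as e:
            logger.error("Download failed, skipping: %s", e)
    return mapper


mapper = build_mapper([
    "http://example.com/schema.rdf",
    "http://example.com/broken.rdf",
])

# A caller that checks for the entry first can simply skip the broken link.
has_broken = "http://example.com/broken.rdf" in mapper

# An unchecked lookup still fails hard, much like the old fatal exception.
try:
    mapper["http://example.com/broken.rdf"]
    unchecked_lookup_failed = False
except KeyError:
    unchecked_lookup_failed = True
```

In this model, only callers that explicitly check for a missing entry (such as the $schemas handling) tolerate the broken link; everything else still gets an exception.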

  • Related Q: Would a broken schema link make validation impossible, or do the schemas have other uses?

No, they are entirely optional. The only thing they are really used for is to check if your annotations appear in the vocabulary of the extension schema, and warn if not found. It is more to assist with spellchecking than anything.
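
As an illustration of that kind of check (a simplified sketch, not how schema_salad actually implements it; `unknown_annotations` and the vocabulary contents are made up for the example):

```python
import logging

logger = logging.getLogger("sketch")


def unknown_annotations(annotations, vocabulary):
    """Return annotation keys missing from the vocabulary, warning on each."""
    missing = [key for key in annotations if key not in vocabulary]
    for key in missing:
        logger.warning("'%s' not found in extension schema vocabulary", key)
    return missing


# Hypothetical vocabulary, as if it had been loaded from a $schemas document.
vocabulary = {"http://schema.org/author", "http://schema.org/name"}

annotations = {
    "http://schema.org/author": "...",
    "http://schema.org/nmae": "...",  # misspelled term: warned about, not fatal
}

missing = unknown_annotations(annotations, vocabulary)
```

The misspelled term only produces a warning, so losing the vocabulary to a broken link means losing the spellcheck and nothing else.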

#9 Updated by Peter Amstutz 9 months ago

17267-broken-schema-links @ b01ff2414daaf5fd8ff7f0e78ed49e63d431ccd3

Added integration test.

#10 Updated by Lucas Di Pentima 9 months ago

This LGTM, thanks!

#11 Updated by Peter Amstutz 9 months ago

  • Status changed from In Progress to Resolved

#12 Updated by Peter Amstutz 8 months ago

  • Release set to 37
