Metadata

Input and output documents gain parallel "input metadata" and "output metadata" documents.

The input metadata document has the same keys as the input document, but each key maps to metadata about that parameter instead of its value. For example, for this input:

foo:
  class: File
  location: keep:abc+123/foo.bam

the metadata might be:

foo:
  xyz:sample: sample1122333
  xyz:sequencer: illumina

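On workflow start, for each input, do a reverse lookup from its portable data hash (PDH) to the UUID of each collection with that PDH, and find all tags on those collections. Create the input metadata document from those tags.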
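
The lookup step can be sketched as follows. This is only an illustration: COLLECTIONS and TAG_LINKS are hypothetical in-memory stand-ins for the collection and link tables, the input collection UUID is made up, and the tag layout (value stored in properties[value]) is an assumption, not a settled design.

```python
# Hypothetical stand-ins for the collection and link tables (not real API calls).
COLLECTIONS = [
    {"uuid": "zzzzz-4zz18-aaaaaaaaaaaaaaa", "portable_data_hash": "abc+123"},
]
TAG_LINKS = [
    {"link_class": "tag", "name": "xyz:sample",
     "head_uuid": "zzzzz-4zz18-aaaaaaaaaaaaaaa",
     "properties": {"value": "sample1122333"}},
    {"link_class": "tag", "name": "xyz:sequencer",
     "head_uuid": "zzzzz-4zz18-aaaaaaaaaaaaaaa",
     "properties": {"value": "illumina"}},
]


def pdh_from_location(location):
    """Return the PDH part of a 'keep:PDH/path' location, or None."""
    if not location.startswith("keep:"):
        return None
    return location[len("keep:"):].split("/", 1)[0]


def build_input_metadata(input_doc):
    """Build the input metadata document from tags on matching collections."""
    metadata = {}
    for param, value in input_doc.items():
        # Only File and Directory parameters carry a keep: location.
        if not isinstance(value, dict):
            continue
        if value.get("class") not in ("File", "Directory"):
            continue
        pdh = pdh_from_location(value.get("location", ""))
        if pdh is None:
            continue
        # Reverse lookup: PDH -> UUIDs of all collections with that PDH.
        uuids = {c["uuid"] for c in COLLECTIONS
                 if c["portable_data_hash"] == pdh}
        # Gather every tag attached to any of those collections.
        metadata[param] = {
            link["name"]: link["properties"]["value"]
            for link in TAG_LINKS
            if link["link_class"] == "tag" and link["head_uuid"] in uuids
        }
    return metadata


input_doc = {"foo": {"class": "File", "location": "keep:abc+123/foo.bam"}}
input_metadata = build_input_metadata(input_doc)
```

If two collections with the same PDH carry conflicting tags, this sketch silently keeps whichever it sees last; a real implementation would need a policy for that case.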

On workflow end, create the output metadata document based on the "arv:PropagateMetadata" hint:

hints:
  arv:PropagateMetadata:
    setMetadata:
      - outputParameter: bar
        valueFrom: $(metadata.foo)

Resulting in output:

bar:
  class: File
  location: keep:abc+123/result.vcf

with output metadata:

bar:
  xyz:sample: sample1122333
  xyz:sequencer: illumina
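
A minimal sketch of how the setMetadata rules might be applied. It assumes valueFrom is always of the simple form $(metadata.NAME) as in the example; a real implementation would presumably go through the CWL expression evaluator. The function name propagate_metadata is hypothetical.

```python
import re

# Matches only the simple form "$(metadata.NAME)" used in the example.
_METADATA_REF = re.compile(r"^\$\(metadata\.([A-Za-z_][A-Za-z0-9_]*)\)$")


def propagate_metadata(hint, input_metadata):
    """Apply each setMetadata rule and return the output metadata document."""
    output_metadata = {}
    for rule in hint.get("setMetadata", []):
        match = _METADATA_REF.match(rule["valueFrom"].strip())
        if match is None:
            raise ValueError("unsupported expression: %r" % rule["valueFrom"])
        source = match.group(1)
        # Copy so later changes to the output do not alter the input metadata.
        output_metadata[rule["outputParameter"]] = dict(
            input_metadata.get(source, {}))
    return output_metadata


hint = {"setMetadata": [
    {"outputParameter": "bar", "valueFrom": "$(metadata.foo)"},
]}
input_metadata = {
    "foo": {"xyz:sample": "sample1122333", "xyz:sequencer": "illumina"},
}
output_metadata = propagate_metadata(hint, input_metadata)
```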

Arvados then creates tags on the output collection based on the output metadata document. There are several possible representations, each shown with its mapping to the RDF triple model:

link_class: tag
name: xyz:sample
head_uuid: zzzzz-4zz18-zzzzzzzzzzzzzzz
properties:
  value: sample1122333
  path: result.vcf

RDF mapping:

  • head_uuid+properties[path] -> subject
  • name -> predicate
  • properties[value] -> object
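
As an illustration, link records in this first representation could be derived from the output and output metadata documents roughly like this. The function name tag_links_for_output is hypothetical, and the sketch assumes every output location has the form keep:PDH/path.

```python
def tag_links_for_output(output_doc, output_metadata, collection_uuid):
    """Return one tag link record per (output parameter, metadata key) pair."""
    links = []
    for param, tags in output_metadata.items():
        location = output_doc[param]["location"]
        # "keep:abc+123/result.vcf" -> "result.vcf"; empty if there is no path.
        path = location.split("/", 1)[1] if "/" in location else ""
        for name, value in tags.items():
            links.append({
                "link_class": "tag",
                "name": name,
                "head_uuid": collection_uuid,
                "properties": {"value": value, "path": path},
            })
    return links


output_doc = {"bar": {"class": "File", "location": "keep:abc+123/result.vcf"}}
output_metadata = {
    "bar": {"xyz:sample": "sample1122333", "xyz:sequencer": "illumina"},
}
links = tag_links_for_output(
    output_doc, output_metadata, "zzzzz-4zz18-zzzzzzzzzzzzzzz")
```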

Alternate representation:

link_class: xyz:samplename
name: sample1122333
head_uuid: zzzzz-4zz18-zzzzzzzzzzzzzzz
properties:
  path: result.vcf

RDF mapping:

  • head_uuid+properties[path] -> subject
  • link_class -> predicate
  • name -> object

Another alternate representation:

link_class: xyz:samplename
name: result.vcf
head_uuid: zzzzz-4zz18-zzzzzzzzzzzzzzz
properties:
  value: sample1122333

RDF mapping:

  • head_uuid+name -> subject
  • link_class -> predicate
  • properties[value] -> object
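
The three mappings can be compared side by side with a small sketch. The function names are hypothetical, and representing the subject as a (head_uuid, path) tuple is an assumption; the notes do not say how the two parts are combined into a subject identifier.

```python
def triple_from_tag_value(link):
    """First representation: name is the predicate, properties[value] the object."""
    subject = (link["head_uuid"], link["properties"]["path"])
    return (subject, link["name"], link["properties"]["value"])


def triple_from_class_name(link):
    """Second representation: link_class is the predicate, name the object."""
    subject = (link["head_uuid"], link["properties"]["path"])
    return (subject, link["link_class"], link["name"])


def triple_from_class_path(link):
    """Third representation: name holds the path, properties[value] the object."""
    subject = (link["head_uuid"], link["name"])
    return (subject, link["link_class"], link["properties"]["value"])


UUID = "zzzzz-4zz18-zzzzzzzzzzzzzzz"
first = triple_from_tag_value({
    "link_class": "tag", "name": "xyz:sample", "head_uuid": UUID,
    "properties": {"value": "sample1122333", "path": "result.vcf"}})
second = triple_from_class_name({
    "link_class": "xyz:samplename", "name": "sample1122333",
    "head_uuid": UUID, "properties": {"path": "result.vcf"}})
third = triple_from_class_path({
    "link_class": "xyz:samplename", "name": "result.vcf",
    "head_uuid": UUID, "properties": {"value": "sample1122333"}})
```

With the example records as written, the second and third representations produce the same triple. The first agrees on subject and object and differs only in predicate, because its example uses xyz:sample where the others use xyz:samplename.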