Story #14151

Extend vocabulary support for properties to support strong identifiers and multiple labels

Added by Tom Morris 6 months ago. Updated 4 months ago.

In #12479, support was added for controlled vocabularies of tags and values when setting and updating properties on collections. On the backend, however, that feature just uses the existing string-based properties, which means there is no easy way to track vocabulary label changes, alternate labels, multiple languages, etc.

As a system administrator, I would like the ability to configure the vocabulary that the users are allowed to use when editing properties and be able to update the labels while retaining the same concept identifiers.

As a user, I would like the ability to search on any of a number of alternate terms for a common concept. For example, I'd like to be able to search for either "human" or "homo sapiens" in the context of "species" and have it return the same concept.
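A minimal sketch of the lookup this implies, assuming a simple in-memory index. The concept ids, the `LABEL_INDEX` table, and the `resolve_concept` helper are all hypothetical and only illustrate that several labels in one property context can point at the same concept:

```python
# Hypothetical label index: (property key, normalized label) -> concept id.
# The ids and labels below are made up for illustration.
LABEL_INDEX = {
    ("species", "human"): "concept:0001",
    ("species", "homo sapiens"): "concept:0001",
    ("species", "mouse"): "concept:0002",
    ("species", "mus musculus"): "concept:0002",
}


def resolve_concept(property_key, label):
    """Return the concept id for a label within a property, or None."""
    # Normalize case and collapse repeated whitespace before the lookup.
    normalized = " ".join(label.lower().split())
    return LABEL_INDEX.get((property_key, normalized))


# Both spellings resolve to the same (hypothetical) concept id.
print(resolve_concept("species", "Human"))
print(resolve_concept("species", "homo  sapiens"))
```

Because the lookup is keyed on the property as well as the label, the same label could in principle map to different concepts under different properties.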

I would also like the option of viewing the concept's definition, preferred label, and other related information to help make sure I'm choosing the correct term. For hierarchical vocabularies, it would be desirable to match all child concepts of a parent concept when searching. The need for the additional capabilities in this paragraph is TBD.

Design sketch:

  • UI implemented in Workbench2
  • Vocabulary terms (predicates/property keys and values) have an id and one or more labels. There should be a way of deciding on a preferred or primary label for display.
  • Properties have a range of valid values (requires some sort of schema)
  • Restrict label search on property range and language
  • When user starts typing, search labels (possibly also description/definition text) and present an autocomplete list
  • Alternatively, the user may start from a full list with the most commonly used terms at the top
  • When user selects desired label, store the id in the property
  • When viewing properties, display the primary label associated with an id for the current language
  • Duplicate labels may be permissible (???) provided they can be differentiated some other way (different languages? mapping to ids in different property ranges?)
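
One way the "store the id, display the primary label" part of the sketch could look, as a rough illustration only. The `Term` class, the `VOCABULARY` table, the ids, and the rule that the first label per language is the preferred one are all assumptions, not decisions made in this story:

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A vocabulary term: a stable id plus labels grouped by language.

    Assumption: the first label in each language list is the preferred one.
    """
    id: str
    labels: dict = field(default_factory=dict)  # e.g. {"en": ["Human", "Homo sapiens"]}

    def preferred_label(self, lang):
        """Return the preferred label for lang, or None if there is none."""
        candidates = self.labels.get(lang, [])
        return candidates[0] if candidates else None


# Hypothetical vocabulary keyed by term id (keys and values are both terms).
VOCABULARY = {
    "IDTAGSPECIES": Term("IDTAGSPECIES", {"en": ["Species"]}),
    "IDVALHUMAN": Term("IDVALHUMAN", {"en": ["Human", "Homo sapiens"], "fr": ["Humain"]}),
}


def display_properties(properties, lang):
    """Map stored {key_id: value_id} pairs to labels for display.

    Ids with no label in lang are shown as the raw id. That fallback is
    only a placeholder; the story leaves this case undecided.
    """
    def show(term_id):
        term = VOCABULARY.get(term_id)
        label = term.preferred_label(lang) if term else None
        return label if label is not None else term_id

    return {show(key): show(value) for key, value in properties.items()}


# The property stores ids only; labels are looked up at display time.
stored = {"IDTAGSPECIES": "IDVALHUMAN"}
print(display_properties(stored, "en"))  # {'Species': 'Human'}
```

Since only ids are stored, renaming "Human" in the vocabulary would change what is displayed without touching any collection's properties.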

To be decided:

  • How is schema / range of valid values for a property expressed (RDFS/OWL or something else?)
  • API of backend service. Something like: "return ranked list of terms that match search query X for property range Y in language Z" or "list all terms for property range Y"
  • What to do when a localized label isn't available
  • Backend technology. Some options: part of the API server, a new PostgreSQL table, a full-text search engine such as Elasticsearch, a third-party vocabulary service, a triple store?
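
To make the API question more concrete, here is a rough in-memory sketch of the two calls mentioned above. The function names, the record layout, and the ranking rule (prefix matches first, then preferred labels, then alphabetical) are assumptions for discussion, not a proposed design:

```python
# Hypothetical term records: (id, property range, language, label, is_preferred)
TERMS = [
    ("c1", "species", "en", "Human", True),
    ("c1", "species", "en", "Homo sapiens", False),
    ("c2", "species", "en", "Mouse", True),
    ("c2", "species", "en", "House mouse", False),
    ("c1", "species", "fr", "Humain", True),
    ("c3", "tissue", "en", "Heart", True),
]


def search_terms(query, prop_range, lang):
    """Return a ranked list of (id, matched label) for a range and language.

    Illustrative ranking: prefix matches before other substring matches,
    preferred labels before alternates, then alphabetical. Each id appears
    once, with its best-ranked label.
    """
    q = query.lower()
    hits = []
    for term_id, rng, lng, label, preferred in TERMS:
        if rng != prop_range or lng != lang:
            continue
        pos = label.lower().find(q)
        if pos < 0:
            continue
        # Tuples sort with False before True, so lower tuples rank higher.
        hits.append((pos != 0, not preferred, label.lower(), term_id, label))
    hits.sort()

    seen, ranked = set(), []
    for _, _, _, term_id, label in hits:
        if term_id not in seen:
            seen.add(term_id)
            ranked.append((term_id, label))
    return ranked


def list_terms(prop_range, lang):
    """List all terms for a range; the empty query matches every label."""
    return search_terms("", prop_range, lang)


print(search_terms("ho", "species", "en"))
print(list_terms("species", "en"))
```

A real backend would presumably push the filtering and ranking into whatever store is chosen; the point here is only the shape of the request (query, property range, language) and of the response (ranked ids with the label that matched).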

Related issues

Related to Arvados - Story #12479: [Workbench] Extend tag/property editing to support a structured vocabulary (Resolved, 2017-10-24)


#1 Updated by Peter Amstutz 6 months ago

This seems like an obvious application for an ontology.

#2 Updated by Tom Morris 6 months ago

  • Target version changed from Arvados Future Sprints to To Be Groomed

#3 Updated by Peter Amstutz 5 months ago

  • Description updated (diff)

#4 Updated by Peter Amstutz 5 months ago

  • Description updated (diff)

#5 Updated by Peter Amstutz 5 months ago

  • Description updated (diff)

#7 Updated by Peter Amstutz 4 months ago

  • Related to Story #12479: [Workbench] Extend tag/property editing to support a structured vocabulary added
