Feature #18651

Design for helper methods for working with vocabularies

Added by Peter Amstutz 4 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
SDKs
Target version:
Start date:
02/02/2022
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

Design sketch to support using the vocabulary in the Python SDK, for review.

A new class, Vocabulary, is initialized from the vocabulary JSON. It can also be initialized from an API object, in which case it fetches the vocabulary file itself.

The Vocabulary has a dict, "key_aliases".

The key_aliases dict has an entry for every alias of every key, as well as for the formal identifiers. Each value is an object of type VocabularyKey.

The VocabularyKey class has fields "identifier" (string), "aliases" (list of strings) and "value_aliases".

The "value_aliases" field has an entry for every alias of every value associated with this key, as well as for the formal identifiers. Each value is an object of type VocabularyValue.

The VocabularyValue class has fields "identifier" (string) and "aliases" (list of strings).
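A minimal sketch of the three classes described above, for discussion only. It assumes the vocabulary JSON has already been parsed into a dict with a "tags" / "labels" / "values" layout; that layout, the constructor signatures, and the sample identifiers are assumptions of this sketch, not part of the design. Lookups here are exact-match.

```python
class VocabularyValue:
    """One value of a key: formal identifier plus its aliases."""

    def __init__(self, identifier, aliases):
        self.identifier = identifier
        self.aliases = aliases


class VocabularyKey:
    """One key: formal identifier, aliases, and an index of its values."""

    def __init__(self, identifier, aliases, values):
        self.identifier = identifier
        self.aliases = aliases
        self.value_aliases = {}
        for value in values:
            # Index each value under its identifier and under every alias.
            for name in [value.identifier, *value.aliases]:
                self.value_aliases[name] = value


class Vocabulary:
    def __init__(self, vocab_json):
        # Assumed layout: {"tags": {key_id: {"labels": [...], "values": {...}}}}
        self.key_aliases = {}
        for key_id, key_def in vocab_json.get("tags", {}).items():
            values = [
                VocabularyValue(
                    value_id,
                    [entry["label"] for entry in value_def.get("labels", [])],
                )
                for value_id, value_def in key_def.get("values", {}).items()
            ]
            aliases = [entry["label"] for entry in key_def.get("labels", [])]
            key = VocabularyKey(key_id, aliases, values)
            # Index each key under its identifier and under every alias.
            for name in [key_id, *aliases]:
                self.key_aliases[name] = key


# Hypothetical sample data matching the assumed layout.
vocab_json = {
    "tags": {
        "id123": {
            "labels": [{"label": "species"}, {"label": "animal"}],
            "values": {
                "id456": {"labels": [{"label": "homo sapiens"}, {"label": "human"}]},
            },
        },
    },
}

vocab = Vocabulary(vocab_json)
print(vocab.key_aliases["animal"].identifier)  # id123
print(vocab.key_aliases["id123"].value_aliases["human"].identifier)  # id456
```

Every alias and the identifier point at the same VocabularyKey or VocabularyValue object, so a lookup by any name yields the full set of aliases.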

Support case-insensitive lookup of aliases. Suggest indexing aliases as all-lowercase, so that lookups check the lowercased version of the requested name.
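One possible way to do the lowercase indexing, as a rough sketch. AliasIndex is a hypothetical helper name used only for illustration; the design itself only calls for a dict.

```python
class AliasIndex:
    """Hypothetical helper: a mapping whose keys are matched case-insensitively."""

    def __init__(self):
        self._entries = {}

    def add(self, alias, target):
        # Store every alias in lowercase so lookups can ignore case.
        self._entries[alias.lower()] = target

    def __getitem__(self, alias):
        # Lowercase the requested name before looking it up.
        return self._entries[alias.lower()]

    def __contains__(self, alias):
        return alias.lower() in self._entries


index = AliasIndex()
index.add("Homo sapiens", "id456")
index.add("Human", "id456")

print(index["HUMAN"])           # id456
print("homo SAPIENS" in index)  # True
```

If the formal identifiers go through the same index, they become case-insensitive too; whether that is desirable is left open here.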

This is intended to make it easy to convert "properties" between aliases and formal identifiers. There will be Vocabulary.convert_to_labels() and Vocabulary.convert_to_identifiers() methods. The first normalizes to the human-readable labels, the second normalizes to the machine identifiers.

The basic usage would be something like this:

vocab = Vocabulary(vocab_json)

vocabkey = vocab.key_aliases["species"]
print(vocabkey.identifier)  # "id123" 
print(vocabkey.aliases)     # ["species", "animal"]

vocabvalue = vocabkey.value_aliases["human"]
print(vocabvalue.identifier)  # "id456" 
print(vocabvalue.aliases)     # ["homo sapiens", "human"]

vocab.convert_to_identifiers({"animal": "human"})
# -> {"id123": "id456"}

vocab.convert_to_labels({"id123": "id456"})
# -> {"species": "homo sapiens"}
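The two conversions could be sketched roughly as follows. To keep it short, this version uses plain functions over hand-built lookup tables rather than the Vocabulary class; the table names are hypothetical. It also assumes that the first alias is the preferred label and that names not found in the vocabulary pass through unchanged, neither of which is settled above.

```python
# Hypothetical lookup tables, hand-built to keep the sketch short.
# Lowercased alias or identifier -> key identifier.
KEY_IDS = {"id123": "id123", "species": "id123", "animal": "id123"}
# Key identifier -> {lowercased alias or identifier -> value identifier}.
VALUE_IDS = {"id123": {"id456": "id456", "homo sapiens": "id456", "human": "id456"}}
# Identifier -> preferred label (assumed to be the first alias).
KEY_LABELS = {"id123": "species"}
VALUE_LABELS = {"id123": {"id456": "homo sapiens"}}


def convert_to_identifiers(properties):
    """Return a copy of properties with known names replaced by identifiers."""
    result = {}
    for key, value in properties.items():
        key_id = KEY_IDS.get(key.lower())
        if key_id is None:
            # Assumption: keys outside the vocabulary pass through unchanged.
            result[key] = value
            continue
        if isinstance(value, str):
            # Unknown values of a known key also pass through unchanged.
            value = VALUE_IDS[key_id].get(value.lower(), value)
        result[key_id] = value
    return result


def convert_to_labels(properties):
    """Return a copy of properties with known names replaced by preferred labels."""
    result = {}
    # Normalize to identifiers first so that any alias is accepted as input.
    for key_id, value_id in convert_to_identifiers(properties).items():
        label = KEY_LABELS.get(key_id, key_id)
        if isinstance(value_id, str):
            value_id = VALUE_LABELS.get(key_id, {}).get(value_id, value_id)
        result[label] = value_id
    return result


print(convert_to_identifiers({"animal": "human"}))  # {'id123': 'id456'}
print(convert_to_labels({"id123": "id456"}))        # {'species': 'homo sapiens'}
```

Going through identifiers first means convert_to_labels() also turns a non-preferred alias such as "animal" into the preferred label "species".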

Additional thought: could add an indexer on the object so that you can just write vocab["species"]["human"].identifier.
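The indexer idea might look something like this, using __getitem__ on both classes. The classes are cut-down stubs with prebuilt indexes passed straight to the constructors, which is an assumption made only to keep the sketch short.

```python
class VocabularyValue:
    def __init__(self, identifier, aliases):
        self.identifier = identifier
        self.aliases = aliases


class VocabularyKey:
    def __init__(self, identifier, aliases, value_aliases):
        self.identifier = identifier
        self.aliases = aliases
        self.value_aliases = value_aliases  # lowercased name -> VocabularyValue

    def __getitem__(self, alias):
        # key["human"] is shorthand for key.value_aliases["human"].
        return self.value_aliases[alias.lower()]


class Vocabulary:
    def __init__(self, key_aliases):
        # Stub constructor: takes a prebuilt index instead of the vocabulary JSON.
        self.key_aliases = key_aliases  # lowercased name -> VocabularyKey

    def __getitem__(self, alias):
        # vocab["species"] is shorthand for vocab.key_aliases["species"].
        return self.key_aliases[alias.lower()]


human = VocabularyValue("id456", ["homo sapiens", "human"])
species = VocabularyKey(
    "id123",
    ["species", "animal"],
    {"id456": human, "homo sapiens": human, "human": human},
)
vocab = Vocabulary({"id123": species, "species": species, "animal": species})

print(vocab["species"]["human"].identifier)  # id456
```

An unknown name raises KeyError, the same as indexing the underlying dict directly.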

Suggestion from Tom: wrapper methods which automatically translate the properties field to/from identifiers on get, list, create, update, etc.


Subtasks

Task #18653: group design/review (Resolved)

History

#1 Updated by Peter Amstutz 4 months ago

  • Description updated (diff)

#2 Updated by Peter Amstutz 4 months ago

  • Description updated (diff)

#3 Updated by Lucas Di Pentima 4 months ago

Looks good to me. Some thoughts:

  • Label matching may need to be case-insensitive.
  • Is this helper meant to be used directly by PySDK users or would it exclusively be an internal component like the BlockManager class? If the former is the case, we might need to think about additional features like listing all available key preferred labels (the first one of every key id) and searching with/without synonyms.

#4 Updated by Peter Amstutz 4 months ago

  • Status changed from New to In Progress

#5 Updated by Peter Amstutz 4 months ago

  • Description updated (diff)

Lucas Di Pentima wrote:

Looks good to me. Some thoughts:

  • Label matching may need to be case-insensitive.

Good idea, I added it.

  • Is this helper meant to be used directly by PySDK users or would it exclusively be an internal component like the BlockManager class? If the former is the case, we might need to think about additional features like listing all available key preferred labels (the first one of every key id) and searching with/without synonyms.

Directly by PySDK users.

Do you have suggestions about other use cases? I'm mainly looking at translation between the aliases and the formal identifier.

#6 Updated by Peter Amstutz 4 months ago

  • Description updated (diff)

#7 Updated by Peter Amstutz 4 months ago

  • Status changed from In Progress to Resolved
