Feature #18651
closed
Design for helper methods for working with vocabularies
Description
Design sketch to support using the vocabulary in the Python SDK, for review.
New class called Vocabulary, which is initialized from the vocabulary JSON. It can also be initialized from an API object, in which case it fetches the vocabulary file itself.
The Vocabulary has a dict, "key_aliases".
The key_aliases dict has an entry for every alias of every key, as well as for the formal identifiers. Each value is an object of type VocabularyKey.
The VocabularyKey class has fields "identifier" (string), "aliases" (list of strings) and "value_aliases".
The "value_aliases" field has an entry for every alias of every value associated with this key, as well as for the formal identifiers. Each value is an object of type VocabularyValue.
The VocabularyValue class has fields "identifier" (string) and "aliases" (list of strings).
Support case-insensitive lookup of aliases. Suggest indexing aliases as all-lowercase, so that a lookup lowercases the requested name before checking the index.
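The structure described above could be sketched roughly as follows. This is only an illustration, not the SDK implementation: the layout of SAMPLE_VOCAB is a simplified, hypothetical stand-in for the real vocabulary JSON, and the lookup_key() and lookup_value() helpers are invented names that are not part of the proposal.

```python
# Hypothetical, simplified vocabulary layout (NOT the real vocabulary JSON):
# key identifier -> {"aliases": [...], "values": {value identifier -> {"aliases": [...]}}}
SAMPLE_VOCAB = {
    "id123": {
        "aliases": ["species", "animal"],
        "values": {"id456": {"aliases": ["homo sapiens", "human"]}},
    },
}


class VocabularyValue:
    def __init__(self, identifier, aliases):
        self.identifier = identifier
        self.aliases = aliases


class VocabularyKey:
    def __init__(self, identifier, aliases, values):
        self.identifier = identifier
        self.aliases = aliases
        # Index every value under its identifier and each alias, lowercased.
        self.value_aliases = {}
        for value in values:
            for name in [value.identifier] + value.aliases:
                self.value_aliases[name.lower()] = value

    def lookup_value(self, name):
        # Hypothetical helper: case-insensitive lookup, None if unknown.
        return self.value_aliases.get(name.lower())


class Vocabulary:
    def __init__(self, vocab):
        # Index every key under its identifier and each alias, lowercased.
        self.key_aliases = {}
        for key_id, key_def in vocab.items():
            values = [
                VocabularyValue(value_id, value_def["aliases"])
                for value_id, value_def in key_def["values"].items()
            ]
            key = VocabularyKey(key_id, key_def["aliases"], values)
            for name in [key_id] + key_def["aliases"]:
                self.key_aliases[name.lower()] = key

    def lookup_key(self, name):
        # Hypothetical helper: case-insensitive lookup, None if unknown.
        return self.key_aliases.get(name.lower())
```

Because both the index and the query are lowercased, Vocabulary(SAMPLE_VOCAB).lookup_key("Species") and lookup_key("ANIMAL") return the same VocabularyKey object.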
This is intended to make it easy to convert "properties" between aliases and formal identifiers. There will be Vocabulary.convert_to_labels() and Vocabulary.convert_to_identifiers() methods. The first normalizes to the human-readable labels, the second normalizes to the machine identifiers.
The basic usage would be something like this:

    vocab = Vocabulary(vocab_json)

    vocabkey = vocab.key_aliases["species"]
    print(vocabkey.identifier)   # "id123"
    print(vocabkey.aliases)      # ["species", "animal"]

    vocabvalue = vocabkey.value_aliases["human"]
    print(vocabvalue.identifier)  # "id456"
    print(vocabvalue.aliases)     # ["homo sapiens", "human"]

    vocab.convert_to_identifiers({"animal": "human"})  # -> {"id123": "id456"}
    vocab.convert_to_labels({"id123": "id456"})        # -> {"species": "homo sapiens"}
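One way the two conversions might work, written as standalone functions over a plain dict to keep the example short. Several things here are assumptions rather than part of the design: the VOCAB layout is a simplified, hypothetical stand-in for the vocabulary JSON, the first alias is treated as the preferred label, unknown keys and values pass through unchanged, and only string values are translated.

```python
# Hypothetical, simplified vocabulary layout (NOT the real vocabulary JSON).
VOCAB = {
    "id123": {
        "aliases": ["species", "animal"],
        "values": {"id456": {"aliases": ["homo sapiens", "human"]}},
    },
}


def _build_index(vocab):
    """Map each lowercased key name to (key identifier, value index)."""
    index = {}
    for key_id, key_def in vocab.items():
        value_index = {}
        for value_id, value_def in key_def["values"].items():
            for name in [value_id] + value_def["aliases"]:
                value_index[name.lower()] = value_id
        for name in [key_id] + key_def["aliases"]:
            index[name.lower()] = (key_id, value_index)
    return index


def convert_to_identifiers(vocab, properties):
    """Replace known key/value aliases with their formal identifiers."""
    index = _build_index(vocab)
    result = {}
    for key, value in properties.items():
        entry = index.get(key.lower())
        if entry is None:
            result[key] = value  # assumption: unknown keys pass through
            continue
        key_id, value_index = entry
        if isinstance(value, str):
            value = value_index.get(value.lower(), value)
        result[key_id] = value
    return result


def convert_to_labels(vocab, properties):
    """Replace known keys/values with their first alias (assumed preferred)."""
    index = _build_index(vocab)
    result = {}
    for key, value in properties.items():
        entry = index.get(key.lower())
        if entry is None:
            result[key] = value  # assumption: unknown keys pass through
            continue
        key_id, value_index = entry
        key_def = vocab[key_id]
        if isinstance(value, str):
            value_id = value_index.get(value.lower())
            if value_id is not None:
                value = key_def["values"][value_id]["aliases"][0]
        # Assumes every key and value has at least one alias.
        result[key_def["aliases"][0]] = value
    return result
```

With this sample data, convert_to_identifiers(VOCAB, {"animal": "human"}) gives {"id123": "id456"} and convert_to_labels(VOCAB, {"id123": "id456"}) gives {"species": "homo sapiens"}, matching the usage shown above.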
Additional thought: could add an indexer on the object so that you can just write vocab["species"]["human"].identifier.
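A minimal sketch of the indexer idea, assuming it is done with Python's __getitem__ and delegates to the alias dicts. The constructors here take prebuilt dicts purely to keep the example short; that signature is an assumption, not the proposed one.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyValue:
    identifier: str
    aliases: list


@dataclass
class VocabularyKey:
    identifier: str
    aliases: list
    value_aliases: dict = field(default_factory=dict)

    def __getitem__(self, name):
        # Lowercase the name so the indexer is case-insensitive too.
        return self.value_aliases[name.lower()]


class Vocabulary:
    def __init__(self, key_aliases):
        # Assumption: key_aliases is already indexed by lowercased names.
        self.key_aliases = key_aliases

    def __getitem__(self, name):
        return self.key_aliases[name.lower()]


# Build the example from the description by hand.
human = VocabularyValue("id456", ["homo sapiens", "human"])
species = VocabularyKey(
    "id123",
    ["species", "animal"],
    {"id456": human, "homo sapiens": human, "human": human},
)
vocab = Vocabulary({"id123": species, "species": species, "animal": species})

print(vocab["species"]["human"].identifier)  # id456
```

An unknown name raises KeyError, the same as indexing key_aliases directly.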
Suggestion from Tom: wrapper methods which automatically translate the properties field to/from identifiers on get, list, create, update, etc.
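The wrapper idea might reduce to one small helper applied around each call. The sketch below is hypothetical throughout: the record is a plain dict standing in for an API object, and convert stands in for Vocabulary.convert_to_identifiers or Vocabulary.convert_to_labels. No real SDK calls are made.

```python
def with_converted_properties(record, convert):
    """Return a shallow copy of record with "properties" passed through convert.

    Hypothetical helper: record is a plain dict standing in for an API object,
    and convert is any callable mapping a properties dict to a properties dict.
    """
    updated = dict(record)
    if "properties" in updated:
        updated["properties"] = convert(updated["properties"])
    return updated


def toy_convert(properties):
    # Toy converter, used only so the example runs on its own.
    mapping = {"animal": "id123", "human": "id456"}
    return {mapping.get(k, k): mapping.get(v, v) for k, v in properties.items()}


record = {"name": "example", "properties": {"animal": "human"}}
translated = with_converted_properties(record, toy_convert)
```

A create or update wrapper would apply the identifier conversion before sending the record, and a get or list wrapper would apply the label conversion to each returned record.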
Updated by Lucas Di Pentima almost 3 years ago
Looks good to me. Some thoughts:
- Label matching may need to be case-insensitive.
- Is this helper meant to be used directly by PySDK users or would it exclusively be an internal component like the BlockManager class? If the former is the case, we might need to think about additional features like listing all available key preferred labels (the first one of every key id) and searching with/without synonyms.
Updated by Peter Amstutz almost 3 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz almost 3 years ago
- Description updated (diff)
Lucas Di Pentima wrote:
Looks good to me. Some thoughts:
- Label matching may need to be case-insensitive.
Good idea, I added it.
- Is this helper meant to be used directly by PySDK users or would it exclusively be an internal component like the BlockManager class? If the former is the case, we might need to think about additional features like listing all available key preferred labels (the first one of every key id) and searching with/without synonyms.
Directly by PySDK users.
Do you have suggestions about other use cases? I'm mainly looking at translation between the aliases and the formal identifier.
Updated by Peter Amstutz almost 3 years ago
- Status changed from In Progress to Resolved