
Keyword Spotting (KWS)

Released Plugins

All of our current KWS plugins match the description on this page. Refer to the list below for the currently supported Keyword Spotting plugins.

Legacy Plugins (OLIVE 4.x Compatible)

OLIVE Version | Plugin | Description
OLIVE 4.12+ | kws-dynapy-v1 | DynaPy-based keyword spotting plugin improving the overall KWS system infrastructure and performance.

Description

Keyword Spotting plugins allow users to detect and label targeted keywords or keyphrases that are defined by text input from the user at score time. KWS is based on automatic speech recognition (ASR) technology, featuring speech-to-text transcription and language modeling. This means that every KWS domain is language dependent, and that keywords can only be detected if they exist in the underlying ASR system's dictionary, making out-of-vocabulary keywords a potential problem.

Inputs

An audio file or buffer and a list of desired keywords or keyphrases to detect.

Outputs

When one or more of the requested keywords is detected in the submitted audio, KWS returns a region or list of timestamped regions, each with a score for the keyword that was detected.
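To make the shape of this output concrete, the following is a minimal sketch of how a client might hold and filter the returned regions. It is not the OLIVE API: the `KeywordRegion` class, its field names, the `filter_regions` helper, the example values, and the score scale are all hypothetical and chosen only for illustration. Refer to the OLIVE API documentation for the actual message types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordRegion:
    """Hypothetical container for one detection (not an OLIVE type)."""
    keyword: str    # keyword or keyphrase that was detected
    start_s: float  # region start, in seconds from the start of the audio
    end_s: float    # region end, in seconds
    score: float    # detection score; the scale here is an assumption


def filter_regions(regions, min_score):
    """Keep regions scoring at least min_score, ordered by start time."""
    kept = [r for r in regions if r.score >= min_score]
    return sorted(kept, key=lambda r: r.start_s)


# Made-up example detections, purely illustrative.
regions = [
    KeywordRegion("weather report", 12.40, 13.35, 4.2),
    KeywordRegion("airport", 3.10, 3.72, 1.1),
    KeywordRegion("airport", 27.85, 28.50, 6.7),
]

for r in filter_regions(regions, min_score=2.0):
    print(f"{r.keyword}: {r.start_s:.2f}s to {r.end_s:.2f}s (score {r.score})")
```

The same keyword can appear in several regions, so downstream code should treat the result as a list of detections rather than a keyword-to-region mapping.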

Enrollments

KWS plugins do not support enrollments; instead, the set of classes the plugin is searching for is provided as a list of text keywords with each scoring request. Note that each domain is language dependent, and that a word or phrase in a language other than the one the domain was trained on is likely to be out-of-vocabulary. This means the word or phrase will be difficult or impossible to detect. For details on how to set or pass these keywords, please refer to the appropriate sections within the OLIVE CLI User Guide or OLIVE API Documentation.

Limitations

As mentioned above, traditional KWS relies on an underlying ASR system, making each KWS domain completely language dependent. This places several practical restrictions on users. First, for keywords to be detectable, the words must be part of the ASR system's dictionary. This may make it difficult to find some keywords or phrases, such as names, brands, slang, or other colloquialisms, if they are out-of-vocabulary. It also makes it more difficult to handle speakers or situations that involve code switching. Finally, the reliance on ASR makes KWS plugins quite heavy with respect to resource requirements, and quite slow compared to other plugin types.
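The out-of-vocabulary restriction can be illustrated with a small pre-check that a user might run before submitting keywords. This is only a sketch under stated assumptions: it assumes the user has some word list standing in for the ASR dictionary (the plugin may not expose its dictionary in this form), and it uses lowercase whitespace tokenization, which will not match every language or ASR front end. The function name `split_by_vocabulary` and the example word list are hypothetical.

```python
def split_by_vocabulary(keywords, vocabulary):
    """Separate keyphrases whose words are all in `vocabulary` from those
    with at least one missing word.

    Returns (in_vocab, oov), where oov maps each rejected phrase to the
    list of its words that were not found.
    Assumption: lowercase whitespace tokenization approximates the ASR
    system's own word segmentation.
    """
    vocab = {w.lower() for w in vocabulary}
    in_vocab, oov = [], {}
    for phrase in keywords:
        tokens = phrase.lower().split()
        if not tokens:
            continue  # ignore empty entries
        missing = [t for t in tokens if t not in vocab]
        if missing:
            oov[phrase] = missing
        else:
            in_vocab.append(phrase)
    return in_vocab, oov


# Hypothetical stand-in for an ASR dictionary.
word_list = {"weather", "report", "airport", "dosage"}
requested = ["weather report", "Zyrtec dosage", "airport"]

usable, rejected = split_by_vocabulary(requested, word_list)
print("likely detectable:", usable)
print("contains out-of-vocabulary words:", rejected)
```

A multi-word keyphrase is treated as out-of-vocabulary when any one of its words is missing, which mirrors the point above that brand names and similar terms can make an otherwise ordinary phrase undetectable.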

An additional limitation stemming from the language dependence of the system is that if a user would like to detect keywords in a new language that isn't currently offered by SRI, a new domain for that language would need to be created, which requires a large amount of transcribed audio in that language.

Interface

For command line interface use see the appropriate section of the OLIVE CLI User Guide. For API usage see the appropriate section of the OLIVE Application Programming Interface Guide.