OLIVE API Message Protocol Documentation


olive.proto Protocol Buffer Definitions

The messages defined on this page make up the OLIVE Enterprise API. A client application uses these messages to interact with an OLIVE server and to submit tasks to it. The message format used by the OLIVE API is based on Google Protocol Buffers.

For more information on how to integrate the ability to send and receive these messages into a client application using a provided Client API implementation from SRI, refer to the Integrating the (Java) Client API Guide.

For more information on creating your own reference implementation with the functionality of these messages, head over to the Creating an API Reference Implementation page that offers guidelines and information.

Server Management Messages

GetActiveRequest

Message to request the list of ScenicMessages that are still active

GetActiveResult

Response to GetActiveRequest containing the ScenicMessages that are still active

Field Type Label Description
message_id string  repeated  List containing the IDs of each message still being processed on the server
total_num  string required Total number of messages still being processed

GetStatusRequest

Request a simple server status message, similar to a heartbeat except the request reply is sent on the status port and is requested by the client

GetStatusResult

The result of a GetStatusRequest

Field Type Label Description
num_pending uint32 required The number of pending jobs
num_busy uint32 required The number of active jobs
num_finished uint32 required The number of finished jobs

Heartbeat

A heartbeat message, acknowledging that the server is running. This message is continuously broadcast by the server on its status port (this is the only message sent on the status port)

Field Type Label Description
stats ServerStats optional The current status of the server, optional since status is not available when the server first starts, but a heartbeat is still sent
logdir string required The location where the server writes its log files
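
Because stats is optional, a client should check that it is present before reading it. The following is a minimal sketch that uses plain dataclasses as stand-ins for the generated message classes; the class definitions and the describe_heartbeat helper are illustrative assumptions, not part of the OLIVE client API. With protobuf-generated Python classes, the equivalent presence check would typically be HasField("stats").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerStats:
    # Stand-in carrying two of the documented ServerStats fields.
    cpu_percent: float
    pool_busy: int


@dataclass
class Heartbeat:
    # Stand-in for the documented Heartbeat message.
    logdir: str
    stats: Optional[ServerStats] = None  # absent right after the server starts


def describe_heartbeat(hb: Heartbeat) -> str:
    """Summarize a heartbeat, tolerating a missing stats field."""
    if hb.stats is None:
        return f"server up, stats not yet available (logs: {hb.logdir})"
    return (f"server up, cpu {hb.stats.cpu_percent:.1f}%, "
            f"{hb.stats.pool_busy} busy (logs: {hb.logdir})")
```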

ServerStats

Current status of the OLIVE server, sent as part of a Heartbeat message.

Field Type Label Description
cpu_percent float required The current percentage of CPU used
cpu_average float required The average CPU percentage used since the server was started
mem_percent float required The percentage of memory used
max_mem_percent float required The most memory used since the server was started
swap_percent float required The current swap used
max_swap_percent float required The max (most) swap space used since the server was started
pool_busy uint32 required The number of jobs currently running on the server
pool_pending uint32 required The number of jobs queued on the server
pool_finished uint32 required The number of jobs completed
pool_reinit bool required The number of jobs that need to be re-run
max_num_jobs uint32 optional The max number of concurrent jobs
server_version string optional The current version of the server

LoadPluginDomainRequest

Request a plugin be pre-loaded to optimize later score request(s)

Field Type Label Description
plugin string required The plugin
domain string required The domain

LoadPluginDomainResult

Acknowledgment that a plugin is being loaded

Field Type Label Description
successful bool required True if the plugin is being loaded (receipt of this message does not guarantee that the plugin has finished loading)

RemovePluginDomainRequest

Unload (remove from memory) a previously loaded plugin. Use to free resources on the server or force reloading of a plugin/domain

Field Type Label Description
plugin string required The plugin to remove
domain string optional The domain to remove; if omitted, all domains are removed for this plugin.

RemovePluginDomainResult

Acknowledgment that a plugin/domain has been removed (unloaded)

Field Type Label Description
successful bool required True if the plugin has been removed

PluginDirectoryRequest

Use a PluginDirectoryRequest message to receive the list of plugins available on the server. A Plugin performs tasks such as SAD, LID, SID, or KWS. There may be multiple plugins registered for a given task. A plugin typically has one or more Domains. Plugins contain the code of the recognizer, while Domains correspond to a particular training or adaptation session. Domains therefore represent the data/conditions. A plugin and domain together are necessary to perform scoring.

PluginDirectoryResult

The collection of plugins available on the server, response to PluginDirectoryRequest

Field Type Label Description
plugins Plugin repeated The available plugins

Global Scorer Messages

GlobalScorer is an OLIVE Plugin Trait defining a/the type of scoring the plugin is capable of. For more details about what it means to be a GlobalScorer, refer to the relevant section of the Plugin Traits page.

For an example of the code that would be used to build and submit a GlobalScorer request, refer to the relevant section of the Integrating the (Java) Client API Guide.

GlobalScore

The global score for a class

Field Type Label Description
class_id string required The class
score  float required The score associated with the class
confidence float optional An optional confidence value when part of a calibration report
comment string optional An optional suggested action when part of a calibration report

GlobalScorerRequest

Request global scoring using the specified plugin. The plugin must implement the GlobalScorer trait to handle this request. If this request is successful, one set of scores is returned, since the audio submission is assumed to be mono. If multichannel audio is submitted, the channels are merged to produce one set of global scores, unless a channel is specified in the Audio message, in which case only that channel is used.

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
audio Audio optional The audio to analyze/score. Either audio or vector must be set.
vector AudioVector optional The preprocessed audio vector to analyze/score. Either audio or vector must be set.
option OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored
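
The field rules above can be checked on the client before a request is sent. This is a minimal sketch that models the request as a plain dict keyed by the documented field names; the validate_global_scorer_request helper and the plugin and domain names are hypothetical. It reads "either audio or vector must be set" as "at least one", which is an assumption.

```python
def validate_global_scorer_request(request: dict) -> list:
    """Return a list of problems found in a GlobalScorerRequest-like dict."""
    problems = []
    # plugin and domain are documented as required.
    for field in ("plugin", "domain"):
        if not request.get(field):
            problems.append(f"missing required field: {field}")
    # Assumption: at least one of audio / vector has to be present.
    if request.get("audio") is None and request.get("vector") is None:
        problems.append("either audio or vector must be set")
    return problems
```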

GlobalScorerResult

The result from a GlobalScorerRequest, having zero or more GlobalScore elements

Field Type Label Description
score GlobalScore repeated The class scores
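
Since a result may hold zero or more GlobalScore elements, client code should handle the empty case. A minimal sketch, with each GlobalScore modeled as a dict; the best_class helper is hypothetical, and treating the highest score as the best match is an assumption that may not hold for every plugin.

```python
def best_class(scores: list):
    """Return (class_id, score) of the highest-scoring entry, or None if empty."""
    if not scores:
        return None
    # Assumption: a larger score indicates a stronger match.
    top = max(scores, key=lambda s: s["score"])
    return top["class_id"], top["score"]
```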

Region Scorer Messages

RegionScorer is an OLIVE Plugin Trait defining a/the type of scoring the plugin is capable of. For more details about what it means to be a RegionScorer, refer to the relevant section of the Plugin Traits page.

For an example of the code that would be used to build and submit a RegionScorer request, refer to the relevant section of the Integrating the (Java) Client API Guide.

RegionScore

The basic unit of a region score. There may be multiple RegionScore values in a RegionScorerResult

Field Type Label Description
start_t int32 required Begin-time of the region (in seconds)
end_t int32 required End-time of the region (in seconds)
class_id string required Class ID associated with region
score float optional Optional score associated with the class_id label
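
As an illustration of working with these fields, the sketch below totals the number of seconds attributed to each class_id across a list of regions. Regions are modeled as plain dicts, and the seconds_per_class helper is hypothetical.

```python
from collections import defaultdict


def seconds_per_class(regions: list) -> dict:
    """Sum region durations (end_t - start_t, in seconds) per class_id."""
    totals = defaultdict(int)
    for region in regions:
        totals[region["class_id"]] += region["end_t"] - region["start_t"]
    return dict(totals)
```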

RegionScorerRequest

Request region scoring for the specified plugin/domain. The plugin must implement the RegionScorer trait to handle this request. If this request is successful, one set of scores is returned, since the audio submission is assumed to be mono. If multichannel audio is submitted, the channels are merged to produce one set of region scores, unless a channel is specified in the Audio message, in which case only that channel is used.

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
audio Audio required The audio to analyze/score
option  OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored

RegionScorerResult

The set of region score results, response to RegionScorerRequest

Field Type Label Description
region  RegionScore repeated The scored regions

Frame Scorer Messages

FrameScorer is an OLIVE Plugin Trait defining a/the type of scoring the plugin is capable of. For more details about what it means to be a FrameScorer, refer to the relevant section of the Plugin Traits page.

For an example of the code that would be used to build and submit a FrameScorer request, refer to the relevant section of the Integrating the (Java) Client API Guide.

FrameScorerRequest

Request frame scoring using the specified plugin and audio. The plugin must implement the FrameScorer trait to handle this request. If this request is successful, one set of scores is returned, since the audio submission is assumed to be mono. If multichannel audio is submitted, the channels are merged to produce one set of frame scores, unless a channel is specified in the Audio message, in which case only that channel is used.

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
audio Audio required The audio to analyze/score
option OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored

FrameScorerResult

The results from a FrameScorerRequest

Field Type Label Description
result FrameScores repeated List of frame scores by class_id

FrameScores

The basic unit of a frame score, returned in a FrameScorerResult

Field Type Label Description
class_id string required The class ID to which the frame scores pertain
frame_rate int32 required The number of frames per second
frame_offset double required The offset to the center of the frame 'window'
score  double  repeated The frame-level scores for the class_id
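
Frame scores can be mapped onto a time axis using frame_rate and frame_offset. The sketch below assumes that frame i is centered at frame_offset + i / frame_rate seconds; that formula is an interpretation of the field descriptions above rather than something this page states, and the timed_scores helper is hypothetical.

```python
def timed_scores(frame_rate: int, frame_offset: float, scores: list) -> list:
    """Pair each frame score with an estimated time in seconds.

    Assumption: frame i is centered at frame_offset + i / frame_rate.
    """
    return [(frame_offset + i / frame_rate, s) for i, s in enumerate(scores)]
```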

Text Transformation Messages

TextTransformer is an OLIVE Plugin Trait for scoring using text (instead of audio). For more details about what it means to be a TextTransformer, refer to the relevant section of the Plugin Traits page.

TextTransformationRequest

Request the transformation of a text/string using MT

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
text string optional The string text to analyze/score. Optional as of OLIVE 5.0 since data input(s) can be specified as part of a workflow
option OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored

TextTransformationResult

The response to a TextTransformationRequest

Field Type Label Description
transformation TextTransformation repeated

TextTransformation

The text value returned in a TextTransformationResult

Field Type Label Description
class_id string required A classifier for this result, usually just 'text'
transformed_text string required The text result

Audio Alignment Messages

AudioAlignmentScorer is an OLIVE Plugin Trait for alignment of two or more audio inputs. For more details about what it means to be an AudioAlignmentScorer, refer to the relevant section of the Plugin Traits page.

AudioAlignmentScoreRequest

Request the alignment of two or more audio inputs

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
audios Audio repeated The audio to analyze/score. Optional as of OLIVE 5.0 since Audio can be specified as part of a workflow. If specified there should be two or more audio inputs
option OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored

AudioAlignmentScoreResult

The result of an AudioAlignmentScoreRequest

Field Type Label Description
scores AudioAlignmentScore repeated

AudioAlignmentScore

A score in an AudioAlignmentScoreResult

Field Type Label Description
reference_audio_label string required The source or reference audio name (file 1)
other_audio_label string required The name of the audio input in comparison (file 2)
shift_offset float required Shift offset between the audio in the reference and the other audio input
confidence float required The confidence of this score

Global Comparer Messages

GlobalComparerReport

The visual representation of a global comparison

Field Type Label Description
type ReportType required The type of report (normally a PDF)
report_data bytes required The serialized report

GlobalComparerRequest

Request the comparison of two audio submissions. The plugin must implement the GlobalComparer trait to handle this request

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain
audio_one Audio required One of two audio submissions to analyze/score.
audio_two Audio required One of two audio submissions to analyze/score.
option OptionValue repeated Any options specified
class_id string repeated Optionally specify the classes to be scored // todo remove!

GlobalComparerResult

The result of a GlobalComparerRequest

Field Type Label Description
results Metadata repeated The metadata/scores returned from a global compare analysis
report GlobalComparerReport repeated A comparison report generated by the plugin

ReportType

Possible report formats

Name Number Description
PDF 1
PNG 2
GIF 3
JPEG 4
TIFF 5

Class Modifier Messages

ClassModificationRequest

Request a modification of a class for the specified plugin. The plugin must implement the ClassModifier Trait to handle this request.

Field Type Label Description
plugin string required The plugin
domain string required The domain
class_id string required The ID of the class being enrolled/modified
addition Audio repeated List of Audio, action pairs to apply to the class
removal Audio repeated List of Audio, action pairs to apply to the class
addition_vector AudioVector repeated List of preprocessed audio vectors to apply to the class
finalize bool optional Whether or not to finalize the class. You can send multiple ClassModificationRequests and only finalize on the last request for efficiency. Default: true
option OptionValue repeated Any modification options
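
The finalize field allows an enrollment to be spread over several requests, with only the last one finalizing the class. A minimal sketch of that pattern, with requests modeled as plain dicts; the plan_enrollment_requests helper, the batch size, and the file names are hypothetical.

```python
def plan_enrollment_requests(plugin, domain, class_id, audio_paths, batch_size=2):
    """Split audio into several request dicts, finalizing only the last one."""
    batches = [audio_paths[i:i + batch_size]
               for i in range(0, len(audio_paths), batch_size)]
    requests = []
    for index, batch in enumerate(batches):
        requests.append({
            "plugin": plugin,
            "domain": domain,
            "class_id": class_id,
            "addition": [{"path": path} for path in batch],
            # Only the final request asks the server to finalize the class.
            "finalize": index == len(batches) - 1,
        })
    return requests
```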

ClassModificationResult

Response to ClassModificationRequest.

Field Type Label Description
addition_result AudioResult repeated Provides feedback about the success/failure of individual audio additions
removal_result AudioResult repeated Provides feedback about the success/failure of individual audio removals
vector_addition_result AudioResult repeated Provides feedback about the success/failure of individual audio vector additions

ClassRemovalRequest

Request removal of a class in the specified plugin/domain

Field Type Label Description
plugin string required The plugin
domain string required The domain
class_id string required The id of the class to be removed

ClassRemovalResult

Acknowledgment that a ClassRemovalRequest was received

AudioResult

The feedback/description of class modification for a result in a ClassModificationResult message

Field Type Label Description
successful bool required Whether or not the individual audio succeeded
message string optional Description of what occurred on this audio

Audio Converter Messages

AudioModification

The contents of an AudioModificationResult.

Field Type Label Description
audio AudioBuffer required The transformed audio
message string required Description of how this audio was transformed, or an error description. Not sure if needed?
scores Metadata repeated Zero or more scores (metadata) about the modified audio. Metadata is a list of name/value pairs.

AudioModificationRequest

Request enhancement (modification) of the submitted audio. The plugin must support the AudioConverter trait to support this request

Field Type Label Description
plugin string required The plugin
domain string required The domain
requested_channels uint32 required Convert audio to have this number of channels
requested_rate uint32 required Convert audio to this sample rate
modifications Audio repeated List of Audio, action pairs to apply to the class - may have to limit to one audio submission per request, not sure how to handle multiple results
option OptionValue repeated Any options specified

AudioModificationResult

The result of an AudioModificationRequest

Field Type Label Description
successful bool required Whether or not the individual audio modification succeeded
modification_result AudioModification repeated Provides feedback about the success/failure of individual audio additions.

Audio Vectorizer Messages

AudioVector

Represents audio preprocessed by a plugin/domain.

Field Type Label Description
plugin string required The origin plugin
domain string required The origin domain
data bytes required The audio vector data, varies by plugin
params Metadata repeated Name/value parameter data generated by the plugin and needed for later processing

PluginAudioVectorRequest

Request one or more audio submissions be vectorized (preprocessed) by the specified plugin. The resulting vectorized audio can only be processed by the same plugin. A plugin must support the AudioVectorizer Trait to support this request.

Field Type Label Description
plugin string required The plugin
domain string required The domain
addition Audio repeated List of Audio to process

PluginAudioVectorResult

The result of a PluginAudioVectorRequest, containing a set of VectorResults

Field Type Label Description
vector_result VectorResult repeated The results of processing the submitted audio. One result per audio addition.

VectorResult

The status of the vector request, and if successful includes an AudioVector

Field Type Label Description
successful bool required Whether or not the audio was successfully processed
message string optional Description of what occurred to cause an error
audio_vector AudioVector  optional If successful, the vectorized audio

ClassExportRequest

Exports an existing class enrollment (i.e. speaker enrollment) from the server for the specified class_id. The plugin must support the ClassExporter trait to handle this request.

Field Type Label Description
plugin string required The plugin
domain string required The domain
class_id string required The ID of the class model to export

ClassExportResult

The result of an enrollment export

Field Type Label Description
successful bool required Whether or not the individual audio succeeded
message string optional Description of what occurred to cause an error
enrollment EnrollmentModel optional If successful, then this is the exported model for the specified class.

ClassImportRequest

Used to import an enrollment model (exported via a ClassExportRequest). Only plugins that support the ClassExporter trait can handle this request. Only import an enrollment into the same plugin AND domain as previously exported.

Field Type Label Description
plugin string required The plugin
domain string required The domain
class_id string optional Import the model using this class name, instead of the original name
enrollment EnrollmentModel required the enrollment to import
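
The rule that an enrollment should only be imported into the plugin and domain it was exported from can be enforced on the client side. A minimal sketch, with messages modeled as plain dicts; the build_class_import helper and all plugin, domain, and class names are hypothetical.

```python
def build_class_import(plugin, domain, enrollment, class_id=None):
    """Build a ClassImportRequest-like dict, refusing mismatched origins."""
    origin = (enrollment["plugin"], enrollment["domain"])
    if origin != (plugin, domain):
        raise ValueError(
            f"enrollment was exported from {origin}, cannot import into {(plugin, domain)}")
    request = {"plugin": plugin, "domain": domain, "enrollment": enrollment}
    if class_id is not None:
        # Optional: import under a different class name than the original.
        request["class_id"] = class_id
    return request
```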

ClassImportResult

The status of a ClassImportRequest.

Field Type Label Description
successful bool required Whether or not the import succeeded
message string optional Description of what occurred to cause an error

EnrollmentModel

An enrollment model for a specific plugin and domain. This is used to save a current enrollment or restore a class enrollment via a ClassImportRequest. This model is not used as an AudioVector in scoring requests.

Field Type Label Description
plugin string required The origin plugin
domain string required The origin domain
class_id string required the class_id of the enrollment
data bytes required The enrollment model data
params Metadata repeated Name/value parameter data generated by the plugin and needed for later processing

Updater Messages

ApplyUpdateRequest

Used to request an update of a Plugin that supports the Update trait. Use GetUpdateStatusRequest to check if a plugin is ready for an update, otherwise this request is ignored by the server/plugin

Field Type Label Description
plugin string required The plugin to apply the update
domain string required The domain
params Metadata repeated Name/value options, plugin dependent

ApplyUpdateResult

This message is returned immediately after an ApplyUpdateRequest, as the update process can take an extended time to complete.

Field Type Label Description
successful bool required True if the plugin is being updated

GetUpdateStatusRequest

Used to request the status for a Plugin that supports the Update trait

Field Type Label Description
plugin string required The plugin
domain string required The domain

GetUpdateStatusResult

The result of a GetUpdateStatusRequest message.

Field Type Label Description
update_ready bool required True if the plugin has determined it is ready for an update
last_update DateTime optional The date of the last update, if any
params Metadata repeated Zero or more Metadata values describing the update status of the plugin. Metadata is a list of name/value pairs.

DateTime

Date and time info

Field Type Label Description
year uint32 required Year
month uint32 required Month
day uint32 required Day
hour uint32 required Hour
min uint32 required Minute
sec uint32 required Seconds
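
A DateTime message maps directly onto a native date type. The sketch below converts a dict with the documented field names into a Python datetime; the to_datetime helper is hypothetical. This page does not state a time zone, so the result is a naive datetime.

```python
from datetime import datetime


def to_datetime(dt: dict) -> datetime:
    """Convert a DateTime-like dict into a naive datetime (no time zone)."""
    return datetime(dt["year"], dt["month"], dt["day"],
                    dt["hour"], dt["min"], dt["sec"])
```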

Learning Trait Messages

These messages are used by plugins that support adaptation and/or training.

PreprocessAudioAdaptRequest

Request preprocessing of this audio submission, which may be part of an adaptation set. Adaptation can be unsupervised (none of class_id, start_t, and end_t set) or supervised (either class_id alone, or class_id together with start_t and end_t, set). Adaptation should be finalized by sending either a SupervisedAdaptationRequest or an UnsupervisedAdaptationRequest. Plugins must support either the SupervisedAdapter or UnsupervisedAdapter trait to handle this request.

Field Type Label Description
plugin string required The plugin
domain string required The domain
audio Audio required The submitted audio
adapt_space string required A unique name for the client where pre-processed data is stored
class_id string optional The id of the class annotation being preprocessed (supervised training)
start_t uint32 optional Begin-time of the region (in seconds)
end_t uint32 optional End-time of the region (in seconds)

PreprocessAudioAdaptResult

The result of a PreprocessAudioAdaptRequest

Field Type Label Description
audio_id string required The ID of the preprocessed audio
duration double required The duration of the audio

Supervised Adapter Messages

SupervisedAdaptationRequest

Finalize adaptation of the specified plugin/domain using audio preprocessed using calls to PreprocessAudioAdaptRequest. The plugin must implement the SupervisedAdapter trait to handle this request.

When you adapt or train, you are creating a new domain for the target plugin, that is based on the domain passed in to the 'domain' field of this call. This new domain is specific to a plugin, so it is created within the plugin, and will be named with the string passed to SupervisedAdaptationRequest as 'new_domain'.

To actually use this new domain, future scoring or enrollment requests must specify this new domain name, instead of the original, using the value specified during adaptation as the 'new_domain' field. To have access to this new domain, either restart the server, or send a RemovePluginDomainRequest message to the server, which will force a reload of that plugin.

Upon successful completion of SupervisedAdaptationRequest, a SupervisedAdaptationResult message should be received, containing the path to the newly created domain on the server's file system.

The file sizes of the actual models will not change as a result of adaptation. Rather, the values stored inside these files will.

Field Type Label Description
plugin string required The plugin to invoke
domain string required The domain to adapt
new_domain string required the new domain name
class_annotations ClassAnnotation repeated The annotations to use for adaptation, audio annotations are created via PreprocessAudioAdaptRequest calls
adapt_space string required A unique name for the client where pre-processed data is stored

SupervisedAdaptationResult

Acknowledgment message that adaptation successfully completed. Informs the client of the full path of the new domain created by the SupervisedAdaptationRequest.

Field Type Label Description
new_domain  string required Confirmation of the new domain name

ClassAnnotation

Set of annotations for a class

Field Type Label Description
class_id string required The class ID (such as speaker name or language name)
annotations AudioAnnotation repeated the set of all audio annotations for this class.

AudioAnnotation

A set of audio annotations for a specific audio submission

Field Type Label Description
audio_id string required The audio ID returned in a PreprocessAudioAdaptResult or PreprocessAudioTrainResult message
regions AnnotationRegion repeated Set of annotations
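
The nesting of ClassAnnotation, AudioAnnotation, and AnnotationRegion can be illustrated by grouping flat annotation records. This is a minimal sketch using plain dicts keyed by the documented field names; the build_class_annotations helper and the input tuple layout are hypothetical, and the audio_id values are assumed to come from earlier PreprocessAudioAdaptResult messages.

```python
def build_class_annotations(records):
    """Group (class_id, audio_id, start_t, end_t) tuples into nested annotations."""
    by_class = {}
    for class_id, audio_id, start_t, end_t in records:
        # One region list per audio submission, one audio map per class.
        regions = by_class.setdefault(class_id, {}).setdefault(audio_id, [])
        regions.append({"start_t": start_t, "end_t": end_t})
    return [
        {
            "class_id": class_id,
            "annotations": [
                {"audio_id": audio_id, "regions": regions}
                for audio_id, regions in audios.items()
            ],
        }
        for class_id, audios in by_class.items()
    ]
```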

Basic Types

Trait

A Trait implemented by a plugin

Field Type Label Description
type TraitType  optional The trait type
options OptionDefinition repeated Any options specific to this plugin's implementation of the trait

Plugin

The description of a plugin

Field Type Label Description
id string optional The id of the plugin
task string optional e.g. LID, SID, SAD, KWS, AED, etc.
label string optional Display label for plugin
desc string optional A brief description of how the plugin works/technologies it employs.
vendor string optional A brief description of how the plugin works/technologies it employs.
domain Domain repeated The domains owned by this plugin
trait Trait repeated The traits (capabilities) of this plugin

Domain

A description of a domain.

Field Type Label Description
id string optional  The ID of the domain
label string optional Display label for the domain
desc string optional A brief description of the domain conditions
class_id string  repeated The list of classes known to this domain
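
Because scoring needs both a plugin and a domain, a client will often walk the plugin list and collect the plugin/domain pairs that support a given trait. A minimal sketch over dicts shaped like the Plugin, Domain, and Trait messages above; the scoring_targets helper and the plugin and domain IDs are hypothetical.

```python
def scoring_targets(plugins: list, trait_type: str) -> list:
    """List (plugin id, domain id) pairs for plugins implementing trait_type."""
    return [
        (plugin["id"], domain["id"])
        for plugin in plugins
        if any(t.get("type") == trait_type for t in plugin.get("trait", []))
        for domain in plugin.get("domain", [])
    ]
```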

Envelope

Every message passed between the server and client is an instance of Envelope. An Envelope can contain multiple ScenicMessage instances, so it's important to iterate through them all when you receive an Envelope.

Field Type Label Description
message ScenicMessage repeated The messages to be sent
sender_id string required string description of the message sender
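
Since one Envelope may carry several ScenicMessage instances, a receiver should loop over all of them rather than reading only the first. A minimal sketch with the Envelope modeled as a plain dict; the collect_replies helper is hypothetical.

```python
def collect_replies(envelope: dict) -> dict:
    """Map message_id -> ScenicMessage for every message in an Envelope."""
    replies = {}
    for message in envelope.get("message", []):
        # message_id ties each reply back to the request that caused it.
        replies[message["message_id"]] = message
    return replies
```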

ScenicMessage

A ScenicMessage represents a single logical message between a client and server. It is placed within an Envelope. It contains nested messages in serialized form. The message_type field is used to determine the type of the nested data. Not all ScenicMessage instances will have message_data, and some may have multiple records, but they will all be of the same type, as determined by the value of message_type.

Field Type Label Description
message_id string required id issued by client (and unique to that client) used to track a request. Any reply for that request will have the same id.
message_type MessageType  required type of message
message_data bytes repeated nested message data that can be deserialized according to message_type. Some messages do not have nested data, some have multiple records
error string optional error message; if present an error has occurred on the server
info string optional informational message, typically used to explain why message_data is empty but no error is reported
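
The fields above suggest a simple handling order: check error first, then decode each message_data record according to message_type. A minimal sketch with the ScenicMessage modeled as a dict; the unpack_scenic_message helper and the decoders mapping are hypothetical. With protobuf-generated classes, each decoder would typically parse the raw bytes into the matching result message.

```python
def unpack_scenic_message(message: dict, decoders: dict) -> list:
    """Decode every message_data record, raising if the server reported an error."""
    if message.get("error"):
        raise RuntimeError(
            f"server error for message {message['message_id']}: {message['error']}")
    # All records in one ScenicMessage share the type named by message_type.
    decode = decoders[message["message_type"]]
    return [decode(raw) for raw in message.get("message_data", [])]
```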

MessageType

The MessageType enum provides a value for each top-level SCENIC message. It is used within a ScenicMessage to indicate the type of the serialized message contained therein.

Name Number Description
PLUGIN_DIRECTORY_REQUEST 1
PLUGIN_DIRECTORY_RESULT 2
GLOBAL_SCORER_REQUEST 3
GLOBAL_SCORER_RESULT 4
REGION_SCORER_REQUEST 5
REGION_SCORER_RESULT 6
FRAME_SCORER_REQUEST 7
FRAME_SCORER_RESULT 8
CLASS_MODIFICATION_REQUEST 9
CLASS_MODIFICATION_RESULT 10
CLASS_REMOVAL_REQUEST 11
CLASS_REMOVAL_RESULT 12
GET_ACTIVE_REQUEST 13
GET_ACTIVE_RESULT 14
LOAD_PLUGIN_REQUEST 15
LOAD_PLUGIN_RESULT 16
GET_STATUS_REQUEST 17
GET_STATUS_RESULT 18
HEARTBEAT 19
PREPROCESS_AUDIO_TRAIN_REQUEST 20
PREPROCESS_AUDIO_TRAIN_RESULT 21
PREPROCESS_AUDIO_ADAPT_REQUEST 22
PREPROCESS_AUDIO_ADAPT_RESULT 23
SUPERVISED_TRAINING_REQUEST 24
SUPERVISED_TRAINING_RESULT 25
SUPERVISED_ADAPTATION_REQUEST 26
SUPERVISED_ADAPTATION_RESULT 27
UNSUPERVISED_ADAPTATION_REQUEST 28
UNSUPERVISED_ADAPTATION_RESULT 29
CLASS_ANNOTATION 30
AUDIO_ANNOTATION 31
ANNOTATION_REGION 32
REMOVE_PLUGIN_REQUEST 33
REMOVE_PLUGIN_RESULT 34
AUDIO_MODIFICATION_REQUEST 35
AUDIO_MODIFICATION_RESULT 36
PLUGIN_AUDIO_VECTOR_REQUEST 37
PLUGIN_AUDIO_VECTOR_RESULT 38
CLASS_EXPORT_REQUEST 39
CLASS_EXPORT_RESULT 40
CLASS_IMPORT_REQUEST 41
CLASS_IMPORT_RESULT 42
APPLY_UPDATE_REQUEST 43
APPLY_UPDATE_RESULT 44
GET_UPDATE_STATUS_REQUEST 45
GET_UPDATE_STATUS_RESULT 46
GLOBAL_COMPARER_REQUEST 47
GLOBAL_COMPARER_RESULT 48

AUDIO_ALIGN_REQUEST 68
AUDIO_ALIGN_RESULT 69
TEXT_TRANSFORM_REQUEST 70
TEXT_TRANSFORM_RESULT 71
PREPROCESSED_AUDIO_RESULT 72
INVALID_MESSAGE 73

OptionType

Classifies how an OptionDefinition (TraitOption) should be represented in a UI widget

Name Number Description
BOOLEAN 
CHOICE 

TraitType

The list of possible traits that a plugin can implement

Name Number Description
GLOBAL_SCORER 1
REGION_SCORER 2
FRAME_SCORER 3
CLASS_ENROLLER 4
CLASS_MODIFIER 5
SUPERVISED_TRAINER 6
SUPERVISED_ADAPTER 7
UNSUPERVISED_ADAPTER 8
AUDIO_CONVERTER 9
AUDIO_VECTORIZER 10
CLASS_EXPORTER 11
UPDATER 12
LEARNING_TRAIT 13
GLOBAL_COMPARER 14
TEXT_TRANSFORMER 15
AUDIO_ALIGNMENT_SCORER 16

TaskType

Name Number Description
SAD 1 Speech Activity Detection
SID 2 Speaker ID
SDD 3 Speaker ID, but output in regions
LID 4 Language ID
LDD 5 Language ID, but output in regions
KWS 6 Keyword Spotting
TPD 7 Topic Detection
VTD 8 Voice Type Discrimination
GID 9 Gender ID
GDD 10 Gender ID, but output in regions
ASR 11 Automatic Speech Recognition
ENH 12 Audio Enhancement
CMP 13 Audio Comparison
SDD 14 Speaker Detection
DIA 15 Diarization
QBE 16 Query by Example
SHL 17 Speaker Highlighting
FID 18 Face ID
TMT 19 Text Machine Translation
ALN 20 Audio Alignment

OptionDefinition

A plugin TraitOption, describing how a plugin trait is used

Field Type Label Description
name string required The name/ID of the option
label string required Display label for the option
desc string optional A description of the option
type OptionType required The type of the option (boolean, choice/drop-down, etc)
choice string repeated Optional list of choices used by CHOICE type options
default string optional  The default option in the list of Options

OptionValue

A name/value property pair

Field Type Label Description
name string required The name/ID of the option
value string required The option value as a string
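
An OptionValue can be checked against the OptionDefinition it targets before it is attached to a request. A minimal sketch for CHOICE options, with both messages modeled as dicts; the make_option_value helper and the option names are hypothetical. How BOOLEAN values should be written as strings is not stated on this page, so the sketch does not attempt it.

```python
def make_option_value(definition: dict, value) -> dict:
    """Build an OptionValue-like dict, rejecting values outside a CHOICE list."""
    text = str(value)  # OptionValue carries its value as a string
    if definition["type"] == "CHOICE" and text not in definition.get("choice", []):
        raise ValueError(
            f"{text!r} is not a valid choice for option {definition['name']!r}")
    return {"name": definition["name"], "value": text}
```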

InputDataType

Workflow(?) Input Data Types

Name Number Description
AUDIO 1
VIDEO 2
TEXT 3
IMAGE 4

InputType

Name Number Description
FRAME 1
REGION 2

JobClass

Field Type Label Description
job_name string required The parent job name in a Workflow JobDefinition
task TaskClass repeated

TaskClass

Field Type Label Description
task_name string required The ID from the associated WorkflowTask (consumer_result_label)
class_id string repeated Zero or more class IDs available to this task. Some tasks do not support classes
class_label string optional An optional label/name to describe the classes used by this task, such as 'speaker' or 'language'
classes_label string optional The label used when referring to plural classes, such as speakers or languages

Shared Types

Audio

Represents an audio object. It can either refer to a local file or embed an audio buffer directly. The path and audioSamples fields should be treated as mutually exclusive, with one and only one of these fields set.

For more information regarding what types of audio are currently supported by the OLIVE server, see the Supported Audio Formats page.

Field Type Label Description
path string optional Path to the audio file represented by this record (if not specified then audio is input as a buffer)
audioSamples AudioBuffer optional Audio included as a buffer (if not specified, then path must be set)
selected_channel uint32 optional Optional - if using multi-channel audio and 'mode' is SELECTED, then this channel is provided to the plugin(s).
regions AnnotationRegion repeated Optional annotated regions for this audio
label string optional Optional - label used to identify this audio input
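
The "one and only one" rule for path and audioSamples can be checked on the client before a request is sent. This is a sketch under stated assumptions: `Audio` here is a simplified dataclass stand-in (with the AudioBuffer reduced to raw bytes), not the generated protobuf class, and `validate_audio` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Audio:
    """Stand-in for the Audio message (not the generated protobuf class)."""
    path: Optional[str] = None
    audioSamples: Optional[bytes] = None  # simplified: AudioBuffer reduced to raw bytes
    label: Optional[str] = None


def validate_audio(audio: Audio) -> None:
    """Raise ValueError unless exactly one of path / audioSamples is set."""
    has_path = audio.path is not None
    has_buffer = audio.audioSamples is not None
    if has_path == has_buffer:
        raise ValueError("set exactly one of 'path' or 'audioSamples'")


# Hypothetical path, for illustration only; this call passes silently.
validate_audio(Audio(path="/data/sample.wav"))
```

Setting both fields, or neither, raises a ValueError in this sketch.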

AnnotationRegion

A single pair of timestamps (start and end) that make up an annotated region. Timestamps are in seconds.

Field Type Label Description
start_t uint32 required Begin-time of the region (in seconds)
end_t uint32 required End-time of the region (in seconds)

AudioBuffer

Audio is contained in a buffer (and the path field is NOT set in Audio). By default the audio in the buffer should be PCM encoded, unless the buffer contains a serialized file (unencoded), in which case serialized_file must be set to true. If the data has been decoded and is not PCM encoded data, then the encoding field must be specified.

For more information regarding what types of audio are currently supported by the OLIVE server, see the Supported Audio Formats page.

Field Type Label Description
channels uint32 optional The number of channels contained in data, ignored for serialized buffers
samples uint32 optional The number of samples (in each channel), ignored for serialized buffers
rate uint32 optional The sample rate, ignored for serialized buffers
bit_depth AudioBitDepth optional The number of bits in each sample, ignored for serialized buffers
data bytes required Should be channels * samples long, striped by channels
serialized_file bool optional Optional - true if data contains a serialized buffer
encoding AudioEncodingType optional Optional - not yet supported - the audio encoding type. Assumed to be PCM if not specified. Ignored for serialized buffers
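
To illustrate the layout of the data field, here is a minimal sketch that packs two channels of 16-bit PCM into a single buffer. It assumes that "striped by channels" means sample-interleaved (one sample from each channel per frame) and that samples are little-endian; neither detail is confirmed on this page. `interleave_pcm16` is a hypothetical helper and the sample rate is arbitrary.

```python
import struct
from typing import List


def interleave_pcm16(channels: List[List[int]]) -> bytes:
    """Pack per-channel 16-bit samples into one interleaved little-endian buffer."""
    num_samples = len(channels[0])
    if any(len(ch) != num_samples for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    # zip(*channels) yields one frame at a time: (ch0[i], ch1[i], ...)
    frame_values = [sample for frame in zip(*channels) for sample in frame]
    return struct.pack("<%dh" % len(frame_values), *frame_values)


left = [1, 2, 3]
right = [-1, -2, -3]
data = interleave_pcm16([left, right])

# Values that would go into the corresponding AudioBuffer fields.
buffer_fields = {
    "channels": 2,
    "samples": len(left),   # samples per channel
    "rate": 8000,           # hypothetical sample rate
    "data": data,
}
print(len(data))  # 2 channels * 3 samples * 2 bytes per sample = 12
```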

AudioEncodingType

Audio encoding types

Name Number Description
PCMU8 1
PCMS8 2
PCM16 3
PCM24 4
PCM32 5
FLOAT32 6
FLOAT64 7
ULAW 8
ALAW 9
IMA_ADPCM 10
MS_ADPCM 11
GSM610 12
G723_24 13
G721_32 14
DWW12 15
DWW16 16
DWW24 17
VORBIS 18
VOX_ADPCM 19
DPCM16 20
DPCM8 21

AudioBitDepth

Number of bits in each audio sample

Name Number Description
BIT_DEPTH_8 1
BIT_DEPTH_16 2
BIT_DEPTH_24 3
BIT_DEPTH_32 4

Metadata

The parent container for Metadata so that typed name/value properties can be transported in a generalized way

Field Type Label Description
type MetadataType required Indicates the type of this metadata, so it can be deserialized to the appropriate type
name string required The name (key) for this metadata
value bytes required The serialized value; the client must deserialize it into the message that corresponds to type (for example, StringMetadata for STRING_META)

MetadataType

Data types supported in an AudioModificationResult's metadata:

Name Number Description
STRING_META 1
INTEGER_META 2
DOUBLE_META 3
BOOLEAN_META 4
LIST_META 5

BooleanMetadata

Value as boolean

Field Type Label Description
value bool required

DoubleMetadata

Value as a double

Field Type Label Description
value double required

IntegerMetadata

Value as an integer

Field Type Label Description
value int32 required 

ListMetadata

Value as list of Metadata values

Field Type Label Description
type MetadataType repeated The type for the corresponding element
value bytes repeated One or more serialized values. Each element must be deserialized by the client into the message that corresponds to the entry at the same position in type. For example, for the type STRING_META, deserialize the data as StringMetadata
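
The type-driven deserialization described for Metadata and ListMetadata amounts to a lookup from MetadataType to a message. A minimal sketch follows. Only the STRING_META to StringMetadata pairing is stated on this page; the other pairings are assumed from the message names. `message_name` and `pair_list_elements` are hypothetical helpers, and the byte strings are placeholders rather than real serialized messages.

```python
from typing import Dict, List, Tuple

# MetadataType numbers from the MetadataType table.
STRING_META, INTEGER_META, DOUBLE_META, BOOLEAN_META, LIST_META = 1, 2, 3, 4, 5

# Assumption: each type maps to the like-named message on this page.
MESSAGE_FOR_TYPE: Dict[int, str] = {
    STRING_META: "StringMetadata",
    INTEGER_META: "IntegerMetadata",
    DOUBLE_META: "DoubleMetadata",
    BOOLEAN_META: "BooleanMetadata",
    LIST_META: "ListMetadata",
}


def message_name(meta_type: int) -> str:
    """Return the message a Metadata value should be deserialized into."""
    try:
        return MESSAGE_FOR_TYPE[meta_type]
    except KeyError:
        raise ValueError("unknown MetadataType: %r" % meta_type)


def pair_list_elements(types: List[int], values: List[bytes]) -> List[Tuple[str, bytes]]:
    """Pair each ListMetadata value with the message named by its type."""
    if len(types) != len(values):
        raise ValueError("type and value must have the same length")
    return [(message_name(t), v) for t, v in zip(types, values)]


# Placeholder bytes; real values would be serialized protobuf messages.
pairs = pair_list_elements([STRING_META, DOUBLE_META], [b"<bytes-1>", b"<bytes-2>"])
print(pairs)
```

In a real client, each paired byte string would then be parsed with the generated class for the named message.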

StringMetadata

Value as a string

Field Type Label Description
value string required

Scalar Value Types

.proto Type Notes C++ Type Java Type Python Type
double double double float
float float float float
int32 Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. int32 int int
int64 Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. int64 long int/long
uint32 Uses variable-length encoding. uint32 int int/long
uint64 Uses variable-length encoding. uint64 long int/long
sint32 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. int32 int int
sint64 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. int64 long int/long
fixed32 Always four bytes. More efficient than uint32 if values are often greater than 2^28. uint32 int int
fixed64 Always eight bytes. More efficient than uint64 if values are often greater than 2^56. uint64 long int/long
sfixed32 Always four bytes. int32 int int
sfixed64 Always eight bytes. int64 long int/long
bool bool boolean bool
string A string must always contain UTF-8 encoded or 7-bit ASCII text. string String str/unicode
bytes May contain any arbitrary sequence of bytes. string ByteString str
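
The notes on int32 and sint32 can be made concrete by encoding a few values by hand. The sketch below implements the standard protobuf base-128 varint and ZigZag encodings in plain Python; the function names are illustrative and it does not use the protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: negative values are sign-extended to 64 bits before encoding."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def encode_sint32(value: int) -> bytes:
    """sint32: ZigZag-map the value so small magnitudes stay small."""
    zigzag = ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF
    return encode_varint(zigzag)


# Print the value, its int32 size in bytes, and its sint32 size in bytes.
for n in (1, -1, -300):
    print(n, len(encode_int32(n)), len(encode_sint32(n)))
# 1 1 1
# -1 10 1
# -300 10 2
```

Any negative int32 takes ten bytes on the wire, while the same value as sint32 takes one or two bytes here, which is why sint32 is recommended for fields that are often negative.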