Integrating with the OLIVE Java API (JOlive)
The OLIVE Server allows direct client integration via Protobuf messages from a variety of languages (such as Java, Python, C++, and C#) and operating systems (such as Windows, Linux, and macOS). A native Java API is also provided as a convenience. This page includes guidance for integrating a client application using the OLIVE Java API.
The sections below cover establishing a connection to an OLIVE Server, requesting available plugins, various ways to build and submit scoring and enrollment requests, and model adaptation. All instructions and code examples presented in this section are for the OLIVE Java API.
Fundamentally, all OLIVE API implementations are based on exchanging Protobuf messages between the client and OLIVE Server. These messages are defined in the API Message Reference Documentation. It's recommended that new integrators review the Enterprise API Primer for an overview of key concepts that also apply to the OLIVE Java API.
Distribution
The OLIVE Java API is distributed as a JAR file. It is located in the OLIVE delivery at the path api/java/repo/sri/speech/olive/api/olive-api/<version>/olive-api-<version>.jar.
Dependencies
The OLIVE Java API dependencies include:
com.google.protobuf:protobuf-java:3.8.0
com.google.protobuf:protobuf-java-util:3.8.0
com.googlecode.json-simple:json-simple:1.1.1
org.json:json:20220320
org.zeromq:jeromq:0.5.2
org.slf4j:slf4j-api:1.7.30
ch.qos.logback:logback-core:1.2.3
ch.qos.logback:logback-classic:1.2.3
commons-lang:commons-lang:2.6
commons-io:commons-io:2.4
commons-cli:commons-cli:1.4
All dependencies are bundled with the OLIVE delivery in the directory api/java/repo.
Integration
Quickstart
Here's a complete example for those in a hurry:
import java.util.ArrayList;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.utils.ClientUtils;
import com.sri.speech.olive.api.utils.ClientUtils.AudioTransferType;
import com.sri.speech.olive.api.utils.Pair;
public class Quickstart {
private static Logger log = LoggerFactory.getLogger(Quickstart.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
// Create a callback to handle LID results from the server
private static final Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> rc = new Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult>() {
@Override
public void call(Server.Result<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> r) {
if (!r.hasError()) {
Olive.GlobalScore topScore = null;
for (Olive.GlobalScore gs : r.getRep().getScoreList()) {
if (topScore == null) {
topScore = gs;
} else {
if (gs.getScore() > topScore.getScore()) {
topScore = gs;
}
}
}
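// guard against an empty score list, which would leave topScore null
if (topScore != null) {
log.info("Top scoring: {} = {}", topScore.getClassId(), topScore.getScore());
} else {
log.info("No scores were returned");
}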
} else {
log.error("Global scorer error: {}", r.getError());
}
System.exit(0);
}
};
public static void main(String[] args) throws Exception {
// audio file name is passed as an argument
String audioFileName = args[0];
// Setup the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT);
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// exit if cannot connect
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
System.exit(1);
}
// find a Language ID (LID) plugin
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
Pair<Olive.Plugin, Olive.Domain> pd = null;
for (Pair<Olive.Plugin, Olive.Domain> pair : pluginList) {
if ("LID".equalsIgnoreCase(pair.getFirst().getTask())) {
pd = pair;
break;
}
}
// run Language ID (LID) on serialized audio file
if (pd != null) {
ClientUtils.requestGlobalScore(server, pd, Olive.TraitType.GLOBAL_SCORER, audioFileName, 1, rc, true,
AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(), new ArrayList<>(), new ArrayList<>());
} else {
log.error("Could not retrieve a LID plugin!");
System.exit(1);
}
}
}
The code above accepts a filepath command line argument, connects to an OLIVE Server, programmatically locates a Language Identification (LID) plugin, submits a serialized audio file for LID analysis, and finally prints out the top scoring language.
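The top-score selection in the callback can be isolated into a small helper that also handles an empty score list. The sketch below is illustrative only: ScoreEntry is a hypothetical stand-in for Olive.GlobalScore so that the code compiles without the OLIVE JAR, and the class and method names are not part of the OLIVE Java API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class TopScoreSketch {
    // Hypothetical stand-in for Olive.GlobalScore (class ID plus score only).
    static final class ScoreEntry {
        final String classId;
        final float score;

        ScoreEntry(String classId, float score) {
            this.classId = classId;
            this.score = score;
        }
    }

    // Returns the highest-scoring entry, or an empty Optional when the list has no scores.
    static Optional<ScoreEntry> topScore(List<ScoreEntry> scores) {
        ScoreEntry best = null;
        for (ScoreEntry s : scores) {
            if (best == null || s.score > best.score) {
                best = s;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<ScoreEntry> scores = Arrays.asList(
                new ScoreEntry("eng", 2.5f),
                new ScoreEntry("spa", -1.0f),
                new ScoreEntry("fre", 0.3f));
        Optional<ScoreEntry> top = topScore(scores);
        if (top.isPresent()) {
            System.out.println("Top scoring: " + top.get().classId + " = " + top.get().score);
        } else {
            System.out.println("No scores returned");
        }
    }
}
```

Returning an Optional makes the "no scores" case explicit at the call site instead of relying on a null check inside the logging statement.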
Establish Server Connection
Before making any request, a client must establish a connection with the server. By default, the OLIVE server listens on ports 5588 (request port) and 5589 (status port) for client connection and status requests. These ports are configurable, but if the server has not been instructed to change its listening ports, the code below should establish a connection.
A connection to the server can be established with a call to com.sri.speech.olive.api.Server#connect(), as shown below:
Server server = new Server();
server.connect(
    "exampleClient", // client-id
    "localhost",     // address of server
    5588,            // request-port
    5589,            // status-port
    10000            // timeout for failed connection request
);
The request port (5588 by default) is used for request and response messages (Protobuf messages). Each request message sent to this port is guaranteed a response from the server. Messages in the API Message Reference are often suffixed with 'Request' or 'Result' to denote whether they are request or result messages. There is no need to poll the server for information on a submitted request, as the result/response for a request is returned to the client as soon as it is available.
The status port (5589 by default) is used by the Server to publish health and status messages (Heartbeat) to client(s). Clients cannot send requests on this port.
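The Quickstart waits for the connection by synchronizing on the object returned by server.getConnected() and re-checking it until a timeout expires. The sketch below shows that wait pattern in isolation, using only the JDK. It assumes the connection flag behaves like a java.util.concurrent.atomic.AtomicBoolean that is notified when the connection is established. This is an inference from the Quickstart code, not a documented guarantee, and waitForFlag is a hypothetical helper name.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ConnectionWaitSketch {
    // Blocks until the flag becomes true or timeoutMillis elapses; returns the final flag value.
    static boolean waitForFlag(AtomicBoolean flag, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (flag) {
            while (!flag.get()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // timed out
                }
                try {
                    flag.wait(remaining); // woken by notifyAll() or by the timeout
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                    break;
                }
            }
        }
        return flag.get();
    }

    public static void main(String[] args) {
        AtomicBoolean connected = new AtomicBoolean(false);
        // Simulates a background thread marking the connection as established.
        Thread t = new Thread(() -> {
            synchronized (connected) {
                connected.set(true);
                connected.notifyAll();
            }
        });
        t.start();
        System.out.println("connected = " + waitForFlag(connected, 2000));
    }
}
```

Recomputing the remaining time on each pass keeps the total wait bounded by the timeout even if the thread is woken early.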
Request Available Plugins
In order to submit most server requests, the client must specify the plugin and domain pair to use for the request (i.e. Pair<Olive.Plugin, Olive.Domain>). The function requestPlugins() provided by the ClientUtils class can be used to get all the plugin/domain pairs available:
// ask the server for a list of currently available plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
The pairs can be iterated through to look for specific criteria of interest.
// iterate through and find a plugin and domain to use for SAD
Pair<Olive.Plugin, Olive.Domain> pd = null;
for (Pair<Olive.Plugin, Olive.Domain> pair : pluginList) {
    if ("SAD".equals(pair.getFirst().getTask())) {
        pd = pair;
        break;
    }
}
if (pd != null) {
    log.info("{}/{} supports SAD!", pd.getFirst().getId(), pd.getSecond().getLabel());
} else {
    log.info("No SAD plugin found!");
}
Alternatively, if the client already knows the desired plugin ID and domain name, the pair reference can be obtained using the findPluginDomain() function provided by the ClientUtils class.
String pluginName = "sad-dnn-v8.0.0";
String domainName = "multi-v1";
// Look up a specific plugin by ID and domain
pd = ClientUtils.findPluginDomain(pluginName, domainName, pluginList);
if (pd != null) {
    log.info("{}/{} was found!", pd.getFirst().getId(), pd.getSecond().getLabel());
} else {
    log.info("{}/{} was NOT found!", pluginName, domainName);
}
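Both lookups above amount to a filter over the list of plugin/domain pairs. The sketch below shows the same two searches with Java streams. PluginDomain is a hypothetical stand-in for Pair<Olive.Plugin, Olive.Domain> that carries only the fields the searches need, and the second plugin ID in main is a made-up value for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class PluginLookupSketch {
    // Hypothetical stand-in for Pair<Olive.Plugin, Olive.Domain>.
    static final class PluginDomain {
        final String pluginId;
        final String task;
        final String domainId;

        PluginDomain(String pluginId, String task, String domainId) {
            this.pluginId = pluginId;
            this.task = task;
            this.domainId = domainId;
        }
    }

    // First pair whose plugin performs the given task (case-insensitive), if any.
    static Optional<PluginDomain> findByTask(List<PluginDomain> pairs, String task) {
        return pairs.stream().filter(p -> task.equalsIgnoreCase(p.task)).findFirst();
    }

    // The pair with exactly this plugin ID and domain ID, if any.
    static Optional<PluginDomain> findByIds(List<PluginDomain> pairs, String pluginId, String domainId) {
        return pairs.stream()
                .filter(p -> p.pluginId.equals(pluginId) && p.domainId.equals(domainId))
                .findFirst();
    }

    public static void main(String[] args) {
        List<PluginDomain> pairs = Arrays.asList(
                new PluginDomain("sad-dnn-v8.0.0", "SAD", "multi-v1"),
                new PluginDomain("example-lid-plugin", "LID", "multi-v1")); // made-up ID
        System.out.println(findByTask(pairs, "lid").isPresent());
        System.out.println(findByIds(pairs, "sad-dnn-v8.0.0", "multi-v1").isPresent());
    }
}
```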
Audio Submission Guidelines
One of the core client activities is submitting audio with a request. The OLIVE Java API provides three ways for a client to send audio data to the OLIVE server:
- file path
- buffer of raw audio sample data
- serialized file buffer object
The enum ClientUtils.AudioTransferType is used to specify which of the three transfer mechanisms to use. In almost every case, the OLIVE Java API packages the audio appropriately based on the AudioTransferType passed in, so the client only needs to understand a few basics about each transfer type, which are explained below.
When the client and the OLIVE server share the same file system, the easiest way for the client to send audio data to the server is by specifying the audio's file path on disk. The OLIVE Java API provides the utility below to package audio files which are accessible to the server locally:
AudioTransferType.SEND_AS_PATH
When the client and the server don't share the same file system, as in the case of a client making a remote connection to the OLIVE server, the client must send the contents of its local audio files in a buffer. To package the client's audio data as a buffer of raw samples, the OLIVE Java API provides the utility below:
AudioTransferType.SEND_SAMPLES_BUFFER
When the client wants to ensure that all audio header information is provided intact to the server, the OLIVE Java API provides the utility below:
AudioTransferType.SEND_SERIALIZED_BUFFER
Note on 'serialized buffer' usage.
This transfer mechanism passes the original file to the server in its entirety in one contiguous buffer, leaving the audio file header intact. This allows the server to properly decode and process the audio once it's received, since it can directly access the bit depth, encoding type, sample rate, and other necessary information from the header itself. The tradeoff with serialized files is that there may be additional overhead needed to process the audio into a consumable form. If the client and server reside on the same hardware and file system, it is advisable to simply pass file paths when possible. This spares both the client and the server the memory overhead of loading audio into memory. If using common audio types, like 16-bit PCM .wav files, it may also be possible to simply pass a non-serialized buffer of samples.
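The guidance above can be condensed into a small decision helper. The sketch below is only an illustration of that reasoning: TransferType is a local mirror of the three constants named in this section (the real enum is ClientUtils.AudioTransferType), and choose() with its two boolean parameters is a hypothetical helper, not part of the OLIVE Java API.

```java
class TransferChoiceSketch {
    // Local mirror of the three transfer constants described in this section.
    enum TransferType { SEND_AS_PATH, SEND_SAMPLES_BUFFER, SEND_SERIALIZED_BUFFER }

    // Picks a transfer type following the guidelines in the text:
    // shared file system -> path; otherwise a serialized buffer when the
    // file header must reach the server intact, else a raw samples buffer.
    static TransferType choose(boolean sharedFileSystem, boolean headerMustSurvive) {
        if (sharedFileSystem) {
            return TransferType.SEND_AS_PATH;
        }
        return headerMustSurvive
                ? TransferType.SEND_SERIALIZED_BUFFER
                : TransferType.SEND_SAMPLES_BUFFER;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false));
        System.out.println(choose(false, true));
        System.out.println(choose(false, false));
    }
}
```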
Synchronous vs. Asynchronous Message Submission
The OLIVE Java API allows the client to choose between processing a task request synchronously or asynchronously. Processing a task request synchronously means the client will block and wait for the task result to return before proceeding to other task requests. On the other hand, asynchronous processing means the client will not wait for the result to come back before moving on, allowing several jobs to be submitted in parallel. The examples below generally show submitting requests asynchronously.
The argument async is used in many ClientUtils functions (requestFrameScore(), requestGlobalScore(), requestRegionScores(), requestEnrollClass(), etc.), and can be used to choose whether the client intends to wait for the task result to return or not. When async is set to true, the client will not block when a request is sent to the server, so other task requests can be made before the results are received asynchronously and handled by the callback.
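When several requests are submitted asynchronously, the client usually needs some way to know that every callback has fired before it exits. One common JDK pattern for this is a CountDownLatch, sketched below. The submitAsync method is a hypothetical stand-in that simulates a request on a thread pool; it does not call the OLIVE Java API, and the result strings are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class AsyncCallbackSketch {
    // Hypothetical stand-in for an asynchronous request: the "result" is
    // produced on another thread and handed to the callback.
    static void submitAsync(ExecutorService pool, String audioName, Consumer<String> callback) {
        pool.execute(() -> callback.accept("result for " + audioName));
    }

    // Submits every request without blocking, then waits until all callbacks
    // have fired or the timeout expires.
    static List<String> submitAllAndWait(List<String> audioNames, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<String> results = Collections.synchronizedList(new ArrayList<String>());
        CountDownLatch pending = new CountDownLatch(audioNames.size());
        try {
            for (String name : audioNames) {
                submitAsync(pool, name, result -> {
                    results.add(result);
                    pending.countDown(); // one fewer outstanding request
                });
            }
            if (!pending.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("not all results arrived in time");
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> results = submitAllAndWait(Arrays.asList("a.wav", "b.wav", "c.wav"), 5000);
        System.out.println(results.size() + " results received");
    }
}
```

The examples on this page instead call System.exit(0) from the callback, which is simpler for a single request but ends the program as soon as the first result arrives.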
OLIVE Java API Code Samples
The OLIVE Java API includes functionality to accommodate many of the available request messages. Utilities such as requestFrameScore(), requestRegionScores(), requestEnrollClass(), etc. not only package request messages but also send them to the server, all in one call. They are available from the class com.sri.speech.olive.api.utils.ClientUtils.
The required parameters to send many of these requests include:
- server handle (server; see Establish Server Connection)
- the plugin handle (pd; see Request Available Plugins)
- the name of the audio file to submit to the server (filename)
- channel number of the audio to be processed when audio has more than 1 channel (channelNumber)
- a callback function for handling results returned either asynchronously or synchronously (rc)
- whether the request is submitted asynchronously, so that the client does not block waiting for the task result (async; see Synchronous vs. Asynchronous Message Submission)
- an enum of how to submit audio to the server (transferType)
- optional lists of annotations of the submitted audio (regions)
- optional list of parameters for customizing plugin behavior (options)
- optional list of class IDs for filtering the results (classIDs)
Performing an enrollment request adds an additional parameter:
- the ID of the class to be enrolled
The primary requests covered below are:
- Frame score requests - used to make some SAD requests (note that some SAD plugins return regions, not frame scores - most are capable of performing both)
- Global score requests - used to make LID, SID, or GID score requests
- Region score requests - used to make ASR, SDD, LDD, QBE, and often SAD score requests
- Enrollment requests - used to enroll speakers or other class types for plugins that support the ClassEnroller trait
Frame Score Request
The example below provides sample code for a function handleMySADFrameScorerRequest that takes a server connection, a plugin/domain handle, and a path to an audio file as arguments, and uses this information to build and submit a Frame scoring request to the connected server.
public static boolean handleMySADFrameScorerRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd, String filename)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// Create a callback to handle SAD results from the server
Server.ResultCallback<Olive.FrameScorerRequest, Olive.FrameScorerResult> rc = new Server.ResultCallback<Olive.FrameScorerRequest, Olive.FrameScorerResult>() {
@Override
public void call(Server.Result<Olive.FrameScorerRequest, Olive.FrameScorerResult> r) {
// output frame scores
if (!r.hasError()) {
for (Olive.FrameScores fs : r.getRep().getResultList()) {
log.info(String.format("Received %d frame scores for '%s'", fs.getScoreCount(),
fs.getClassId()));
Double[] scores = fs.getScoreList().toArray(new Double[fs.getScoreList().size()]);
int rate = fs.getFrameRate();
for (int i = 0; i < scores.length; i++) {
if (scores[i] > 0.0) {
int start = (int) (100 * i / (double) rate);
int end = (int) (100 * (i + 1) / (double) rate);
log.info(String.format("start: '%d' end: '%d' score:'%f'", start, end, scores[i]));
}
}
}
}
System.exit(0);
}
};
return ClientUtils.requestFrameScore(server, pd, filename, 1, rc, true,
AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(), new ArrayList<>(), new ArrayList<>());
}
This code passes the audio to the server using a serialized buffer. It is also possible to perform this request using buffered audio samples or a file path.
Only a plugin that supports the FrameScorer trait can handle this request. All SAD plugins support FrameScorer, while some SAD plugins also support the RegionScorer trait.
The method signature for ClientUtils.requestFrameScore is:
public static boolean requestFrameScore(Server server,
        Pair<Olive.Plugin, Olive.Domain> pd,
        String filename,
        int channelNumber,
        Server.ResultCallback<Olive.FrameScorerRequest, Olive.FrameScorerResult> rc,
        boolean async,
        AudioTransferType transferType,
        List<RegionWord> regions,
        List<Pair<String, String>> options,
        List<String> classIDs) throws ClientException, IOException, UnsupportedAudioFileException
For a complete example of how to call this code with a specific plugin, refer to the SAD Scoring Request code example below. It contains the full Java code file this example was pulled from, showing the process of establishing a server connection, polling the server for available plugins to retrieve the appropriate plugin/domain handle, and making the request.
Global Score Request
The example below provides sample code for a function handleMyLIDGlobalScorerRequest that takes a server connection, a plugin/domain handle, and a path to an audio file as arguments, and uses this information to build and submit a Global scoring request to the connected server.
public static boolean handleMyLIDGlobalScorerRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd, String filename)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// Create a callback to handle LID results from the server
Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> rc = new Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult>() {
@Override
public void call(Server.Result<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> r) {
// output LID global scores
if (!r.hasError()) {
log.info("Received {} scores:", r.getRep().getScoreCount());
for (Olive.GlobalScore gs : r.getRep().getScoreList()) {
log.info(String.format("%s = %f", gs.getClassId(), gs.getScore()));
}
} else {
log.info(String.format("Global scorer error: {%s}", r.getError()));
}
System.exit(0);
}
};
return ClientUtils.requestGlobalScore(server, pd, Olive.TraitType.GLOBAL_SCORER, filename, 1, rc, true,
AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(), new ArrayList<>(), new ArrayList<>());
}
This code passes the audio to the server using a serialized buffer. It is also possible to perform this request using buffered audio samples or a file path.
The code required to submit a GlobalScorerRequest message doesn't care what type of plugin is going to be doing the scoring, as long as the plugin implements the GlobalScorer Trait. This means that the exact same code can be used for submitting audio to global scoring LID plugins, SID plugins, or any other global scoring plugin.
The method signature for ClientUtils.requestGlobalScore is:
public static boolean requestGlobalScore(Server server,
        Pair<Olive.Plugin, Olive.Domain> plugin,
        Olive.TraitType trait,
        String filename,
        int channelNumber,
        Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> rc,
        boolean async,
        AudioTransferType transferType,
        List<RegionWord> regions,
        List<Pair<String, String>> options,
        List<String> classIDs) throws ClientException, IOException
Region Score Request
The example below provides sample code for a function handleMyRegionScorerRequest that takes a server connection, a plugin/domain handle, and a path to an audio file as arguments, and uses this information to build and submit a Region scoring request to the connected server.
public static boolean handleMyRegionScorerRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd, String filename)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// Create a callback to handle results from the server
Server.ResultCallback<Olive.RegionScorerRequest, Olive.RegionScorerResult> rc = new Server.ResultCallback<Olive.RegionScorerRequest, Olive.RegionScorerResult>() {
@Override
public void call(Server.Result<Olive.RegionScorerRequest, Olive.RegionScorerResult> r) {
// do something with the results:
if (!r.hasError()) {
log.info("Received {} region scores:", r.getRep().getRegionCount());
for (Olive.RegionScore rs : r.getRep().getRegionList()) {
// arguments are ordered to match the placeholders: class, start, end, score
log.info("{} ({}-{}secs, score={})", rs.getClassId(), rs.getStartT(), rs.getEndT(),
rs.getScore());
}
} else {
log.error("Region scoring error: {}", r.getError());
}
System.exit(0);
}
};
return ClientUtils.requestRegionScores(server, pd, filename, 0, rc, true, AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(),
new ArrayList<>(), new ArrayList<>());
}
This code passes the audio to the server using a serialized buffer. It is also possible to perform this request using a file path or buffered audio samples.
The method signature for ClientUtils.requestRegionScores is:
public static boolean requestRegionScores(Server server,
        Pair<Olive.Plugin, Olive.Domain> plugin,
        String filename,
        int channelNumber,
        Server.ResultCallback<Olive.RegionScorerRequest, Olive.RegionScorerResult> rc,
        boolean async,
        AudioTransferType transferType,
        List<RegionWord> regions,
        List<Pair<String, String>> options,
        List<String> classIDs) throws ClientException, IOException
For a complete example of how to call this code with a specific plugin, refer to the ASR Scoring Request code example below. It contains the full Java code file this example was pulled from, showing the process of establishing a server connection, polling the server to retrieve the appropriate plugin/domain handle, and making the request.
Enrollment Request
The example below provides sample code for a function handleMyEnrollmentRequest that takes a server connection, a plugin/domain handle, the name of the class (speaker) to enroll, and a path to an audio file as arguments, and uses this information to build and submit an enrollment request to the connected server.
public static boolean handleMyEnrollmentRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd,
String speakerName, String enrollmentFileName)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// create a callback that handles the enrollment result from the server
Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult> enrollmentCallback = new Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult>() {
@Override
public void call(Server.Result<Olive.ClassModificationRequest, Olive.ClassModificationResult> r) {
// examine enrollment result
if (!r.hasError()) {
log.info("Enrollment succeeded");
} else {
log.error("Enrollment request failed: {}", r.getError());
}
}
};
// make it a synchronous call (async = false), so we know the speaker is enrolled before we return
boolean enrolled = ClientUtils.requestEnrollClass(server, pd, speakerName, enrollmentFileName, 0,
enrollmentCallback,
false, ClientUtils.DataType.AUDIO_DATA, AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(),
new ArrayList<>()); // no enrollment options in this snippet
return enrolled;
}
The method signature for ClientUtils.requestEnrollClass is:
public static boolean requestEnrollClass(Server server,
        Pair<Olive.Plugin, Olive.Domain> pd,
        String id,
        String wavePath,
        int channelNumber,
        Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult> rc,
        boolean async,
        DataType dataType,
        AudioTransferType transferType,
        List<RegionWord> regions,
        List<Pair<String, String>> options) throws ClientException
Plugin Specific Code Examples
This section shows examples of using the functions just outlined to make calls to specific plugins, and demonstrates how the same code can be reused for several purposes. For example, requestGlobalScore can be used to request scoring from both SID and LID plugins.
- SAD Scoring Example
- SID Enrollment and Scoring Example
- LID Enrollment and Scoring Example
- ASR Scoring Example
- TMT Scoring Example
- SAD Adaptation Example
SAD Scoring Request
This shows a full implementation of a client program which sends a frame scoring request to a SAD plugin. Upon return of the result, it outputs the received frame scores. Also included is a second version in which a threshold is used to print only the frame scores that exceed it.
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.utils.*;
import com.sri.speech.olive.api.utils.ClientUtils.AudioTransferType;
import com.sri.speech.olive.api.client.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.*;
import java.util.*;
public class MySADFrameScorer {
private static Logger log = LoggerFactory.getLogger(MySADFrameScorer.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String domainName = "multi-v1";
private static String pluginName = "sad-dnn-v8.0.0";
public static void main(String[] args) throws Exception {
// audio file name is passed as an argument
String audioFileName = args[0];
// Setup the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT); // may need to adjust timeout
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// report if connection fails
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
throw new Exception("Unable to connect to server");
}
// ask the server for a list of current plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
// from the list of plugins, find the targeted plugin for the task
Pair<Olive.Plugin, Olive.Domain> pd = ClientUtils.findPluginDomainByTrait(pluginName, domainName, "SAD",
Olive.TraitType.FRAME_SCORER, pluginList);
// formulate SAD frame scoring request and send to server
handleMySADFrameScorerRequest(server, pd, audioFileName);
}
public static boolean handleMySADFrameScorerRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd, String filename)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// Create a callback to handle SAD results from the server
Server.ResultCallback<Olive.FrameScorerRequest, Olive.FrameScorerResult> rc = new Server.ResultCallback<Olive.FrameScorerRequest, Olive.FrameScorerResult>() {
@Override
public void call(Server.Result<Olive.FrameScorerRequest, Olive.FrameScorerResult> r) {
// output frame scores
if (!r.hasError()) {
for (Olive.FrameScores fs : r.getRep().getResultList()) {
log.info(String.format("Received %d frame scores for '%s'", fs.getScoreCount(),
fs.getClassId()));
Double[] scores = fs.getScoreList().toArray(new Double[fs.getScoreList().size()]);
int rate = fs.getFrameRate();
for (int i = 0; i < scores.length; i++) {
int start = (int) (100 * i / (double) rate);
int end = (int) (100 * (i + 1) / (double) rate);
log.info(String.format("start: '%d' end: '%d' score:'%f'", start, end, scores[i]));
}
}
}
System.exit(0);
}
};
return ClientUtils.requestFrameScore(server, pd, filename, 1, rc, true,
AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(), new ArrayList<>(), new ArrayList<>());
}
}
The code above outputs all frame scores of the input audio, which can generate a massive amount of output, especially when the audio is long. One good way to trim down the output is to print only the frames whose scores are higher than a preset threshold value. The following shows how this can be done using a threshold of 0.0.
...
for (int i = 0; i < scores.length; i++) {
    if (scores[i] > 0.0) { // only print frames with a score greater than 0.0
        int start = (int) (100 * i / (double) rate);
        int end = (int) (100 * (i + 1) / (double) rate);
        log.info(String.format("start: '%d' end: '%d' score:'%f'", start, end, scores[i]));
    }
}
...
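Thresholding still prints one line per frame. A further reduction is to merge consecutive frames above the threshold into regions, as in the sketch below. It is a self-contained illustration under the assumption that the frame rate is expressed in frames per second, so frame i starts at i / frameRate seconds. The class and method names are hypothetical and not part of the OLIVE Java API.

```java
import java.util.ArrayList;
import java.util.List;

class FrameRegionSketch {
    // Returns {startSeconds, endSeconds} for each run of consecutive frames
    // whose score is strictly greater than the threshold.
    static List<double[]> regionsAbove(double[] scores, int frameRate, double threshold) {
        List<double[]> regions = new ArrayList<>();
        int runStart = -1; // index of the first frame in the current run, or -1 if none
        for (int i = 0; i <= scores.length; i++) {
            // The extra pass at i == scores.length closes a run that reaches the end.
            boolean above = i < scores.length && scores[i] > threshold;
            if (above && runStart < 0) {
                runStart = i;
            } else if (!above && runStart >= 0) {
                regions.add(new double[] { runStart / (double) frameRate, i / (double) frameRate });
                runStart = -1;
            }
        }
        return regions;
    }

    public static void main(String[] args) {
        double[] scores = { -1.0, 0.5, 0.7, -0.2, 0.1 };
        for (double[] region : regionsAbove(scores, 100, 0.0)) {
            System.out.println(String.format("speech from %.2f to %.2f secs", region[0], region[1]));
        }
    }
}
```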
SID Enrollment and Scoring Request
This example is a full implementation of a client program which sends an enrollment request to a SID plugin, followed by a scoring request to the same SID plugin.
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.utils.*;
import com.sri.speech.olive.api.utils.ClientUtils.AudioTransferType;
import com.sri.speech.olive.api.utils.parser.RegionParser;
import com.sri.speech.olive.api.client.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.*;
import java.util.*;
public class MySIDEnrollmentAndScore {
private static Logger log = LoggerFactory.getLogger(MySIDEnrollmentAndScore.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String domainName = "multi-v1";
private static String pluginName = "sid-dplda-v3.0.0";
private static String speakerName = "EDMUND_YAO";
private static List<Pair<String, String>> enrollmentOptions = new ArrayList<>();
private static RegionParser regionParser = new RegionParser();
public static void main(String[] args) throws Exception {
// enrollment file name is passed in as an argument
String enrollmentFileName = args[0];
String scoreWaveFileName = args[1];
// Setup the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT); // may need to adjust timeout
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// report if connection fails
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
throw new Exception("Unable to connect to server");
}
// ask the server for a list of current plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
// obtain SID plugin handle
Pair<Olive.Plugin, Olive.Domain> pd = ClientUtils.findPluginDomainByTrait(pluginName, domainName, null,
Olive.TraitType.CLASS_ENROLLER, pluginList);
// perform SID enrollment task
boolean enrolled = handleMyEnrollmentRequest(server, pd, speakerName, enrollmentFileName);
if (enrolled) {
// make a global score request (SID is a global scorer)
handleMyRegionScoreRequest(server, pd, speakerName, scoreWaveFileName);
}
}
public static boolean handleMyEnrollmentRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd,
String speakerName, String enrollmentFileName)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// create a callback that handles the enrollment result from the server
Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult> enrollmentCallback = new Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult>() {
@Override
public void call(Server.Result<Olive.ClassModificationRequest, Olive.ClassModificationResult> r) {
// examine enrollment result
if (!r.hasError()) {
log.info("Enrollment succeeded");
} else {
log.error("Enrollment request failed: {}", r.getError());
}
}
};
// make it a synchronous call (async = false), so we know the speaker is enrolled before we score
boolean enrolled = ClientUtils.requestEnrollClass(server, pd, speakerName, enrollmentFileName, 0,
enrollmentCallback,
false, ClientUtils.DataType.AUDIO_DATA, AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(),
enrollmentOptions);
return enrolled;
}
public static void handleMyRegionScoreRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd,
String speakerName, String scoreWaveFileName)
throws ClientException, IOException, UnsupportedAudioFileException {
// Create a callback to handle the SID scoring result
Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> scoreCallback = new Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult>() {
@Override
public void call(Server.Result<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> r) {
// do something with the results:
if (!r.hasError()) {
log.info("Received {} scores:", r.getRep().getScoreCount());
for (Olive.GlobalScore gs : r.getRep().getScoreList()) {
log.info(String.format("speaker{%s} = {%f}", gs.getClassId(), gs.getScore()));
}
} else {
log.error(String.format("Global scorer error: {%s}", r.getError()));
}
System.exit(0);
}
};
// SID is a global scorer, so make a global score request:
ClientUtils.requestGlobalScore(server, pd, Olive.TraitType.GLOBAL_SCORER, scoreWaveFileName, 0,
scoreCallback, true, AudioTransferType.SEND_SERIALIZED_BUFFER,
regionParser.getRegions(scoreWaveFileName), new ArrayList<>(), new ArrayList<>());
}
}
LID Enrollment and Scoring Request
This example is a full implementation of a client program which sends an enrollment request to a LID plugin, followed by a scoring request to the same LID plugin.
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.utils.*;
import com.sri.speech.olive.api.utils.ClientUtils.AudioTransferType;
import com.sri.speech.olive.api.client.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.*;
import java.util.*;
public class MyLIDEnrollmentAndScore {
private static final Logger log = LoggerFactory.getLogger(MyLIDEnrollmentAndScore.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String domainName = "multi-v1";
private static String pluginName = "lid-embedplda-v4.0.0";
private static String languageName = "Esperanto";
private static List<Pair<String, String>> enrollmentOptions = new ArrayList<>();
public static void main(String[] args) throws Exception {
// enrollment file name is passed in as an argument
String enrollmentFileName = args[0];
String scoreWaveFileName = args[1];
// Set up the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT); // may need to adjust timeout
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// report if connection fails
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
throw new Exception("Unable to connect to server");
}
// ask the server for a list of current plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
// obtain LID plugin handle
Pair<Olive.Plugin, Olive.Domain> pd = ClientUtils.findPluginDomainByTrait(pluginName, domainName, null,
Olive.TraitType.CLASS_ENROLLER, pluginList);
// perform LID enrollment task
boolean enrolled = handleMyEnrollmentRequest(server, pd, languageName, enrollmentFileName);
if (enrolled) {
// make a global score request (LID is a global scorer)
handleMyRegionScoreRequest(server, pd, languageName, scoreWaveFileName);
}
}
public static boolean handleMyEnrollmentRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd,
String languageName, String enrollmentFileName)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// create a callback that handles the enrollment result from the server
Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult> enrollmentCallback = new Server.ResultCallback<Olive.ClassModificationRequest, Olive.ClassModificationResult>() {
@Override
public void call(Server.Result<Olive.ClassModificationRequest, Olive.ClassModificationResult> r) {
// examine enrollment result
if (!r.hasError()) {
log.info("Enrollment succeeded");
} else {
log.error("Enrollment request failed: {}", r.getError());
}
}
};
// make it a synchronous call, so we know the language is enrolled before we make the scoring request
boolean enrolled = ClientUtils.requestEnrollClass(server, pd, languageName, enrollmentFileName, 0,
enrollmentCallback,
false, ClientUtils.DataType.AUDIO_DATA, AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(),
enrollmentOptions);
return enrolled;
}
public static void handleMyRegionScoreRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd,
String languageName, String scoreWaveFileName)
throws ClientException, IOException, UnsupportedAudioFileException {
// Create a callback to handle the LID scoring request
Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> scoreCallback = new Server.ResultCallback<Olive.GlobalScorerRequest, Olive.GlobalScorerResult>() {
@Override
public void call(Server.Result<Olive.GlobalScorerRequest, Olive.GlobalScorerResult> r) {
// do something with the results:
if (!r.hasError()) {
log.info("Received {} scores:", r.getRep().getScoreCount());
for (Olive.GlobalScore gs : r.getRep().getScoreList()) {
log.info(String.format("language{%s} = {%f}", gs.getClassId(), gs.getScore()));
}
} else {
log.error(String.format("Global scorer error: {%s}", r.getError()));
}
System.exit(0);
}
};
// LID is a global scorer, so make a global score request:
ClientUtils.requestGlobalScore(server, pd, Olive.TraitType.GLOBAL_SCORER, scoreWaveFileName, 0,
scoreCallback, true, AudioTransferType.SEND_SERIALIZED_BUFFER,
new ArrayList<>(), new ArrayList<>(), new ArrayList<>());
}
}
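The connection-wait loop is repeated verbatim in each client on this page. The following is a minimal sketch of how it could be factored into a helper. It assumes that `server.getConnected()` returns a `java.util.concurrent.atomic.AtomicBoolean`, which is inferred from the `.get()` and `.wait()` calls in the examples rather than confirmed here. `ConnectionWait` and `awaitFlag` are hypothetical names and are not part of the OLIVE Java API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper; not part of the OLIVE Java API.
final class ConnectionWait {

    // Longest single wait before the flag is re-checked, in milliseconds.
    private static final long POLL_MILLIS = 50;

    /**
     * Blocks until the flag is observed to be true or the timeout elapses.
     * Returns true if the flag was set in time, false on timeout.
     */
    static boolean awaitFlag(AtomicBoolean flag, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (flag) {
            while (!flag.get()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out
                }
                try {
                    // Wake on notify, or after a short slice in case nobody notifies.
                    flag.wait(Math.min(remaining, POLL_MILLIS));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return flag.get();
                }
            }
            return true;
        }
    }
}
```

Under that assumption, the wait loop and the follow-up check in `main` could collapse into a single `if (!ConnectionWait.awaitFlag(server.getConnected(), TIMEOUT))` test before reporting the connection failure.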
ASR Scoring Request
The following example shows a full implementation of an ASR scoring request. It sends a RegionScorerRequest and receives a RegionScorerResult.
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.utils.*;
import com.sri.speech.olive.api.utils.ClientUtils.AudioTransferType;
import com.sri.speech.olive.api.client.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.*;
import java.util.*;
public class MyASRRegionScorer {
private static Logger log = LoggerFactory.getLogger(MyASRRegionScorer.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String domainName = "english-tdnnLookaheadRnnlm-tel-v2";
private static String pluginName = "asr-dynapy-v4.1.0";
public static void main(String[] args) throws Exception {
// audio file name is passed as an argument
String audioFileName = args[0];
// Set up the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT);
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// report if connection fails
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
throw new Exception("Unable to connect to server");
}
// ask the server for a list of current plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
// find the ASR plugin/domain pair that supports region scoring
Pair<Olive.Plugin, Olive.Domain> pd = ClientUtils.findPluginDomainByTrait(pluginName, domainName, "ASR",
Olive.TraitType.REGION_SCORER, pluginList);
// Perform ASR region scoring task
handleMyRegionScorerRequest(server, pd, audioFileName);
}
public static boolean handleMyRegionScorerRequest(Server server, Pair<Olive.Plugin, Olive.Domain> pd, String filename)
throws ClientException, IOException, UnsupportedAudioFileException {
if (null == pd) {
return false;
}
// Create a callback to handle results from the server
Server.ResultCallback<Olive.RegionScorerRequest, Olive.RegionScorerResult> rc = new Server.ResultCallback<Olive.RegionScorerRequest, Olive.RegionScorerResult>() {
@Override
public void call(Server.Result<Olive.RegionScorerRequest, Olive.RegionScorerResult> r) {
// do something with the results:
if (!r.hasError()) {
log.info("Received {} region scores:", r.getRep().getRegionCount());
for (Olive.RegionScore rs : r.getRep().getRegionList()) {
log.info("{} ({}-{}secs, score={})", rs.getClassId(), rs.getStartT(), rs.getEndT(),
rs.getScore());
}
} else {
log.error("Region scoring error: {}", r.getError());
}
System.exit(0);
}
};
return ClientUtils.requestRegionScores(server, pd, filename, 0, rc, true, AudioTransferType.SEND_SERIALIZED_BUFFER, new ArrayList<>(),
new ArrayList<>(), new ArrayList<>());
}
}
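The callbacks in the scoring examples above end the program by calling `System.exit(0)` once a result arrives. An alternative, shown here as a sketch that is not taken from the OLIVE API, is to block the calling thread on a `CountDownLatch` that the callback releases, so the caller decides what happens next. The asynchronous request is simulated with a plain thread; `CallbackWait`, `submitAsync`, and `requestAndWait` are hypothetical stand-ins.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical illustration; the async call is simulated, not an OLIVE request.
final class CallbackWait {

    // Stand-in for an asynchronous request: delivers a result on another thread.
    static void submitAsync(String input, Consumer<String> callback) {
        Thread worker = new Thread(() -> callback.accept("result for " + input));
        worker.start();
    }

    /**
     * Submits the request and waits for the callback.
     * Returns the result, or null if the timeout elapses or the wait is interrupted.
     */
    static String requestAndWait(String input, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> holder = new AtomicReference<>();

        // The callback stores the result and releases the waiting thread.
        submitAsync(input, r -> {
            holder.set(r);
            done.countDown();
        });

        try {
            if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return null; // no result within the timeout
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return null;
        }
        return holder.get();
    }
}
```

With this shape, the main thread can log the result, disconnect from the server, and return normally instead of exiting from inside the callback.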
TMT Request
The following code shows a full implementation of a text machine translation request made to a TMT plugin. The request in this client is a TextTransformationRequest.
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.client.ClientException;
import com.sri.speech.olive.api.utils.ClientUtils;
import com.sri.speech.olive.api.utils.Pair;
public class MyTMTRequest {
private static final Logger log = LoggerFactory.getLogger(MyTMTRequest.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String pluginName = "tmt-neural-v1.1.1";
private static String domainName = "cmn-eng-nmt-v1";
public static void main(String[] args) throws ClientException {
// input text is passed as an argument
String inputText = args[0];
// establish server connection
Server server = new Server();
server.connect(
"exampleClient", //client-id
DEFAULT_SERVERNAME, //address of server
DEFAULT_PORT, //request-port
DEFAULT_PORT + 1, //status-port
TIMEOUT //timeout for failed connection request
);
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
Pair<Olive.Plugin, Olive.Domain> pluginDomainPair = ClientUtils.findPluginDomain(pluginName, domainName, pluginList);
// build the request
Olive.TextTransformationRequest.Builder req = Olive.TextTransformationRequest.newBuilder()
.setText(inputText)
.setPlugin(pluginDomainPair.getFirst().getId())
.setDomain(pluginDomainPair.getSecond().getId());
// submit request
log.info(String.format("Submitting %s for translation with plugin %s and domain %s", inputText, pluginDomainPair.getFirst().getId(), pluginDomainPair.getSecond().getId()));
Server.Result<Olive.TextTransformationRequest, Olive.TextTransformationResult> result = server.synchRequest(req.build());
// handle response
if(!result.hasError()) {
for(Olive.TextTransformation transformation : result.getRep().getTransformationList()) {
log.info(transformation.getTransformedText());
}
} else{
log.error(String.format("Translation error: %s", result.getError()));
}
// disconnect
server.disconnect();
}
}
SAD Adaptation Request
Below is a full implementation of a SAD request to adapt a new domain from an existing domain. The list of adaptation training files is passed into the client as a file. The code handles both supervised (speech regions specified) and unsupervised (speech regions not specified) SAD adaptations. However, some SAD plugins may not have the unsupervised adaptation capability, in which case the client will exit with a failure message.
import com.sri.speech.olive.api.Olive;
import com.sri.speech.olive.api.Server;
import com.sri.speech.olive.api.Olive.AnnotationRegion;
import com.sri.speech.olive.api.Olive.AudioAnnotation;
import com.sri.speech.olive.api.Olive.Domain;
import com.sri.speech.olive.api.Olive.Plugin;
import com.sri.speech.olive.api.utils.*;
import com.sri.speech.olive.api.utils.parser.LearningParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.*;
public class MySADAdaptation {
private static Logger log = LoggerFactory.getLogger(MySADAdaptation.class);
private static final int TIMEOUT = 10000;
private static final String DEFAULT_SERVERNAME = "localhost";
private static final int DEFAULT_PORT = 5588;
private static String domainName = "multi-v1";
private static String pluginName = "sad-dnn-v8.0.0";
private static String newDomainName = "custom-v1";
private static LearningParser learningParser = new LearningParser();
public static void main(String[] args) throws Exception {
// audio file list is passed as an argument
String audioFileList = args[0];
// parse files in list
learningParser.parse(audioFileList);
if (!learningParser.isValid()) {
log.error("Invalid input file: " + audioFileList);
System.exit(-1);
}
// Set up the connection to the OLIVE server
Server server = new Server();
server.connect("exampleClient", DEFAULT_SERVERNAME,
DEFAULT_PORT,
DEFAULT_PORT + 1,
TIMEOUT); // may need to adjust timeout
// wait for the connection
long start_t = System.currentTimeMillis();
while (!server.getConnected().get() && System.currentTimeMillis() - start_t < TIMEOUT) {
try {
synchronized (server.getConnected()) {
server.getConnected().wait(TIMEOUT);
}
} catch (InterruptedException e) {
// Keep waiting
}
}
// report if connection fails
if (!server.getConnected().get()) {
log.error("Unable to connect to the OLIVE server: {}", DEFAULT_SERVERNAME);
throw new Exception("Unable to connect to server");
}
// ask the server for a list of current plugins
List<Pair<Olive.Plugin, Olive.Domain>> pluginList = ClientUtils.requestPlugins(server);
// from the list of plugins, find the targeted plugin for the task
Pair<Olive.Plugin, Olive.Domain> pd = ClientUtils.findPluginDomainByTrait(pluginName, domainName, "SAD",
learningParser.isUnsupervised() ? Olive.TraitType.UNSUPERVISED_ADAPTER
: Olive.TraitType.SUPERVISED_ADAPTER,
pluginList);
// Preprocess audio - doesn't matter if supervised or unsupervised
String adaptID = UUID.randomUUID().toString();
Plugin plugin = pd.getFirst();
Domain domain = pd.getSecond();
// optional annotations, generated if found in the parser (supervised
// adaptation)
Map<String, List<AudioAnnotation>> annotations = new HashMap<>();
int numPreprocessed = 0;
for (String filename : learningParser.getFilenames()) {
try {
// build the audio
Olive.Audio.Builder audio = ClientUtils.createAudioFromFile(filename, -1,
ClientUtils.AudioTransferType.SEND_SERIALIZED_BUFFER, null);
// build up request metadata
String id = null;
Collection<String> classIDs = learningParser.getAnnotations(filename).keySet();
if (classIDs.size() > 0) {
id = "supervised";
}
// Prepare the request
Olive.PreprocessAudioAdaptRequest.Builder req = Olive.PreprocessAudioAdaptRequest.newBuilder()
.setPlugin(plugin.getId())
.setDomain(domain.getId())
.setAdaptSpace(adaptID)
// We don't set the optional start/end regions... those are used later when
// we finalize
.setAudio(audio.build());
if (id != null) {
req.setClassId(id);
}
// send the request
Server.Result<Olive.PreprocessAudioAdaptRequest, Olive.PreprocessAudioAdaptResult> result = server
.synchRequest(req.build());
if (result.hasError()) {
log.error(String.format("Error preprocessing audio %s because: %s", filename,
result.getError()));
} else {
numPreprocessed += 1;
log.info(String.format("Audio file %s successfully preprocessed", filename));
String audioId = result.getRep().getAudioId();
// Set the audio ID for any classID(s) associated with this audio
for (String classIDName : learningParser.getAnnotations(filename).keySet()) {
// Add the class/audio id mapping , and optionally add annotation regions
List<AudioAnnotation> audioAnnots;
if (annotations.containsKey(classIDName)) {
audioAnnots = annotations.get(classIDName);
} else {
audioAnnots = new ArrayList<>();
annotations.put(classIDName, audioAnnots);
}
Olive.AudioAnnotation.Builder aaBuilder = Olive.AudioAnnotation.newBuilder()
.setAudioId(audioId);
for (RegionWord word : learningParser.getAnnotations(filename).get(classIDName)) {
AnnotationRegion.Builder ab = AnnotationRegion.newBuilder()
.setStartT(word.getStartTimeSeconds()).setEndT(word.getEndTimeSeconds());
aaBuilder.addRegions(ab.build());
}
audioAnnots.add(aaBuilder.build());
}
}
} catch (Exception /* | UnsupportedAudioFileException */ e) {
log.error("Unable to preprocess file: " + filename);
log.debug("File preprocess error: ", e);
}
}
// perform adaptation
if (learningParser.isUnsupervised()) {
if (numPreprocessed > 0) {
// Prepare the request
Olive.UnsupervisedAdaptationRequest.Builder req = Olive.UnsupervisedAdaptationRequest.newBuilder()
.setPlugin(plugin.getId())
.setDomain(domain.getId())
.setAdaptSpace(adaptID)
.setNewDomain(newDomainName);
// Now send the finalize request
Server.Result<Olive.UnsupervisedAdaptationRequest, Olive.UnsupervisedAdaptationResult> result = server
.synchRequest(req.build());
if (result.hasError()) {
log.error(String.format("Unsupervised adaptation failed for new domain '%s' because: %s",
newDomainName, result.getError()));
} else {
log.info(String.format("New Domain '%s' Adapted", newDomainName));
}
} else {
log.error("Cannot adapt domain because all audio preprocessing attempts failed.");
}
} else {
// supervised adaptation
if (numPreprocessed > 0) {
List<Olive.ClassAnnotation> classAnnotations = new ArrayList<>();
for (String id : annotations.keySet()) {
Olive.ClassAnnotation.Builder caBuilder = Olive.ClassAnnotation.newBuilder().setClassId(id)
.addAllAnnotations(annotations.get(id));
classAnnotations.add(caBuilder.build());
}
// Prepare the request
Olive.SupervisedAdaptationRequest.Builder req = Olive.SupervisedAdaptationRequest.newBuilder()
.setPlugin(plugin.getId())
.setDomain(domain.getId())
.setAdaptSpace(adaptID)
.setNewDomain(newDomainName)
.addAllClassAnnotations(classAnnotations);
// Now send the finalize request
Server.Result<Olive.SupervisedAdaptationRequest, Olive.SupervisedAdaptationResult> result = server
.synchRequest(req.build());
if (result.hasError()) {
log.error(String.format("Failed to adapt new Domain '%s' because: %s", newDomainName,
result.getError()));
} else {
log.info(String.format("New Domain '%s' Adapted", newDomainName));
}
} else {
log.error("Cannot adapt domain because all audio preprocessing attempts failed.");
}
}
log.info("");
log.info("Learning finished. Exiting...");
System.exit(0);
}
}
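The supervised path above collects annotations per class ID with a `containsKey` / `get` / `put` sequence before building the `ClassAnnotation` messages. The sketch below shows the same grouping step in isolation using `Map.computeIfAbsent`. It uses a plain `Region` value class in place of the Protobuf `AudioAnnotation` objects, and the class IDs in the usage note are illustrative only. `AnnotationGrouping`, `Region`, and `groupByClass` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the per-class grouping step; no OLIVE types are used.
final class AnnotationGrouping {

    // Plain stand-in for one annotated region of an audio file.
    static final class Region {
        final String classId;
        final double startT;
        final double endT;

        Region(String classId, double startT, double endT) {
            this.classId = classId;
            this.startT = startT;
            this.endT = endT;
        }
    }

    /** Groups regions by class ID, keeping the order in which class IDs first appear. */
    static Map<String, List<Region>> groupByClass(List<Region> regions) {
        Map<String, List<Region>> byClass = new LinkedHashMap<>();
        for (Region r : regions) {
            // Create the list on first use of a class ID, then append to it.
            byClass.computeIfAbsent(r.classId, k -> new ArrayList<>()).add(r);
        }
        return byClass;
    }
}
```

For example, grouping three regions labeled "S", "NS", and "S" yields two map entries, with two regions under "S" and one under "NS".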