OLIVE Supported Audio Formats

Overview

There are five main methods of interacting with the OLIVE system, and each carries different implications for the supported audio formats. They are as follows:

1. NIGHTINGALE (Forensic) GUI – Submitting audio files using the Nightingale GUI, also known as the Forensic GUI.
2. OLIVE (Batch) GUI – Submitting large file-based audio processing jobs using the OLIVE Batch GUI, also known as the Batch GUI or SCENIC Batch GUI.
3. COMMAND LINE INTERFACE (CLI) TOOLS – Submitting audio through the localenroll, localanalyze, and localtrain command line tools.
4. OLIVE API (BUFFERED) – Sending pre-loaded memory buffers of audio samples to the server through the OLIVE API.
5. OLIVE API (SERIALIZED) – Sending a serialized object to the server that consists of an entire audio file with its header intact.

These interaction methods can be combined into three groups that share limitations:

Local file-based processing by server.
- Dictates compatibility for (2) OLIVE Batch GUI, (3) Command Line Interface Tools, and (5) OLIVE API (Serialized).
Audio files being opened and processed by Java.
- Dictates compatibility for (1) Nightingale GUI.
Audio samples buffered into memory.
- Dictates compatibility for (4) OLIVE API (Buffered).

The limitations of each group are defined below.

Local file-based processing by server

The audio file compatibility for this group of OLIVE interactions is dictated by libsndfile, the library used for reading and writing audio files. Files submitted to the OLIVE Batch GUI, through the localenroll, localanalyze, and localtrain command line tools, or as Serialized files through the OLIVE API can be of any audio file format and type supported by libsndfile.

Stereo files are supported, but are merged into a single channel before scoring when submitting to the OLIVE Batch GUI and Command Line Interface Tools. When submitting files as serialized objects through the API, there is flexibility regarding how the channels are processed – please refer to the API Documentation for more details.
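To make the channel merge concrete, here is a minimal sketch of one way a stereo file can be reduced to a single channel before submission. It is an illustration only: the text above does not say how OLIVE merges channels, so averaging the channels is an assumption, and `downmix_to_mono` is a hypothetical helper name. The sketch uses only the Python standard library and handles 16-bit PCM WAV files.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average all channels of a 16-bit PCM WAV file into one channel.

    Assumption: averaging is used here for illustration; OLIVE's own
    merge strategy is not documented in this section.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...): average each frame.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

Downmixing ahead of time is only needed if you want control over how the channels are combined; otherwise the Batch GUI and command line tools perform the merge themselves.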

Supported audio formats include:

Microsoft WAV
SGI/Apple AIFF/AIFC
Sun AU/Snd
Raw (headerless)
Paris Audio File (PAF)
Commodore IFF/SVX
Sphere/NIST WAV
IRCAM SF
Creative VOC
SoundForge W64
GNU Octave MAT4/MAT5
Portable Voice Format
Fasttracker 2 XI
HMM Tool Kit HTK
Apple CAF
Sound Designer II SD2
Free Lossless Audio Codec (FLAC)

Supported encodings vary by the format used (see the link below for a comprehensive compatibility table). Some of the supported encodings are:

Unsigned and signed 8, 16, 24 and 32 bit PCM
IEEE 32 and 64 bit floating point
U-LAW
A-LAW
IMA ADPCM
MS ADPCM
GSM 6.10
G721/723 ADPCM
12/16/24 bit DWVW
OKI Dialogic ADPCM
8/16 bit DPCM

More information on libsndfile supported audio formats can be found here: http://www.mega-nerd.com/libsndfile/#Features

Audio files being opened and processed by Java

Any file opened in the Nightingale GUI for close analysis work must be readable by the underlying Java code libraries:

Java Media Framework
javax.sound
FLAC

Only mono files are currently supported.

Supported sample rates include:

Multiples of 8 kHz, up to 48 kHz
Multiples of 11.025 kHz, up to 44.1 kHz
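The sample-rate rule above can be written as a small check. This is a sketch under one assumption: "multiples" is read as integer multiples of the base rate, which gives 8, 16, 24, 32, 40, and 48 kHz in the first family and 11.025, 22.05, 33.075, and 44.1 kHz in the second. The function name is hypothetical and not part of OLIVE.

```python
def is_supported_sample_rate(rate_hz):
    """Return True if rate_hz falls in one of the two rate families.

    Assumption: "multiples" means integer multiples of the base rate,
    capped at the stated maximum for that family.
    """
    families = (
        (8000, 48000),    # multiples of 8 kHz, up to 48 kHz
        (11025, 44100),   # multiples of 11.025 kHz, up to 44.1 kHz
    )
    return any(
        0 < rate_hz <= max_hz and rate_hz % base_hz == 0
        for base_hz, max_hz in families
    )
```

For example, `is_supported_sample_rate(22050)` is true, while 12000 and 96000 are rejected.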

Supported container formats:

FLAC
RIFF (.wav)
AIFF
AIFC
AU

Supported encoding formats:

Compressed:
- FLAC
PCM:
- 16 bit signed int, big or little endian
- 8 bit signed or unsigned int
- 32 bit float, little endian only (i.e. RIFF, not RIFX)
- 8 bit mu-law or A-law
ADPCM:
- Microsoft or IMA ADPCM
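For RIFF (.wav) files, the container and encoding lists above can be checked before opening a file in the Nightingale GUI by reading the `fmt ` chunk of the header. The following is a rough pre-flight sketch, not Nightingale's own logic: the function names are hypothetical, the format tags are the standard WAVE registry values, and files using the WAVE_FORMAT_EXTENSIBLE tag (0xFFFE) are simply reported as incompatible because this sketch does not decode them. It does not check the sample rate.

```python
import struct

# Standard WAVE format tags for the encodings listed above.
TAG_PCM = 0x0001
TAG_MS_ADPCM = 0x0002
TAG_IEEE_FLOAT = 0x0003
TAG_ALAW = 0x0006
TAG_MULAW = 0x0007
TAG_IMA_ADPCM = 0x0011


def read_wav_format(data):
    """Return (format_tag, channels, sample_rate, bits_per_sample)."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a little-endian RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return tag, channels, rate, bits
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no fmt chunk found")


def looks_nightingale_compatible(data):
    """Check mono-ness and encoding against the lists in this section."""
    tag, channels, _rate, bits = read_wav_format(data)
    if channels != 1:
        return False  # only mono files are supported
    if tag in (TAG_MS_ADPCM, TAG_IMA_ADPCM):
        return True
    allowed_bits = {
        TAG_PCM: {8, 16},
        TAG_IEEE_FLOAT: {32},
        TAG_ALAW: {8},
        TAG_MULAW: {8},
    }
    return bits in allowed_bits.get(tag, set())
```

A file that passes this check can still fail to open for other reasons, so treat the result as a quick filter for large file sets.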

Audio samples buffered into memory

Raw buffered audio samples sent to the OLIVE server for enrollment or scoring are read and processed under the assumption that they are raw 16-bit linear PCM samples at an 8 kHz sampling rate.

Serialized buffered audio files sent to the server are not processed by the client or assumed to be anything specific; rather they are interpreted by the server as a complete (header-intact!) audio file.
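The difference between the two API payload types can be sketched as follows. This is an illustration only and makes no OLIVE API calls: both helper names are hypothetical, and the sketch assumes a mono source and little-endian sample order, neither of which is stated above.

```python
import wave
from pathlib import Path


def raw_pcm_buffer(wav_path):
    """Return header-less sample bytes suitable for a buffered request.

    Validates the properties the server assumes for raw buffers:
    16-bit linear PCM at 8 kHz. Mono is an additional assumption here.
    """
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio (assumption of this sketch)")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != 8000:
            raise ValueError("expected an 8 kHz sampling rate")
        return wav.readframes(wav.getnframes())


def serialized_file(wav_path):
    """Return the whole file, header included, for a serialized request."""
    return Path(wav_path).read_bytes()
```

The raw buffer contains nothing but samples, so any mismatch in rate or bit depth goes undetected by the server, which is why the sketch validates before returning. The serialized bytes begin with the file header, which lets the server determine the format itself.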