OLIVE Supported Audio Formats
Overview
There are five main methods of interacting with the OLIVE system, each with different implications for the supported audio formats. They are as follows:
- NIGHTINGALE (Forensic) GUI – Submitting audio files using the Nightingale GUI, also known as the Forensic GUI.
- OLIVE (Batch) GUI – Submitting large file-based audio processing jobs using the OLIVE Batch GUI, also known as the Batch GUI or SCENIC Batch GUI.
- COMMAND LINE INTERFACE (CLI) TOOLS – Submitting audio through the localenroll, localanalyze, and localtrain command line tools.
- OLIVE API (BUFFERED) – Sending pre-loaded memory buffers of audio samples to the server through the OLIVE API.
- OLIVE API (SERIALIZED) – Sending a serialized object to the server that consists of an entire audio file with its header intact.
These interaction methods fall into three groups, each sharing the same limitations:
- Local file-based processing by server.
  - Dictates compatibility for the OLIVE Batch GUI, the Command Line Interface Tools, and the OLIVE API (Serialized).
- Audio files being opened and processed by Java.
  - Dictates compatibility for the Nightingale GUI.
- Audio samples buffered into memory.
  - Dictates compatibility for the OLIVE API (Buffered).
The limitations of each group are defined below.
Local file-based processing by server
The audio file compatibility for this group of OLIVE interactions is dictated by libsndfile, the library used for reading and writing audio files. Any file submitted to the OLIVE Batch GUI, through the localenroll, localanalyze, and localtrain command line tools, or as a Serialized file through the OLIVE API can be of any audio file format and type supported by libsndfile.
Stereo files are supported, but are merged into a single channel before scoring when submitted through the OLIVE Batch GUI or the Command Line Interface Tools. When files are submitted as serialized objects through the API, there is flexibility in how the channels are processed; please refer to the API Documentation for more details.
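To illustrate what merging a stereo file into a single channel can look like, here is a minimal Python sketch that averages interleaved 16-bit samples. Averaging is an assumption made for illustration only; this document does not state how OLIVE actually combines the channels, and the function name downmix_to_mono is hypothetical.

```python
import array


def downmix_to_mono(interleaved: array.array, channels: int = 2) -> array.array:
    """Average interleaved signed 16-bit samples across channels.

    Assumption: a plain per-frame mean. OLIVE's own merge method is not
    documented here and may differ.
    """
    if interleaved.typecode != "h":
        raise ValueError("expected signed 16-bit samples (typecode 'h')")
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")

    mono = array.array("h")
    for start in range(0, len(interleaved), channels):
        frame = interleaved[start:start + channels]
        # The floored mean of 16-bit values always stays within 16-bit range.
        mono.append(sum(frame) // channels)
    return mono


# Three stereo frames: (left, right) pairs laid out one after another.
stereo = array.array("h", [100, 300, -200, 200, 32767, 32767])
print(list(downmix_to_mono(stereo)))  # [200, 0, 32767]
```

Downmixing on the client side like this is only relevant if you want to control the result yourself; otherwise the Batch GUI and CLI tools perform their own merge as described above.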
Supported audio formats include:
- Microsoft WAV
- SGI/Apple AIFF/AIFC
- Sun AU/Snd
- Raw (headerless)
- Paris Audio File (PAF)
- Commodore IFF/SVX
- Sphere/NIST WAV
- IRCAM SF
- Creative VOC
- SoundForge W64
- GNU Octave MAT4.4
- Portable Voice Format
- Fasttracker 2 XI
- HMM Tool Kit HTK
- Apple CAF
- Sound Designer II SD2
- Free Lossless Audio Codec (FLAC)
Supported encodings vary by format (see the link below for a comprehensive compatibility table). Examples of supported encodings include:
- Unsigned and signed 8, 16, 24 and 32 bit PCM
- IEEE 32 and 64 bit floating point
- U-LAW
- A-LAW
- IMA ADPCM
- MS ADPCM
- GSM 6.10
- G721/723 ADPCM
- 12/16/24 bit DWVW
- OKI Dialogic ADPCM
- 8/16 DPCM
More information on libsndfile supported audio formats can be found here: http://www.mega-nerd.com/libsndfile/#Features
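As a rough pre-submission check, the container of a file can often be guessed from its first few bytes. The Python sketch below recognizes a handful of the formats listed above by their well-known header signatures. It is not libsndfile's detection logic, the function name sniff_container is hypothetical, and a None result does not mean the file is unsupported (raw headerless audio, for example, has no signature at all).

```python
from typing import Optional


def sniff_container(header: bytes) -> Optional[str]:
    """Guess a container format from the leading bytes of a file.

    Covers only a few formats with well-known signatures; this is an
    illustrative helper, not a substitute for libsndfile.
    """
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if len(header) >= 12 and header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return header[8:12].decode("ascii")
    if header[:4] == b".snd":
        return "AU"
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[:4] == b"caff":
        return "CAF"
    if header[:7] == b"NIST_1A":
        return "SPHERE"
    return None  # Unknown or headerless; libsndfile may still handle it.


print(sniff_container(b"RIFF\x24\x00\x00\x00WAVEfmt "))  # WAV
print(sniff_container(b"\x00\x01\x02\x03"))              # None
```

In practice you would pass the first 12 or so bytes read from the file in binary mode.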
Audio files being opened and processed by Java
Any file opened in the Nightingale GUI for close analysis work must be readable by the underlying Java code libraries:
- Java Media Framework
- javax.sound
- FLAC
Only mono files are currently supported.
Supported sample rates include:
- Multiples of 8 kHz, up to 48 kHz
- Multiples of 11.025 kHz, up to 44.1 kHz
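The sample-rate rule above can be expressed as a small check. This Python sketch reads the rule as "integer multiples of the base rate, up to and including the stated ceiling", which is an interpretation of the list rather than something the Nightingale GUI is documented to enforce in this exact way. The function name is_supported_rate is hypothetical.

```python
def is_supported_rate(rate_hz: int) -> bool:
    """Return True if rate_hz is a multiple of 8 kHz (up to 48 kHz)
    or a multiple of 11.025 kHz (up to 44.1 kHz).

    Assumption: the ceilings are inclusive and only integer multiples count.
    """
    families = ((8000, 48000), (11025, 44100))  # (base rate, ceiling) in Hz
    for base, ceiling in families:
        if 0 < rate_hz <= ceiling and rate_hz % base == 0:
            return True
    return False


for rate in (8000, 16000, 22050, 44100, 48000, 12000, 96000):
    print(rate, is_supported_rate(rate))
```

Under this reading, 8000, 16000, 22050, 44100, and 48000 pass, while 12000 and 96000 do not.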
Supported container formats:
- FLAC
- RIFF (.wav)
- AIFF
- AIFC
- AU
Supported encoding formats:
- Compressed:
  - FLAC
- PCM:
  - 16 bit signed int, big or little endian
  - 8 bit signed or unsigned int
  - 32 bit float, little endian only (i.e. RIFF, not RIFX)
  - 8 bit mulaw or alaw
- ADPCM:
  - Microsoft or IMA ADPCM
Audio samples buffered into memory
Raw buffered audio samples sent to the OLIVE server for enrollment or scoring are read and processed under the assumption that they are raw 16-bit linear PCM samples at an 8 kHz sampling rate.
Serialized buffered audio files sent to the server are not processed by the client or assumed to be in any specific format; rather, the server interprets them as a complete audio file with its header intact.
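The difference between the two buffer types can be sketched in Python using only the standard library. The example below builds a small WAV file in memory, then produces (a) a header-free sample buffer after checking that it matches the 16-bit, 8 kHz assumption, and (b) the untouched file bytes that would be sent as a serialized object. No OLIVE API calls are shown. The helper names are hypothetical, the mono check is an added assumption (this section does not say how multi-channel raw buffers are handled), and the byte order is simply whatever the WAV file holds (little-endian), which this section does not specify.

```python
import io
import wave


def make_test_wav(num_samples: int = 80) -> bytes:
    """Build a silent mono, 16-bit, 8 kHz WAV file in memory (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * num_samples)
    return buf.getvalue()


def raw_pcm_buffer(wav_bytes: bytes) -> bytes:
    """Strip the header and return sample bytes for the buffered path.

    Raises ValueError if the audio does not match the format the server
    is described as assuming for raw buffers.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != 8000:
            raise ValueError("expected an 8 kHz sampling rate")
        if wav.getnchannels() != 1:
            # Assumption: treat raw buffers as single-channel.
            raise ValueError("expected mono audio")
        return wav.readframes(wav.getnframes())


serialized = make_test_wav()          # whole file, header intact
raw = raw_pcm_buffer(serialized)      # samples only, no header

print(serialized[:4])  # b'RIFF'
print(len(raw))        # 160
```

The serialized bytes still begin with the RIFF header, so the server can determine the format itself, whereas the raw buffer carries no format information and relies entirely on the stated assumption.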