OLIVE Supported Audio Formats

Overview

There are five main methods of interacting with the OLIVE system, each with different implications for the supported audio formats. They are as follows:

  1. NIGHTINGALE (Forensic) GUI – Submitting audio files using the Nightingale GUI, also known as the Forensic GUI.
  2. OLIVE (Batch) GUI – Submitting audio files using the OLIVE Batch GUI, also known as the Batch GUI or SCENIC Batch GUI, for submitting large file-based audio processing jobs.
  3. COMMAND LINE INTERFACE (CLI) TOOLS – submitting audio through the localenroll, localanalyze, localtrain command line tools.
  4. OLIVE API (BUFFERED) – sending pre-loaded memory buffers of audio samples to the server through the OLIVE API.
  5. OLIVE API (SERIALIZED) – sending a serialized object to the server that consists of an entire audio file with its header intact.

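These interaction methods fall into three groups that share limitations:

  1. Local file-based processing by server – the OLIVE Batch GUI, the Command Line Interface Tools, and the OLIVE API (Serialized).
  2. Audio files being opened and processed by Java – the Nightingale GUI.
  3. Audio samples buffered into memory – the OLIVE API (Buffered).

The limitations of each group are defined below.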

Local file-based processing by server

Audio file compatibility for this group of OLIVE interactions is dictated by the libsndfile library, which is used for reading and writing audio files. Files submitted to the OLIVE Batch GUI, through the localenroll, localanalyze, and localtrain command line tools, or as serialized files through the OLIVE API can be of any audio file format and type supported by libsndfile.

Stereo files are supported, but are merged into a single channel before scoring when submitting to the OLIVE Batch GUI and Command Line Interface Tools. When submitting files as serialized objects through the API, there is flexibility regarding how the channels are processed – please refer to the API Documentation for more details.
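
As an illustration of what merging a stereo file into a single channel can look like, here is a minimal Python sketch. It assumes interleaved signed 16-bit samples and averages the left and right channels. The averaging rule and the function name `downmix_stereo_to_mono` are assumptions made for this example; this document does not specify how OLIVE performs the merge.

```python
import array


def downmix_stereo_to_mono(interleaved: array.array) -> array.array:
    """Average interleaved L/R signed 16-bit samples into one mono channel.

    Hypothetical helper for illustration only; OLIVE's actual merge
    method is not described in this document.
    """
    if interleaved.typecode != "h":
        raise ValueError("expected signed 16-bit samples (typecode 'h')")
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even sample count")

    mono = array.array("h")
    for i in range(0, len(interleaved), 2):
        # Python ints do not overflow, and the mean of two int16 values
        # always fits back into int16.
        mono.append((interleaved[i] + interleaved[i + 1]) // 2)
    return mono


# Three stereo frames: (1000, 3000), (-2000, 2000), (32767, 32767)
stereo = array.array("h", [1000, 3000, -2000, 2000, 32767, 32767])
print(list(downmix_stereo_to_mono(stereo)))  # prints [2000, 0, 32767]
```

If a different channel treatment is needed (for example, scoring each channel separately), the serialized API route described above is the place to look.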

Supported audio formats include:

  • Microsoft WAV
  • SGI/Apple AIFF/AIFC
  • Sun AU/Snd
  • Raw (headerless)
  • Paris Audio File (PAF)
  • Commodore IFF/SVX
  • Sphere/NIST WAV
  • IRCAM SF
  • Creative VOC
  • SoundForge W64
  • GNU Octave MAT4.4
  • Portable Voice Format
  • Fasttracker 2 XI
  • HMM Tool Kit HTK
  • Apple CAF
  • Sound Designer II SD2
  • Free Lossless Audio Codec (FLAC)

Supported encodings vary by format (see the link below for a comprehensive compatibility table). Examples of supported encodings include:

  • Unsigned and signed 8, 16, 24 and 32 bit PCM
  • IEEE 32 and 64 bit floating point
  • U-LAW
  • A-LAW
  • IMA ADPCM
  • MS ADPCM
  • GSM 6.10
  • G721/723 ADPCM
  • 12/16/24 bit DWVW
  • OK Dialogic ADPCM
  • 8/16 DPCM

More information on libsndfile supported audio formats can be found here: http://www.mega-nerd.com/libsndfile/#Features

Audio files being opened and processed by Java

Any file opened in the Nightingale GUI for close analysis work must be readable by the underlying Java code libraries:

  • Java Media Framework
  • javax.sound (Java Sound API)
  • FLAC

Only mono files are currently supported.

Supported sample rates include:

  • Multiples of 8 kHz, up to 48 kHz
  • Multiples of 11.025 kHz, up to 44.1 kHz
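
The two sample rate rules above can be expressed as a short check. This Python sketch assumes that every integer multiple of the base rate, up to and including the stated limit, is accepted; that reading is an assumption, as is the function name `is_supported_sample_rate`.

```python
def is_supported_sample_rate(rate_hz: int) -> bool:
    """Return True if rate_hz matches one of the two listed rate families.

    Assumption: any integer multiple of the base rate, up to and
    including the limit, counts as supported.
    """
    families = (
        (8000, 48000),   # multiples of 8 kHz, up to 48 kHz
        (11025, 44100),  # multiples of 11.025 kHz, up to 44.1 kHz
    )
    for base, limit in families:
        if 0 < rate_hz <= limit and rate_hz % base == 0:
            return True
    return False


for rate in (8000, 22050, 44100, 48000, 12000, 96000):
    print(rate, is_supported_sample_rate(rate))
```

Under this reading, 8000, 22050, 44100 and 48000 pass, while 12000 and 96000 do not.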

Supported container formats:

  • FLAC
  • RIFF (.wav)
  • AIFF
  • AIFC
  • AU
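
A quick way to pre-screen a file against the container list above is to look at its first bytes. The Python sketch below checks the well-known signatures of these five containers. The function name `sniff_container` is hypothetical, and the check says nothing about whether the encoding inside the container is supported.

```python
from typing import Optional


def sniff_container(header: bytes) -> Optional[str]:
    """Guess the container format from the first 12 bytes of a file.

    Hypothetical pre-screening helper; it inspects signatures only and
    does not validate the encoding, channel count, or sample rate.
    RIFX (big-endian WAV) is not handled in this sketch.
    """
    magic = header[:4]
    form_type = header[8:12]
    if magic == b"fLaC":
        return "FLAC"
    if magic == b"RIFF" and form_type == b"WAVE":
        return "RIFF (.wav)"
    if magic == b"FORM" and form_type == b"AIFF":
        return "AIFF"
    if magic == b"FORM" and form_type == b"AIFC":
        return "AIFC"
    if magic == b".snd":
        return "AU"
    return None


# The four bytes after "RIFF"/"FORM" hold a chunk size; zeros suffice here.
print(sniff_container(b"RIFF\x00\x00\x00\x00WAVE"))  # prints RIFF (.wav)
print(sniff_container(b"OggS\x00\x02\x00\x00\x00\x00\x00\x00"))  # prints None
```

In practice the header would come from reading the first 12 bytes of the file in binary mode.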

Supported encoding formats:

  • Compressed:
    • FLAC
  • PCM:
    • 16 bit signed int, big or little endian
    • 8 bit signed or unsigned int
    • 32 bit float, little endian only (i.e. RIFF, not RIFX)
    • 8 bit mulaw or alaw
  • ADPCM:
    • Microsoft or IMA ADPCM

Audio samples buffered into memory

Raw buffered audio samples sent to the OLIVE server for enrollment or scoring are read and processed under the assumption that they are raw 16-bit linear PCM samples at an 8 kHz sampling rate.

Serialized audio files sent to the server are not decoded or otherwise interpreted by the client, and no particular format is assumed; rather, the server interprets each one as a complete (header-intact!) audio file.
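
To make the difference between the two API payloads concrete, the following Python sketch uses only the standard library `wave` module. It builds a small mono, 16-bit, 8 kHz WAV in memory, then produces (a) the headerless sample bytes matching the buffered assumption and (b) the untouched file bytes suitable for the serialized route. The helper names are hypothetical, no OLIVE API calls are shown, and the little-endian byte order of the raw buffer is an assumption carried over from the WAV format, since this document does not state the expected byte order.

```python
import io
import struct
import wave


def make_demo_wav(samples, rate_hz=8000) -> bytes:
    """Build a mono 16-bit PCM WAV file in memory (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(rate_hz)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def wav_to_raw_buffer(wav_bytes: bytes) -> bytes:
    """Return headerless 16-bit PCM bytes from a mono 8 kHz WAV.

    Raises ValueError if the file does not match the format the
    buffered route assumes (mono, 16-bit, 8 kHz).
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        if wf.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wf.getframerate() != 8000:
            raise ValueError("expected an 8 kHz sampling rate")
        return wf.readframes(wf.getnframes())


serialized = make_demo_wav([0, 1000, -1000])  # whole file, header intact
raw = wav_to_raw_buffer(serialized)           # samples only, no header

print(serialized[:4])  # prints b'RIFF'
print(len(raw))        # prints 6
```

The explicit checks matter because a raw buffer carries no header: if the samples are not actually 16-bit PCM at 8 kHz, nothing downstream can detect the mismatch.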