gdd-embedplda-v1 (Gender Detection)
Version Changelog
Plugin Version | Change |
---|---|
v1.0.0 | Initial plugin release with OLIVE 5.2.0 |
Description
Gender Detection (GDD) plugins detect and label the gender of the speaker for regions of speech in a submitted audio segment. This is in contrast to Gender Identification (GID) plugins, which label the entire segment with a single gender. Unlike GID, GDD can therefore handle audio in which multiple speakers of different genders are speaking, and it provides timestamped region labels that mark the male and female regions.
Domains
- multi-v1
- Generic domain for most close-talking conditions with a signal-to-noise ratio above 10 dB.
Inputs
Audio file or buffer and an optional identifier.
Outputs
GDD plugins return a list of regions with a score for the detected gender within that region. The starting and stopping boundaries are denoted in seconds. As with LID, scores are log-likelihood ratios, where a score greater than the default threshold of "0" is considered to be a detection.
An example output excerpt:
input-audio.wav 0.000 41.500 Female 5.40590000
input-audio.wav 43.500 77.500 Female 5.29558277
input-audio.wav 77.500 78.500 Male 2.63369179
input-audio.wav 78.500 80.500 Male 2.25519705
input-audio.wav 85.500 86.500 Female 2.06612849
input-audio.wav 97.500 98.500 Female 3.74665093
input-audio.wav 98.500 99.500 Male 2.22936487
input-audio.wav 105.500 106.500 Male 2.72254372
input-audio.wav 107.500 108.500 Female 2.60355234
input-audio.wav 108.500 110.500 Female 2.76414633
input-audio.wav 109.500 113.140 Male 2.85003138
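The region lines above can be read into structured records with a few lines of Python. This is a minimal sketch, assuming each line holds five whitespace-separated fields in the order shown in the excerpt (file identifier, start, end, label, score) and that the identifier contains no spaces. The names `Region` and `parse_regions` are illustrative and are not part of the OLIVE API, and the cutoff of 2.5 is an arbitrary value chosen only to show filtering by score.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    audio_id: str
    start: float  # region start, in seconds
    end: float    # region end, in seconds
    label: str    # detected gender label, e.g. "Male" or "Female"
    score: float  # log-likelihood ratio for the label


def parse_regions(text: str) -> List[Region]:
    """Parse lines of the form: <id> <start> <end> <label> <score>."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        audio_id, start, end, label, score = fields
        regions.append(
            Region(audio_id, float(start), float(end), label, float(score))
        )
    return regions


sample = """\
input-audio.wav 0.000 41.500 Female 5.40590000
input-audio.wav 77.500 78.500 Male 2.63369179
input-audio.wav 85.500 86.500 Female 2.06612849
"""

regions = parse_regions(sample)

# Keep only regions scoring above an illustrative cutoff (hypothetical value).
confident = [r for r in regions if r.score > 2.5]
```

Filtering after the fact like this only narrows the regions the plugin already returned; it cannot recover regions that fell below the plugin's own detection threshold.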
Functionality (Traits)
The functions of this plugin are defined by its Traits and implemented API messages. These Traits are listed below, along with the corresponding API messages for each. Click a message name to go to additional implementation details.
- REGION_SCORER – Score all submitted audio, returning labeled regions within the submitted audio where each region includes a detected gender and corresponding score for this gender
Compatibility
OLIVE 5.2+
Limitations
Known or potential limitations of the plugin are outlined below.
Minimum Speech Duration
The system will only attempt to perform gender detection if the submitted audio segment contains more than 2 seconds of detected speech.
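The check described here can be pictured as a simple gate on the total amount of detected speech. The sketch below is an illustration only and not the plugin's actual implementation: it assumes speech regions arrive as `(start, end)` pairs in seconds, that their durations are summed across the whole submission, and that "more than 2 seconds" is a strict comparison. Both function names are hypothetical.

```python
def total_speech_seconds(speech_regions):
    """Sum the durations of (start, end) speech regions, in seconds."""
    return sum(end - start for start, end in speech_regions)


def has_enough_speech(speech_regions, min_speech=2.0):
    """Return True only when detected speech exceeds min_speech seconds."""
    # Assumption: the limit is strict, so exactly 2.0 seconds is not enough.
    return total_speech_seconds(speech_regions) > min_speech
```

For example, two speech regions of 1.5 and 1.0 seconds pass the gate together (2.5 seconds in total), while a single 2.0-second region does not.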
Comments
Global Options
The following options are available to this plugin and are adjustable in the plugin's configuration file, plugin_config.py.
Option Name | Description | Default | Expected Range |
---|---|---|---|
threshold | Detection threshold: a higher value results in fewer detections being output, but with higher reliability. | 1.5 | -10.0 to 10.0 |
min_speech | The minimum amount of speech, in seconds, that a segment must contain in order to be scored/analyzed for gender. | 2.0 | 1.0 to 4.0 |
sad_threshold | SAD threshold for determining the audio to be used in metadata extraction. | 1.0 | -5.0 to 6.0 |
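Before editing these values, it can help to check a proposed setting against the expected ranges. The sketch below is a standalone helper and does not reflect the actual layout of plugin_config.py, which is not shown here. The `OPTIONS` dictionary and `check_options` function are hypothetical names, and the defaults and ranges are copied from the table.

```python
# (default, lowest expected value, highest expected value) for each option,
# taken from the Global Options table.
OPTIONS = {
    "threshold": (1.5, -10.0, 10.0),
    "min_speech": (2.0, 1.0, 4.0),
    "sad_threshold": (1.0, -5.0, 6.0),
}


def check_options(overrides):
    """Merge overrides onto the defaults and reject out-of-range values."""
    merged = {name: default for name, (default, _, _) in OPTIONS.items()}
    for name, value in overrides.items():
        if name not in OPTIONS:
            raise KeyError(f"unknown option: {name}")
        _, low, high = OPTIONS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} to {high}")
        merged[name] = value
    return merged
```

Calling `check_options({"min_speech": 3.0})` returns the full set of options with only `min_speech` changed, while a value such as `threshold=11.0` raises `ValueError`.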