OLIVE Hardware Requirements
Processor Hardware Restrictions
OLIVE is currently built, tested, and supported only on Intel x86_64 processor hardware. This includes consumer Core i-series processors such as the i7 and i9, as well as the server-line Xeon processors.
ARM processors are currently not supported; this includes the new M1 and M2 chips from Apple.
Some plugins have stricter requirements. The most notable at this point is avx2 support from the CPU, which is required by our Neural Machine Translation plugin and by most of the low-resource "SmOlive" plugins that use quantized models. The "SmOlive" plugins typically have a "smart" domain that can back off to a less resource-streamlined set of models if avx2 support is not detected and loading the primary quantized model(s) fails, but there is no back-off for Machine Translation. Processors newer than roughly 2015 typically have this support, so it should only come into consideration in rare circumstances.
Plugins currently requiring avx2 support:
- tmt-neural-v1
- sad-dnnSmolive-v1 (Will back off to non-quantized models on unsupported hardware)
- sid-embedSmolive-v1 (Will back off to non-quantized models on unsupported hardware)
- lid-embedSmolive-v1 (Will back off to non-quantized models on unsupported hardware)
- sdd-diarizeEmbedSmolive-v1 (Will back off to non-quantized models on unsupported hardware)
- asr-end2end-v1
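One way to check for avx2 support before deploying the plugins above is to look at the CPU feature flags the Linux kernel reports. The following is a minimal sketch, not part of OLIVE: the function names are hypothetical, and it assumes a Linux host where /proc/cpuinfo is readable.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" entry is enough; all cores normally match.
            return set(value.split())
    return set()


def supports_avx2(cpuinfo_text: str) -> bool:
    """True if the avx2 flag is present in the given cpuinfo text."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("avx2 supported:", supports_avx2(path.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```

If the check reports no avx2 support, the plugins without a back-off (tmt-neural-v1 and asr-end2end-v1 in the list above) should be expected not to run on that machine.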
GPU Requirements
Some plugins can now perform some operations on compatible GPU hardware, if available, to take advantage of the GPU architecture and increase processing speed. To use this capability, at least one Nvidia GPU must be properly installed on the system running the OLIVE software. This GPU must support CUDA 11.6 or newer and meet the following minimum installed driver requirements:
- Linux x86_64: 450.80.02 or newer
- Windows: 452.39 or newer
Refer to Nvidia for drivers and more information about CUDA and GPU driver compatibility.
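The driver minimums above can be checked with a short script. This is a sketch under stated assumptions rather than an OLIVE tool: the helper names are hypothetical, it relies on the `nvidia-smi` utility that ships with the Nvidia driver being on the PATH, and it only compares driver versions; it does not verify CUDA support.

```python
import shutil
import subprocess

# Minimum driver versions quoted in the requirements above.
MIN_DRIVER = {"linux": "450.80.02", "windows": "452.39"}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '450.80.02' into (450, 80, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed: str, platform: str) -> bool:
    """Compare an installed driver version against the documented minimum."""
    return parse_version(installed) >= parse_version(MIN_DRIVER[platform])


def installed_driver_versions() -> list[str]:
    """Ask nvidia-smi for each GPU's driver version; empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    versions = installed_driver_versions()
    if not versions:
        print("No Nvidia driver detected (nvidia-smi missing or no GPU)")
    for version in versions:
        print(version, "ok" if driver_meets_minimum(version, "linux") else "too old")
```

Versions are compared as tuples of integers rather than as strings, so that, for example, 450.80.02 is correctly treated as newer than 450.9.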
GPU Configuration
For information on GPU configuration, please refer to the appropriate documentation page.
Speed and Memory Requirements
Notes and Disclaimer for the Resource Requirement Estimates Provided
A few performance-related things that might be important to note:
- The estimates provided below were recorded on a native Linux installation of OLIVE 5.1.0; results may differ slightly when running the equivalent job on Windows or in a Docker-based environment. Note that OLIVE 5.2.0 and 5.3.0 have memory improvements that are not yet reflected here; these statistics will be updated when new results are available.
- Any speed estimates we give are hardware dependent. The numbers reported below should be pessimistic, as they are limited to running on a single core of a low-power computer. Stronger cores will be faster than what's reported, and weaker CPU cores will run a bit slower. If you have more than one processor core available, which is likely, OLIVE is able to parallelize and run more jobs simultaneously, so the speed should scale accordingly - but we're reporting single-core jobs to keep everything on the same relative scale so that you can compare plugins to each other.
- Just as speed will increase/scale as the number of processor cores being used increases (i.e. number of simultaneous jobs), so will memory usage. The provided stats are for a process limited to one job at a time.
- Memory usage scales, sometimes significantly, with the size of the input audio files. A plugin like SAD has a very small base memory footprint that barely increases even when processing many, many small files, but it can show much larger memory utilization if you start running 1 GB+ audio files through it.
- ASR performance is largely domain-dependent - for example, the Russian domain currently has a much larger language model than other domains because of how the language is structured and the priorities of the project that funded its development, so its memory usage is quite significant: ~9+ GB per processor core. Another thing to note for ASR speed is that, because the models can be so sizable, the overhead of loading them into memory can really come into play. It may take some time to get an initial response back from the server due to this overhead, but subsequent responses should be much faster as this 'heavy lifting' is already done. Note that if you're just running the CLI tools like localanalyze, this loading must be performed every time, so you won't realize this speedup unless you're running with the OLIVE server. If you are using the OLIVE server, it's possible to send a 'preload' request to load a plugin's models before any audio is submitted for scoring and avoid this initial delay.
- The models for each ASR/TPD domain/language are disjoint and take up separate memory footprints. When running the OLIVE server, models are retained in memory once loaded so that later jobs can be processed quickly, so if you run data through both the Russian and English ASR domains you may quickly run out of memory. For example, if you run a Russian job, a minimum of ~9 GB of memory will be used. If you run an English job shortly after, this will load another ~6 GB or so worth of models into memory. If you have insufficient memory, you will need to either explicitly unload plugins/domains using API calls (not a feature currently offered by our GUI) or restart the server to clear out the loaded models.
- QBE performance will depend on how many queries/keywords are currently enrolled - as more queries are enrolled and need to be considered during the search, the speed of the plugin will decrease.
- Some of the statistics below may be extra pessimistic because some of these readings will depend on how much of the input audio actually contains speech. If you have a 3 hour file, but only 5 minutes of it is speech, many of these plugins can be much, much faster and use less memory, because the task-specific processing (LID, for example) will only process audio that is identified as speech, and so will be operating on a much smaller piece of the audio than the whole file. The audio used to generate these numbers is pretty packed with speech, so should be close to a 'worst case.'
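The memory notes above lend themselves to a rough planning calculation. The sketch below is only an illustration with hypothetical function names: it assumes the footprints of loaded domains simply add up and, pessimistically, that each simultaneous job needs its own full copy. The ~9 GB and ~6 GB figures are the approximate Russian and English ASR values mentioned above, not measurements.

```python
def loaded_model_memory_gb(domain_footprints_gb: dict[str, float],
                           simultaneous_jobs: int = 1) -> float:
    """Rough upper bound on memory held by loaded domains, in GB.

    Assumes footprints are additive across domains and scale linearly
    with the number of simultaneous jobs (a planning assumption only).
    """
    if simultaneous_jobs < 1:
        raise ValueError("simultaneous_jobs must be at least 1")
    return sum(domain_footprints_gb.values()) * simultaneous_jobs


def fits_in_memory(domain_footprints_gb: dict[str, float],
                   available_gb: float,
                   simultaneous_jobs: int = 1) -> bool:
    """True if the estimated footprint fits in the available memory."""
    needed = loaded_model_memory_gb(domain_footprints_gb, simultaneous_jobs)
    return needed <= available_gb


if __name__ == "__main__":
    # Approximate figures taken from the notes above.
    domains = {"russian-asr": 9.0, "english-asr": 6.0}
    print("Estimated GB:", loaded_model_memory_gb(domains))
    print("Fits in 16 GB:", fits_in_memory(domains, available_gb=16.0))
    print("Fits in 12 GB:", fits_in_memory(domains, available_gb=12.0))
```

If the estimate exceeds the memory available, the options are the ones described above: unload domains through the API or restart the server.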
Plugin Resource Requirement Estimates
With that out of the way, here is a summary for most of the plugins:
plugin / domain | speed (x real time) | mem (1 min file) | mem (2 hr file) |
---|---|---|---|
sad-dnn-v7.0.1 / fast-multi-v1 | 214.1 | 105 MB | 766 MB |
sad-dnn-v7.0.1 / multi-v1 | 90.6 | 127 MB | 775 MB |
gid-gb-v2.0.0 / clean-v1 | 354.9 | 161 MB | 1.56 GB |
ldd-sbcEmbed-v1.0.1 / multi-v1 | 18.5 | 582 MB | 5.04 GB |
lid-embedplda-v2.0.1 / multi-v1 | 29.2 | 660 MB | 3.15 GB |
qbe-tdnn-v5.0.0 / multi-v1* | 28.3 | 198 MB | 2.55 GB |
sdd-sbcEmbed-v2.0.2 / telClosetalk-v1 | 42.8 | 232 MB | 1.12 GB |
sid-dplda-v2.0.1 / multi-v1 | 42.2 | 296 MB | 2.38 GB |
asr-dynapy-v2.0.2 / rus-tdnnChain-tel-v1** | 10.6 | 8.94 GB | 10.19 GB |
tpd-dynapy-v3.0.0 / rus-cts-v1** | 6.3 | 7.46 GB | 8.66 GB |
* QBE Note: with 3 enrolled keywords
** ASR/TPD Note: the mem (2 hr) and speed statistics are generated from different data than the other plugins. These plugins are language-dependent, and the data used for the rest of the tests does not match the language of the domains run here. Feeding mismatched data into these plugins can cause both runtime and memory usage to balloon, as the plugin tries very hard to make sense of something that it has never seen before. Instead, 100 files adding up to 2.5 hrs were used for the speed test, and one 2-hr file, different from the one used for the rest of the plugins, was used for the mem (2 hr) test.
Note also that the Russian models are by far the largest delivered - this will change depending on which language/domain you are using, but these should represent a 'worst case' for ASR/TPD for planning purposes.
Speed numbers are reported in terms of "times faster than real time", and the numbers were reached by scoring 90 files adding up to approximately 5 hrs of data on a single core of a circa-2016 Gigabyte BRIX Compact PC (Intel i7-5500U 2.40 GHz processor). Higher is better, so for the slower SAD domain, which scores roughly 90 here, that means it can process a 90 second input file in 1 second.
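To turn a speed figure from the table into an expected processing time, divide the audio duration by the "times faster than real time" value. The function below is a minimal sketch with a hypothetical name; the optional `cores` parameter assumes ideal scaling across simultaneous jobs, which real systems will only approximate.

```python
def processing_seconds(audio_seconds: float,
                       speed_factor: float,
                       cores: int = 1) -> float:
    """Estimate processing time from a 'times faster than real time' figure.

    Assumes throughput scales linearly with the number of cores used,
    which is an idealization for planning purposes only.
    """
    if speed_factor <= 0 or cores < 1:
        raise ValueError("speed_factor must be positive and cores at least 1")
    return audio_seconds / (speed_factor * cores)


if __name__ == "__main__":
    # sad-dnn-v7.0.1 / multi-v1 is listed at 90.6x real time.
    print(processing_seconds(90.6, 90.6))
    # Five hours of audio through a domain listed at 10.6x real time.
    minutes = processing_seconds(5 * 3600, 10.6) / 60
    print(f"{minutes:.1f} minutes")
```

Because the table figures were recorded on a single low-power core, estimates made this way should tend toward the pessimistic side on stronger hardware.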
Two memory points are provided for each plugin - the memory used to score a single 1 minute file, which should show roughly the baseline usage of the plugin, as well as memory used to score a single 2 hour file, to give a sense of how the usage scales as files grow. Lower is better.