GPU-Capable OLIVE Plugin / Domain Configuration
Introduction
With the release of OLIVE 5.5.0, certain plugins can leverage GPU hardware to speed up certain operations and algorithms. This allows us either to use advanced technologies that were previously infeasible to deploy without GPU acceleration, or to increase the speed of existing technologies, sometimes dramatically.
Currently supported GPU plugins
These plugins currently support GPU operation when configured as outlined below:
- sad-dnn-v8.0.0+ (Speech Activity Detection)
- sid-dplda-v4.0.0+ (Speaker Identification)
- lid-embedplda-v4.0.0+ (Language Identification)
- tmt-neural-v1.1.0+ (Text Machine Translation)
- asr-end2end-v1.0.0+ (Speech Recognition)
All are capable of running on CPU as well, though in some cases (notably asr-end2end-v1) at a drastically reduced speed.
GPU Configuration Requirements
In order to use GPUs with an OLIVE software instance, three things need to happen:
- The hardware OLIVE is being installed and run on must have a compatible GPU or GPUs installed. Currently, OLIVE can only use Nvidia GPUs with CUDA cores and properly installed Nvidia drivers. If applicable, you will also need the Nvidia Docker toolkit. Refer to Nvidia's documentation for installation of these.
- OLIVE itself must be configured and launched so that it has access to these GPUs. Refer to the documentation specific to your delivery type for information on how to do this. The most likely appropriate reference for this is the Martini Startup Instructions.
- Each domain of each plugin that you wish to run on the GPU must be configured to specify this information. This is covered below.
Plugin Configuration for GPU Use
Enable GPU usage for a plugin's domain
To allow a plugin to run on an available GPU, it is crucial that the plugin:
- Has each desired domain configured to choose a GPU device in its meta.conf file
- Only has domains configured to select GPU device(s) that are properly exposed to the server by configuring/launching the server appropriately
Plugin / Domain meta.conf location
The exact location of your delivered OLIVE plugins may depend on your delivery method; refer to the installation instructions for your package for exact details. Most of the time, they can be found in olive5.5.2/oliveAppData/plugins/. The file you must edit as outlined below can be found inside each domain of the plugin; domains are located inside the plugin's domains/ folder:
olive5.5.2/oliveAppData/plugins/<plugin name>/domains/<domain name(s)>/meta.conf
For example, the sad-dnn-v8.0.0 multi-v1 domain meta.conf file can be found in:
olive5.5.2/oliveAppData/plugins/sad-dnn-v8.0.0/domains/multi-v1/meta.conf
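The directory layout above can be walked programmatically to find every domain's meta.conf. The following is a minimal sketch, assuming the plugins folder follows the `<plugin name>/domains/<domain name>/meta.conf` layout described here; the function name `find_domain_meta_confs` is hypothetical and not part of OLIVE.

```python
from pathlib import Path


def find_domain_meta_confs(plugins_root):
    """Return the sorted meta.conf paths for every domain of every plugin.

    Assumes the layout <plugin name>/domains/<domain name>/meta.conf
    underneath plugins_root (hypothetical helper, not an OLIVE API).
    """
    root = Path(plugins_root)
    return sorted(root.glob("*/domains/*/meta.conf"))


if __name__ == "__main__":
    # The root below is the typical location named in the text; adjust it
    # to match your delivery.
    for path in find_domain_meta_confs("olive5.5.2/oliveAppData/plugins"):
        print(path)
```

Running this against your own plugins folder gives a quick inventory of the files that may need editing.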
Configuring a plugin to use a GPU is done at the domain level, by changing the device variable assignment within the domain's meta.conf file. By default, most plugins have this variable assigned to cpu, and as such the domain will run on CPU only, even when running within a GPU-enabled OLIVE server. To assign a domain access to a GPU, change this device assignment from cpu to gpuN, where N is the device ID of the GPU to run on, as reported by the NVIDIA System Management Interface, nvidia-smi (see Nvidia's nvidia-smi documentation for more information). As already discussed, it is critical that any GPU device assigned to the domain is exposed to the OLIVE server that will be running this domain. These device IDs can be verified by checking the top left entry for each GPU in the nvidia-smi output; an example is shown below.
nvidia-smi Example Output
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.46       Driver Version: 495.46       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0 Off |                  N/A |
| 20%   28C    P0    71W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
| 20%   28C    P0    73W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:08:00.0 Off |                  N/A |
| 19%   28C    P0    73W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:09:00.0 Off |                  N/A |
| 19%   28C    P0    73W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce ...  Off  | 00000000:84:00.0 Off |                  N/A |
| 18%   26C    P0    71W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA GeForce ...  Off  | 00000000:85:00.0 Off |                  N/A |
| 19%   27C    P0    71W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA GeForce ...  Off  | 00000000:88:00.0 Off |                  N/A |
| 18%   28C    P0    71W / 250W |      0MiB / 12212MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA GeForce ...  Off  | 00000000:89:00.0 Off |                  N/A |
| 17%   27C    P0    73W / 250W |      0MiB / 12212MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
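The device IDs in the table can also be read in a script-friendly form. The sketch below assumes nvidia-smi is on the PATH and uses its `--query-gpu=index,name --format=csv,noheader` options; the helper names `parse_gpu_list` and `list_gpus` are hypothetical, and the mapping to gpuN strings simply follows the naming described in this section.

```python
import subprocess


def parse_gpu_list(csv_text):
    """Map 'gpuN' device strings to GPU names.

    Expects one "index, name" pair per line, as printed by
    nvidia-smi --query-gpu=index,name --format=csv,noheader.
    """
    devices = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, name = line.split(",", 1)
        devices[f"gpu{int(index)}"] = name.strip()
    return devices


def list_gpus():
    """Run nvidia-smi and return the parsed device map.

    Requires Nvidia drivers and nvidia-smi on this machine.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_list(result.stdout)
```

Calling `list_gpus()` on the GPU host returns a dictionary whose keys are the values that may be used for the device assignment, provided those GPUs are also exposed to the OLIVE server.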
Most hardware setups will only have a single GPU available, so enabling GPU use is simply a matter of replacing cpu with gpu0.
As an example, the meta.conf for the english-v1 domain of asr-end2end-v1.0.0 in its off-the-shelf format is shown below:
label: english-v1
description: Large vocabulary English wav2vec2 model for both 8K and 16K data
resample_rate: 8000
language: English
device: cpu
To instead configure this domain to run on the GPU with ID 0, as it is when delivered with OLIVE 5.5.2, the meta.conf becomes:
label: english-v1
description: Large vocabulary English wav2vec2 model for both 8K and 16K data
resample_rate: 8000
language: English
device: gpu0
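The edit shown above can be scripted when many domains need changing. This is a minimal sketch, assuming meta.conf consists of simple `key: value` lines with exactly one `device:` entry, as in the examples here; `set_device` is a hypothetical helper, not an OLIVE tool.

```python
import re

# Matches a whole "device: <value>" line, capturing everything up to the value.
_DEVICE_LINE = re.compile(r"^([ \t]*device[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_device(meta_conf_text, device):
    """Return meta.conf text with its 'device:' line set to `device`.

    `device` must be 'cpu' or 'gpuN' (N = GPU device ID), per the text.
    Raises ValueError if the file does not contain exactly one device line.
    """
    if not re.fullmatch(r"cpu|gpu\d+", device):
        raise ValueError(f"unexpected device value: {device!r}")
    new_text, count = _DEVICE_LINE.subn(
        lambda match: match.group(1) + device, meta_conf_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one 'device:' line, found {count}")
    return new_text
```

A caller would read the file, pass its contents through `set_device(text, "gpu0")`, and write the result back, ideally after backing up the original meta.conf.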
Note
Each domain must be configured separately. If one domain within a plugin is configured for GPU, the others don't automatically start using a GPU. By default they will still run on CPU, which is significantly slower for most GPU-capable plugins. By extension, it is not necessary for all domains of a plugin to run on the same GPU.
Having this device assignment at the domain level allows domains to be distributed across multiple GPUs. For example, if multiple GPUs are available on a system, lower-memory-usage plugins like SAD, SID, and LID may share a single GPU, while the language domains of heavier plugins like ASR or MT can be assigned across multiple GPUs. This spreads the memory load to minimize the chance of exhausting GPU memory, and also avoids time lost to frequently loading and unloading models.
Building off of the example above, if we wanted to run the russian-v1 domain of the same ASR plugin on GPU device #2, so that english-v1 and russian-v1 don't compete for GPU memory, the russian-v1 domain's meta.conf would look like this:
label: russian-v1
description: Large vocabulary Russian wav2vec2 model for both 8K and 16K data
resample_rate: 8000
language: Russian
device: gpu2
The exact ideal configuration may vary greatly depending on specific customer use case and mission needs.
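When domains are spread across several GPUs, it can help to see the current assignments at a glance. The sketch below groups domains by their configured device, again assuming the `<plugin name>/domains/<domain name>/meta.conf` layout and simple `key: value` lines; `read_device` and `domains_by_device` are hypothetical names.

```python
from collections import defaultdict
from pathlib import Path


def read_device(meta_conf_path):
    """Return the 'device' value from a meta.conf, or None if it is absent."""
    for line in Path(meta_conf_path).read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "device":
            return value.strip()
    return None


def domains_by_device(plugins_root):
    """Group '<plugin>/<domain>' names by the device each is assigned to."""
    groups = defaultdict(list)
    for conf in sorted(Path(plugins_root).glob("*/domains/*/meta.conf")):
        # Path ends in: <plugin>/domains/<domain>/meta.conf
        plugin, domain = conf.parts[-4], conf.parts[-2]
        groups[read_device(conf)].append(f"{plugin}/{domain}")
    return dict(groups)
```

The result maps each device string (for example cpu or gpu2) to the domains assigned to it, which makes it easier to spot a GPU that is carrying too many heavy domains.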
OLIVE GPU Restrictions and Configuration Notes
GPU Compute Mode: "Default" vs "Exclusive"
The OLIVE software currently assumes that any available GPUs are in "Default" mode. In testing, some OLIVE server worker counts have been found to cause unexpected issues when the GPUs are running in "Exclusive" mode. If possible, please configure the GPUs that OLIVE will be using in "Default" mode. If this is not possible, please ensure that the number of workers for the GPU-enabled server is set to 1. This is configured in the provided docker-compose.yml file.
To check the mode of your GPU, you can view the Compute Mode field of the nvidia-smi command output. Please refer to Nvidia's instructions for the Nvidia Control Panel or the usage instructions for nvidia-smi for more information on how to set these modes.
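The compute mode check can also be automated. The sketch below assumes nvidia-smi is on the PATH and that its `compute_mode` query field reports "Default" for GPUs in Default mode; the exact strings printed for other modes (such as "Exclusive_Process") are an assumption here, and the helper names are hypothetical.

```python
import subprocess


def non_default_gpus(csv_text):
    """Return {gpu_index: mode} for GPUs whose compute mode is not 'Default'.

    Expects one "index, compute_mode" pair per line, as printed by
    nvidia-smi --query-gpu=index,compute_mode --format=csv,noheader.
    """
    flagged = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, mode = (field.strip() for field in line.split(",", 1))
        if mode != "Default":
            flagged[int(index)] = mode
    return flagged


def query_compute_modes():
    """Run nvidia-smi and return its raw CSV output (requires Nvidia drivers)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

If `non_default_gpus(query_compute_modes())` returns a non-empty dictionary, either switch those GPUs to Default mode or set the server's worker count to 1, as described above.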