How it works

Architecture overview

At a high level, ViveEngine consists of three major components:

  • Gateway is the entry point and router for everything else. It exposes the control interface (gRPC/REST), wires components together, and handles multi-seat support.
  • Orchestrator runs on behalf of logged-in users. It manages audio pipelines: it chains effects, controls routing, and loads plugins.
  • Driver injects into the OS audio system. It intercepts and reroutes audio streams at the engine's request.

From the app's perspective, these are just implementation details. The app talks to the engine as a whole via its public API, and the engine's internal structure is not exposed.

Platform differences

|  | Windows | macOS |
|---|---|---|
| Driver type | Audio Processing Object (APO) | Virtual audio device (HAL plugin) |
| How it works | Hooks into Windows audio pipeline as a per-device or per-app effect | Creates a virtual device that becomes system default; audio flows through shared memory to the engine |
| Where processing happens | Inside the audio server process | In the orchestrator process |
| Latency overhead | Near-zero | Few milliseconds |

Built-in effects

ViveEngine includes several commonly used audio processing effects out of the box. You can combine them with each other or with your own processing, depending on your needs.

Audio enhancements

| Effect | Description |
|---|---|
| Amplifier (AMP) | Per-channel volume adjustment |
| Equalizer (EQ) | Parametric equalizer |
| Noise Reduction (NR) | Suppresses background noise. Supports neural network-based suppression and classic spectral subtraction |
| Dynamic Range Compression (DRC) | Levels out volume differences, boosts quiet sounds |

Live captions

ViveEngine can transcribe audio streams in real time via AWS Transcribe.

It supports 12 languages with automatic language detection.

Plugin system

ViveEngine supports custom plugins to extend audio processing beyond the built-in effects.

Plugins are injected into app or device processing pipelines. You can chain multiple plugins together and combine them with built-in effects. Different pipelines can have different configurations.
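
The chaining idea can be sketched in a few lines of Python. This is only an illustration of the concept, not ViveEngine's API: the `Stage` type, `make_gain`, `hard_clip`, `run_pipeline`, and both pipeline names are hypothetical, and real pipelines are configured through the control interface rather than built in application code like this.

```python
from typing import Callable, List, Sequence

# A stage takes one block of samples and returns a processed block.
Stage = Callable[[List[float]], List[float]]


def make_gain(factor: float) -> Stage:
    """Hypothetical stand-in for a built-in effect such as the amplifier."""
    def apply(block: List[float]) -> List[float]:
        return [s * factor for s in block]
    return apply


def hard_clip(block: List[float]) -> List[float]:
    """Hypothetical stand-in for a custom plugin: clamp samples to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s)) for s in block]


def run_pipeline(stages: Sequence[Stage], block: List[float]) -> List[float]:
    """Pass the block through every stage in order."""
    for stage in stages:
        block = stage(block)
    return block


# Two pipelines with different configurations (names are made up).
mic_pipeline = [make_gain(2.0), hard_clip]
speaker_pipeline = [make_gain(0.5)]

mic_out = run_pipeline(mic_pipeline, [0.25, -0.75, 0.5])
speaker_out = run_pipeline(speaker_pipeline, [0.25, -0.75, 0.5])
```

Each pipeline is an ordered list of stages, so a built-in effect and a custom plugin are interchangeable from the pipeline's point of view.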

Native plugins (in-process)

Native plugins are loaded directly into the engine process. They're the fastest option — zero additional latency beyond your processing time.

| Aspect | Details |
|---|---|
| Language | C/C++ or any other natively compiled language with a C ABI |
| Communication | C ABI |
| Latency | Zero overhead |
| Use case | Real-time effects like noise suppression or voice transformation |

Native plugins are loaded into the driver or orchestrator address space (depending on platform) and therefore have stricter requirements for stability and performance.
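
To make the "C ABI" boundary more concrete, here is a rough sketch using Python's `ctypes`. The callback signature (a pointer to float samples plus a frame count, returning a status code) is an assumption made for illustration; it is not ViveEngine's actual plugin interface, and a real native plugin would be a compiled library rather than a Python function.

```python
import ctypes

# Assumed C signature, for illustration only:
#   int process(float *samples, size_t frame_count);
ProcessFn = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_float), ctypes.c_size_t
)


def _halve(samples, frame_count):
    """Process the buffer in place, as an in-process plugin typically would."""
    for i in range(frame_count):
        samples[i] *= 0.5
    return 0  # 0 means success in this sketch


# Wrap the Python function so it is callable through the C calling convention.
process = ProcessFn(_halve)

# The host owns the buffer and hands the plugin a raw pointer to it.
buf = (ctypes.c_float * 4)(1.0, -1.0, 0.5, 0.25)
status = process(buf, len(buf))
```

Because the plugin works directly on the host's buffer, no copy crosses the boundary. For the same reason, a bug in the plugin can corrupt or crash the host process.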

Managed plugins (out-of-process)

In development

Managed plugins are currently in development. Feel free to contact us if you're interested.

Managed plugins run in separate processes and communicate with the engine via shared memory. This adds a few milliseconds of latency but gives you more flexibility.

| Aspect | Details |
|---|---|
| Language | Go, C#, or other high-level languages |
| Communication | Shared memory (audio), pipes (control messages) |
| Latency | A few milliseconds |
| Use case | Cloud APIs, transcription, streaming, analytics |

The isolation ensures that a plugin won't crash or slow down the engine. This makes managed plugins a good fit for integrating external services or running code that isn't strictly real-time safe.

Info

The built-in live captions feature is implemented as a managed plugin.

Control interface

Plugins allow custom processing. But you also need a way to control the engine as a whole — configure pipelines, manage routing, subscribe to events. That's what the control interface is for.

gRPC API

The engine exposes a public API via a gRPC endpoint on localhost. Your app connects and sends commands.

Using gRPC has a few benefits:

  • Language-agnostic: Works with any language — C++, Python, Go, C#, Java, Node.js, Rust, whatever your stack is. Generate client code from the protobuf definitions.
  • Efficient: Low overhead and latency. Important for responsive applications.
  • Schema-based: Strong typing and validation. Built-in backwards compatibility.

REST API

In development

The REST API is currently in development. Feel free to contact us if you're interested.

The REST API is an alternative to gRPC. It provides the same functionality over HTTP/JSON.

It is useful if you're working with dynamic languages like JavaScript or want a simpler integration without code generation.

CLI tools

ViveEngine ships with two command-line tools:

  • Administration tool: Manages the local engine installation (install, uninstall, activate a license, etc.).
  • Client tool (vive-ctl): Provides complete access to the public API from the console. Useful for development and scripting.

Minimal overhead

ViveEngine is designed to add as little overhead as possible around your processing.

  • Direct injection. When possible, the engine injects processing directly into the OS audio pipeline to avoid the cost of inter-process communication.

  • Shared memory streaming. Otherwise, audio flows through lock-free ring buffers in shared memory. This ensures predictable real-time performance.

  • Typical latency is a few milliseconds, or zero when direct injection is possible. This doesn't include your plugin's processing time.
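
The ring-buffer idea can be sketched as follows. This is a minimal single-producer/single-consumer model in Python that shows only the index logic; the class and method names are hypothetical. A native implementation would place the storage in shared memory and use atomic loads and stores with appropriate memory ordering for the two indices, which plain Python attributes do not provide.

```python
from typing import List, Optional


class SpscRingBuffer:
    """Single-producer/single-consumer ring buffer of audio samples.

    Only the producer advances `_write` and only the consumer advances
    `_read`, so neither side ever needs to take a lock.
    """

    def __init__(self, capacity: int) -> None:
        # One slot stays empty so that "full" and "empty" are distinguishable.
        self._slots: List[float] = [0.0] * (capacity + 1)
        self._read = 0
        self._write = 0

    def push(self, sample: float) -> bool:
        nxt = (self._write + 1) % len(self._slots)
        if nxt == self._read:
            return False  # full: report failure instead of blocking
        self._slots[self._write] = sample
        self._write = nxt  # publish the index only after the slot is written
        return True

    def pop(self) -> Optional[float]:
        if self._read == self._write:
            return None  # empty: nothing to read, return immediately
        sample = self._slots[self._read]
        self._read = (self._read + 1) % len(self._slots)
        return sample


rb = SpscRingBuffer(capacity=2)
pushed = [rb.push(0.1), rb.push(0.2), rb.push(0.3)]  # third push finds it full
drained = [rb.pop(), rb.pop(), rb.pop()]             # third pop finds it empty
```

Neither `push` nor `pop` ever waits on the other side, which is what keeps timing predictable on the audio thread.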

Fail-safe design

ViveEngine is designed so that failures in the engine don't break audio on the user's machine. A few key decisions make this possible:

  1. Driver is minimal — The driver just passes audio to and from the engine. All heavy lifting happens in the orchestrator.

  2. Driver detaches on disconnect — When the driver loses connection to the engine, it removes itself from the audio path entirely. Audio flows through the OS as if ViveEngine wasn't installed.

  3. Automatic restart and reconnect — All components recover on their own. Once everything is up, the engine restores normal operation.

This ensures that even if your plugin crashes or the engine hits a bug, users keep hearing audio. In the worst case, they lose effects until recovery.
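
The pass-through fallback described above can be modeled roughly like this. `DriverStub`, `halve`, and `crashing_engine` are hypothetical names, and the sketch represents a lost connection as a raised exception; the real driver is native code and detects disconnects in its own way.

```python
from typing import Callable, List, Optional

Block = List[float]
Engine = Callable[[Block], Block]


class DriverStub:
    """Hypothetical model of the driver's fallback behavior."""

    def __init__(self) -> None:
        self._engine: Optional[Engine] = None

    @property
    def attached(self) -> bool:
        return self._engine is not None

    def connect(self, engine: Engine) -> None:
        self._engine = engine

    def disconnect(self) -> None:
        self._engine = None

    def process(self, block: Block) -> Block:
        if self._engine is None:
            return block  # detached: audio passes through untouched
        try:
            return self._engine(block)
        except Exception:
            # Engine failure: detach and fall back to pass-through.
            self.disconnect()
            return block


def halve(block: Block) -> Block:
    return [s * 0.5 for s in block]


def crashing_engine(block: Block) -> Block:
    raise RuntimeError("engine crashed")


driver = DriverStub()
before = driver.process([0.5, -0.5])       # no engine yet: unchanged
driver.connect(halve)
during = driver.process([0.5, -0.5])       # engine attached: processed
driver.connect(crashing_engine)
after_crash = driver.process([0.5, -0.5])  # engine failed: unchanged again
```

In this model the worst outcome of an engine failure is unprocessed audio, never silence.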

Multi-seat support

ViveEngine supports multi-user environments like shared workstations or enterprise deployments.

Each logged-in user gets their own:

  • Configuration — Settings, effect presets, and plugin configuration are stored per-user.
  • License activation — Licenses are tied to individual users, not just the machine. Each user activates separately.

The engine tracks user switches and ensures that only the currently active user can access audio devices. Processing pipelines and API sessions of other users are suspended.
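
As a rough mental model of the switching behavior, consider the sketch below. `SeatManager` and its methods are hypothetical and exist only to illustrate the rule that exactly one user's sessions are active at a time; they do not reflect the engine's internal implementation.

```python
from typing import Dict


class SeatManager:
    """Hypothetical tracker for per-user session state."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}  # user -> "active" or "suspended"

    def login(self, user: str) -> None:
        # A newly logged-in user starts suspended until switched to.
        self._state.setdefault(user, "suspended")

    def switch_to(self, user: str) -> None:
        if user not in self._state:
            raise KeyError(f"unknown user: {user}")
        # Activate the target user and suspend everyone else.
        for name in self._state:
            self._state[name] = "active" if name == user else "suspended"

    def can_access_audio(self, user: str) -> bool:
        return self._state.get(user) == "active"


seats = SeatManager()
seats.login("alice")
seats.login("bob")

seats.switch_to("alice")
alice_first = seats.can_access_audio("alice")

seats.switch_to("bob")
alice_after = seats.can_access_audio("alice")
bob_after = seats.can_access_audio("bob")
```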

Compatibility considerations

By design, the engine is transparent — the majority of audio apps and devices will work without any tweaks.

That said, some hardware and software have quirks and may require special handling. We take compatibility seriously and maintain a growing list of applications and audio equipment that we regularly test for regressions.

Have specific requirements?

If you need to ensure compatibility with specific hardware or software, feel free to contact us.