Omni Documentation

Everything you need to understand, extend, and ship Omni.

1. What is Omni?

Omni is an offline-first AI voice assistant built with Python. It captures your speech, converts it to text, routes the text through a local Large Language Model (LLM) via Ollama, synthesises a spoken response with the kokoro TTS engine, and presents everything in a clean GUI.
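The flow described above (speech to text, text to LLM, reply to speech) can be sketched as a single conversational turn. This is a minimal illustration, not Omni's actual code: `run_turn` and its three callable parameters are hypothetical names standing in for the real speech-to-text, Ollama, and kokoro stages.

```python
from typing import Callable, Tuple


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # hypothetical speech-to-text stage
    generate: Callable[[str], str],       # hypothetical LLM stage (e.g. Ollama)
    speak: Callable[[str], None],         # hypothetical TTS stage (e.g. kokoro)
) -> Tuple[str, str]:
    """Run one utterance through the pipeline and return (heard, reply)."""
    heard = transcribe(audio)   # 1. convert captured speech to text
    reply = generate(heard)     # 2. route the text through the local LLM
    speak(reply)                # 3. synthesise the spoken response
    return heard, reply         # 4. hand both strings to the GUI for display
```

Passing the stages in as callables keeps each one swappable, for example when changing the model or the TTS voice.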

Live Listening

Always-on mic capture

Local LLM

No cloud latency or cost

Waveform Display

Visual feedback after every utterance

Hotkey Toggle

Ctrl + J to start / stop

Custom Commands

"Omni open chrome" etc.

Cross-Platform

Windows, macOS, Linux

2. Voice Commands

Speak naturally—Omni's LLM handles free-form conversation. The phrases below are intercepted as system commands and run instantly on your device.

Omni open chrome
Omni type "hello world"
Exit · Quit · Goodbye Omni
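One way to picture the interception step is a small parser that checks each utterance for a command phrase before falling back to the LLM. The sketch below is an assumption about how such routing could work, not Omni's implementation: `parse_command`, `EXIT_PHRASES`, and the returned tuples are hypothetical, and it assumes "Exit", "Quit", and "Goodbye Omni" are three separate exit phrases.

```python
import re
from typing import Optional, Tuple

# Assumed exit phrases, taken from the examples above.
EXIT_PHRASES = {"exit", "quit", "goodbye omni"}


def parse_command(utterance: str) -> Tuple[str, Optional[str]]:
    """Classify an utterance as a system command or free-form chat."""
    text = utterance.strip()
    lowered = text.lower().rstrip(".!?")
    if lowered in EXIT_PHRASES:
        return ("exit", None)
    # Match "Omni open <target>" or "Omni type <text>", case-insensitively.
    match = re.match(r"omni\s+(open|type)\s+(.+)", text, flags=re.IGNORECASE)
    if match:
        verb = match.group(1).lower()
        argument = match.group(2).strip().strip('"')
        return (verb, argument)
    # Anything else is passed to the LLM as ordinary conversation.
    return ("chat", text)
```

For example, `parse_command('Omni type "hello world"')` yields `("type", "hello world")`, while an unrecognised sentence comes back tagged as `"chat"`.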

3. Configuration

LLM model

The default model is llama3.2. Swap to any Ollama-compatible model by editing config/model.ts.
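To show what swapping the model means at the request level, here is a minimal sketch that talks to Ollama's local REST endpoint (`/api/generate` on port 11434) using only the standard library. Whether Omni itself uses HTTP or the `ollama` Python package is not stated here, so treat `build_request` and `ask` as hypothetical helpers.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the given model."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("llama3.2", "Hello")` requires the Ollama server to be running and the model to be pulled; changing the first argument is all it takes to target a different model.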

TTS voice

Omni uses kokoro over espeak-ng. Change the voice or pitch in config/tts.ts.

Hotkey

Change the global toggle from Ctrl + J by altering the GLOBAL_HOTKEY constant at the top of Omni_voice.py.
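As a rough illustration, the constant and the toggle it drives might look like the sketch below. The `"ctrl+alt+j"` value assumes the plus-separated hotkey string format used by the `keyboard` library, and `ListenToggle` is a hypothetical stand-in for whatever Omni_voice.py actually does when the hotkey fires.

```python
# Assumed format: the keyboard library's "modifier+key" hotkey strings.
GLOBAL_HOTKEY = "ctrl+alt+j"


class ListenToggle:
    """Hypothetical on/off state flipped each time the hotkey is pressed."""

    def __init__(self) -> None:
        self.active = False

    def flip(self) -> bool:
        self.active = not self.active
        return self.active


toggle = ListenToggle()

# Registration needs the third-party keyboard package, so it is left
# commented out here:
#   import keyboard
#   keyboard.add_hotkey(GLOBAL_HOTKEY, toggle.flip)
```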

4. Extending Omni (Plugins)

Create a new Python file inside actions/ with a function that returns a string.
Import and register that function in index.ts.
Reference the action name in natural language—Omni will route the request automatically.
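The steps above can be sketched with a small action file. The only requirement stated is a function that returns a string; everything else here is an assumption for illustration, including the file name actions/tell_time.py, the `ACTIONS` registry, and the `register` decorator. The real registration happens in index.ts, which this sketch does not reproduce.

```python
# Hypothetical contents of actions/tell_time.py
from datetime import datetime
from typing import Callable, Dict

# Hypothetical in-Python registry mapping action names to functions.
ACTIONS: Dict[str, Callable[[], str]] = {}


def register(name: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    """Decorator that records an action under a spoken-language name."""
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        ACTIONS[name] = func
        return func
    return decorator


@register("tell time")
def tell_time() -> str:
    """Example action: returns the string Omni would speak back."""
    return "It is " + datetime.now().strftime("%H:%M")


def run_action(name: str) -> str:
    """Look up an action by name and return its spoken response."""
    action = ACTIONS.get(name)
    if action is None:
        return f"Unknown action: {name}"
    return action()
```

Returning a plain string keeps actions easy to test in isolation, since the TTS stage only needs text to speak.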

5. Troubleshooting

Hotkey doesn't work

The keyboard library sometimes requires admin privileges. On Windows, open PowerShell with Run as Administrator, activate the venv, then run python Omni_voice.py.

"No speech detected"

Check microphone gain, speak clearly, or increase the timeout value inside listen_for_command().

Audio file deletion (WinError 32)

We purge buffers with winsound.SND_PURGE. If the error persists, another process is locking the file; reboot or use a different audio backend.
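A lock on the audio file is often released a moment later, so retrying the deletion with a short delay can help before resorting to a reboot. The helper below is a sketch under that assumption and is not part of Omni; `remove_with_retry` and its parameters are hypothetical. On Windows, WinError 32 surfaces in Python as `PermissionError`.

```python
import os
import time


def remove_with_retry(path: str, attempts: int = 5, delay: float = 0.2) -> bool:
    """Try to delete a file, waiting between attempts while it is locked.

    Returns True once the file is gone, False if it stayed locked.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            # Already deleted, which is the outcome we wanted.
            return True
        except PermissionError:
            # File still locked (WinError 32 on Windows); back off and retry.
            time.sleep(delay * (attempt + 1))
    return False
```

A `False` result means the file was still locked after all attempts, which points to another process holding it.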

6. Roadmap

Wake-word detection ("Hey Omni")
Offline speech-to-text backend
Plugin marketplace for third-party actions
Animated live waveform and subtle UI SFX