Understand the recording and transcription pipeline.

Recording Modes
Vowen offers two ways to record:
Push-to-Talk (Default)
Hold your shortcut to record, release to transcribe. This is the fastest way to dictate short phrases and sentences.

| Platform | Default Shortcut |
|---|---|
| macOS | Fn |
| Windows | Ctrl + Shift |
You can pick your own shortcut. See different setups here.
Hands-Free
Toggle recording on or off without holding a key. Ideal for longer dictation sessions.

| Platform | Default Shortcut |
|---|---|
| macOS | Fn + Control |
| Windows | Ctrl + H |
You can pick your own shortcut. See different setups here.
Hands-free recording can be stopped by:
- Pressing the shortcut again
- Clicking the stop button in the indicator
- Any mouse or keystroke input, when the optional auto-deactivate setting is on
To cancel, press Escape while recording. A notification appears with an Undo button for a few seconds in case you cancelled by mistake. Click Undo to resume the same session with the audio you already recorded. If the notification is dismissed without action, the audio is discarded and nothing is transcribed.
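The cancel-and-undo behaviour can be pictured as a small state machine. The sketch below is illustrative only and is not Vowen's implementation: the `RecordingSession` class, its state names, and the 5-second `UNDO_WINDOW_SECONDS` value are assumptions (the documentation says only "a few seconds").

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: the docs say "a few seconds"; 5.0 is a placeholder value.
UNDO_WINDOW_SECONDS = 5.0


@dataclass
class RecordingSession:
    """Hypothetical model of a recording that can be cancelled and resumed."""

    audio_chunks: List[bytes] = field(default_factory=list)
    state: str = "recording"  # "recording" | "cancel_pending" | "discarded"
    cancelled_at: Optional[float] = None

    def add_audio(self, chunk: bytes) -> None:
        # Audio is only appended while the session is actively recording.
        if self.state == "recording":
            self.audio_chunks.append(chunk)

    def cancel(self, now: float) -> None:
        # Escape pressed: stop recording but keep the audio for the undo window.
        if self.state == "recording":
            self.state = "cancel_pending"
            self.cancelled_at = now

    def tick(self, now: float) -> None:
        # Undo window elapsed without action: discard the recorded audio.
        if (
            self.state == "cancel_pending"
            and now - self.cancelled_at >= UNDO_WINDOW_SECONDS
        ):
            self.audio_chunks.clear()
            self.state = "discarded"

    def undo(self, now: float) -> bool:
        # Undo clicked: resume the same session if the window is still open.
        self.tick(now)
        if self.state != "cancel_pending":
            return False
        self.state = "recording"
        self.cancelled_at = None
        return True
```

Passing the current time in as `now` keeps the sketch deterministic. A real application would more likely use a timer callback tied to the notification.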
Pro tip: beyond dictation, Vowen can run voice-triggered actions such as compressing images, merging PDFs, opening apps, setting timers, and translating text. See Command Mode.
Recording Indicator
When you start recording, a small pill-shaped indicator appears on your screen showing that Vowen is listening.
- Active app icon: which app will receive your transcription
- Waveform animation: pulses while you speak
- Pause and stop controls: appear in hands-free mode only
The Transcription Pipeline
Audio Capture
Your microphone captures audio while the shortcut is held or, in hands-free mode, until you stop recording. Voice Activity Detection (VAD) automatically removes silence for faster processing.
Transcription
The audio is sent to your chosen transcription model, either a local model running on your machine or a cloud model.
Post-Processing
Filler words are removed. Snippet replacements (Threads) are applied. Workflow triggers are checked.
AI Enhancement (if enabled)
The transcribed text is sent to your configured AI provider for grammar cleanup, formatting, and polish. You bring your own API key from any of 10+ supported providers; Vowen never charges you for AI usage.
Text Insertion
The final text is delivered to the focused field using either the paste method (default) or direct insertion. See Text Insertion Methods below for the difference.
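To make the Post-Processing step more concrete, here is a minimal sketch of filler-word removal followed by snippet replacement. It is an illustration under stated assumptions, not Vowen's actual logic: the `FILLERS` list, the function names, and the example snippet are all hypothetical, and workflow triggers are left out.

```python
import re
from typing import Dict, Iterable

# Assumption: an illustrative filler list; the real list is not documented here.
FILLERS = ("um", "uh", "erm")


def remove_fillers(text: str, fillers: Iterable[str] = FILLERS) -> str:
    """Remove whole-word fillers plus any comma or period directly after them."""
    alternatives = "|".join(re.escape(f) for f in fillers)
    pattern = r"\b(?:" + alternatives + r")\b[,.]?\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse any doubled whitespace left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def apply_snippets(text: str, snippets: Dict[str, str]) -> str:
    """Replace each spoken trigger phrase with its stored expansion."""
    for trigger, replacement in snippets.items():
        pattern = r"\b" + re.escape(trigger) + r"\b"
        # A function replacement avoids backslash escapes being interpreted.
        text = re.sub(pattern, lambda _m: replacement, text, flags=re.IGNORECASE)
    return text


def post_process(text: str, snippets: Dict[str, str]) -> str:
    # Fillers are removed first so they cannot split a trigger phrase.
    return apply_snippets(remove_fillers(text), snippets)
```

For example, `post_process("Um, send it to my email uh tomorrow", {"my email": "me@example.com"})` yields `"send it to me@example.com tomorrow"`.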
Text Insertion Methods
Vowen offers two ways to deliver the final transcription into the focused field. The choice matters for clipboard behaviour and for keyboard layouts.
Paste method (default)
Vowen copies the transcription to your clipboard and simulates a Cmd+V (macOS) or Ctrl+V (Windows) keystroke to paste it. This is the fastest path and works well on standard QWERTY layouts.
Side effect: your original clipboard content is overwritten by the transcription. Enable Restore clipboard after paste in Settings > General to have Vowen save your prior clipboard contents and put them back after the paste completes.
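The save-and-restore idea behind this setting can be sketched as follows. This is a simplified illustration, not Vowen's code: the `Clipboard` protocol, `InMemoryClipboard`, and `paste_transcription` are hypothetical names, and the in-memory clipboard stands in for the operating system's clipboard so the example runs anywhere.

```python
from typing import Callable, Optional, Protocol


class Clipboard(Protocol):
    """Hypothetical minimal clipboard interface."""

    def read(self) -> str: ...
    def write(self, text: str) -> None: ...


class InMemoryClipboard:
    """Stand-in for the system clipboard, used for illustration only."""

    def __init__(self, initial: str = "") -> None:
        self._text = initial

    def read(self) -> str:
        return self._text

    def write(self, text: str) -> None:
        self._text = text


def paste_transcription(
    text: str,
    clipboard: Clipboard,
    send_paste_keystroke: Callable[[], None],
    restore_clipboard: bool = False,
) -> None:
    # Save the prior contents only when the restore option is enabled.
    previous: Optional[str] = clipboard.read() if restore_clipboard else None
    clipboard.write(text)
    try:
        send_paste_keystroke()  # e.g. simulate Cmd+V or Ctrl+V
    finally:
        # Put the original contents back even if the keystroke fails.
        if restore_clipboard and previous is not None:
            clipboard.write(previous)
```

A real implementation would probably also need to wait until the target app has consumed the paste before restoring. That timing is omitted here.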
Direct insertion method
Vowen types each character of the transcription as if you were pressing the keys yourself. The clipboard is never touched, so whatever you had on it stays exactly as it was. Use this method when:
- You use a non-QWERTY layout (AZERTY, QWERTZ, Dvorak, and others) where the paste keystroke does not map cleanly to the “V” key
- The target app blocks standard paste (some remote desktops, sandboxed terminals, virtual machines)
- You want clipboard preservation without enabling a separate setting
Voice Activity Detection (VAD)
Vowen uses the Silero VAD model to detect speech in your recording. This:
- Removes silence before and after speech
- Reduces processing time for local models
- Prevents “hallucinations” on silent recordings (e.g., the model outputting “Thank you” when nothing was said)
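The effect of trimming silence can be illustrated with a much simpler technique than Silero. The sketch below uses a plain RMS energy threshold as a stand-in; it is not the Silero model and not Vowen's implementation, and `trim_silence`, `frame_size`, and `threshold` are assumed names and values chosen for illustration.

```python
from typing import List, Sequence


def trim_silence(
    samples: Sequence[float],
    frame_size: int = 160,
    threshold: float = 0.01,
) -> List[float]:
    """Drop leading and trailing frames whose RMS energy is below threshold."""
    # Split the signal into fixed-size frames (the last one may be shorter).
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]

    def is_speech(frame: Sequence[float]) -> bool:
        rms = (sum(x * x for x in frame) / len(frame)) ** 0.5
        return rms >= threshold

    voiced = [i for i, frame in enumerate(frames) if is_speech(frame)]
    if not voiced:
        # Nothing voiced: return no audio so the transcription model is never
        # asked to transcribe pure silence.
        return []

    start = voiced[0] * frame_size
    end = min((voiced[-1] + 1) * frame_size, len(samples))
    return list(samples[start:end])
```

Returning an empty result for an all-silent recording mirrors the last point in the list above: if no audio reaches the model, it cannot produce text such as “Thank you” from silence.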
