
# How Transcription Works

<div style={{ marginTop: "-1rem" }}>
  Understand the recording and transcription pipeline.
</div>

<img className="block dark:hidden" src="https://mintcdn.com/vowen/2NLLQOm5sszam2ni/images/transcription-flow-light.png?fit=max&auto=format&n=2NLLQOm5sszam2ni&q=85&s=c4979a1c7b036298e93079dd44d3f442" alt="Voice transcription pipeline" width="1672" height="941" data-path="images/transcription-flow-light.png" />

<img className="hidden dark:block" src="https://mintcdn.com/vowen/2NLLQOm5sszam2ni/images/transcription-flow-dark.png?fit=max&auto=format&n=2NLLQOm5sszam2ni&q=85&s=80eb82c9fd371f257006d4655d2d425c" alt="Voice transcription pipeline" width="1672" height="941" data-path="images/transcription-flow-dark.png" />

## Recording Modes

Vowen offers two ways to record:

### Push-to-Talk (Default)

Hold your shortcut to record, release to transcribe. This is the fastest way to dictate short phrases and sentences.

| Platform | Default Shortcut |
| -------- | ---------------- |
| macOS    | `Fn`             |
| Windows  | `Ctrl + Shift`   |

<div style={{ marginTop: "-1rem" }}>
  You can change this shortcut. See the [shortcut patterns](/features/shortcut-patterns) page for other setups.
</div>

### Hands-Free

Toggle recording on or off without holding a key. Ideal for longer dictation sessions.

| Platform | Default Shortcut |
| -------- | ---------------- |
| macOS    | `Fn + Control`   |
| Windows  | `Ctrl + H`       |

<div style={{ marginTop: "-1rem" }}>
  You can change this shortcut. See the [shortcut patterns](/features/shortcut-patterns) page for other setups.
</div>

Hands-free recording can be stopped by:

* Pressing the shortcut again
* Clicking the stop button in the indicator
* Using the mouse or pressing a key, if the optional auto-deactivate setting is enabled

All of these stop the recording and send your audio through transcription. To **cancel** instead, press `Escape` while recording. A notification with an **Undo** button appears for a few seconds in case you cancelled by mistake. Click **Undo** to resume the same session with the audio you already recorded. If the notification disappears before you click it, the audio is discarded and nothing is transcribed.
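The stop, cancel, and undo rules above amount to a small state machine. The sketch below is one way to model them. The class, state names, and method names are hypothetical and chosen for illustration; this is not Vowen's actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    CANCEL_PENDING = auto()  # the Undo notification is visible


@dataclass
class HandsFreeSession:
    """Hypothetical model of a hands-free session (illustration only)."""
    state: State = State.IDLE
    audio: list = field(default_factory=list)      # chunks recorded so far
    submitted: list = field(default_factory=list)  # recordings sent to transcription

    def toggle(self):
        """Shortcut press: start when idle, stop and transcribe when recording."""
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.stop()

    def record(self, chunk):
        """Keep a chunk of audio, but only while recording is active."""
        if self.state is State.RECORDING:
            self.audio.append(chunk)

    def stop(self):
        """Shortcut, stop button, or auto-deactivate: send the audio onward."""
        if self.state is State.RECORDING:
            self.submitted.append(list(self.audio))
            self.audio.clear()
            self.state = State.IDLE

    def cancel(self):
        """Escape: hold the audio and wait for Undo or expiry."""
        if self.state is State.RECORDING:
            self.state = State.CANCEL_PENDING

    def undo(self):
        """Undo clicked in time: resume with the audio already recorded."""
        if self.state is State.CANCEL_PENDING:
            self.state = State.RECORDING

    def notification_expired(self):
        """Undo window passed: discard the audio, transcribe nothing."""
        if self.state is State.CANCEL_PENDING:
            self.audio.clear()
            self.state = State.IDLE
```

In this model, cancelling and then undoing keeps the earlier audio, so a later stop submits everything recorded in the session. Cancelling and letting the notification expire submits nothing.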

<div className="my-6 flex gap-3 rounded-xl border border-violet-500/30 bg-violet-500/10 p-4">
  <div className="shrink-0 pt-0.5 text-violet-500 dark:text-violet-400">
    <svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
      <path d="M15 14c.2-1 .7-1.7 1.5-2.5 1-.9 1.5-2.2 1.5-3.5A6 6 0 0 0 6 8c0 1 .2 2.2 1.5 3.5.7.7 1.3 1.5 1.5 2.5" />

      <path d="M9 18h6" />

      <path d="M10 22h4" />
    </svg>
  </div>

  <div className="text-sm leading-relaxed text-zinc-700 dark:text-zinc-300">
    <strong className="text-violet-700 dark:text-violet-300">Pro tip:</strong> beyond dictation, Vowen can run voice-triggered actions like compressing images, merging PDFs, opening apps, setting timers, and translating text. <a href="/ai-features/command-mode" className="font-semibold text-violet-600 hover:text-violet-500 dark:text-violet-400">See Command Mode →</a>
  </div>
</div>

## Recording Indicator

When you start recording, a small pill-shaped indicator appears on your screen showing that Vowen is listening.

<img src="https://mintcdn.com/vowen/2NLLQOm5sszam2ni/images/recording-indicator.png?fit=max&auto=format&n=2NLLQOm5sszam2ni&q=85&s=891e4f12c5c76aa9a84eb08f5a516e7d" alt="Recording indicator showing live transcription with pause and stop controls" width="602" height="220" data-path="images/recording-indicator.png" />

The indicator shows:

* **Active app icon:** which app will receive your transcription
* **Waveform animation:** pulses while you speak
* **Pause and stop controls:** appear in hands-free mode only

You can move the indicator to the **top** or **bottom** of your screen, or **hide** it entirely, from **Settings > Recording > Recording Indicator Position**. See the [Settings overview](/get-started/settings) for more configuration options.

## The Transcription Pipeline

```
┌──────────┐    ┌──────────────┐    ┌───────────────┐    ┌──────────┐
│ Record   │───>│ Transcribe   │───>│ AI Enhance    │───>│ Paste    │
│ Audio    │    │ (STT Model)  │    │ (Optional)    │    │ Text     │
└──────────┘    └──────────────┘    └───────────────┘    └──────────┘
```

<Steps>
  <Step title="Audio Capture">
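    Your microphone captures audio while recording is active: for as long as you hold the shortcut in push-to-talk, or until you stop the session in hands-free mode. Voice Activity Detection (VAD) automatically removes silence for faster processing.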
  </Step>

  <Step title="Transcription">
    The audio is sent to your chosen transcription model, either a [local model](/transcription/models#local-models-offline) running on your machine or a [cloud model](/transcription/models#cloud-models).
  </Step>

  <Step title="Post-Processing">
    Filler words are removed. Snippet replacements (Threads) are applied. Workflow triggers are checked.
  </Step>

  <Step title="AI Enhancement (if enabled)">
    The transcribed text is sent to your configured AI provider for grammar cleanup, formatting, and polish. You bring your own [API key](/ai-features/ai-setup) from any of 10+ supported providers; Vowen never charges you for AI usage.
  </Step>

  <Step title="Text Insertion">
    The final text is delivered to the focused field using either the **paste method** (default) or **direct insertion**. See [Text Insertion Methods](#text-insertion-methods) below for the difference.
  </Step>
</Steps>
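As a rough mental model, steps 2 through 5 can be read as a chain of functions. The sketch below is illustrative only: the function names, the filler-word list, and the snippet format are assumptions, workflow triggers are left out, and the transcription, enhancement, and insertion steps are passed in as plain callables rather than real models or providers.

```python
import re
from typing import Callable, Dict, Optional

FILLERS = ("um", "uh", "erm")  # assumed filler list, for illustration only


def remove_fillers(text: str) -> str:
    """Drop standalone filler words along with trailing punctuation and spaces."""
    pattern = r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*"
    return re.sub(pattern, "", text, flags=re.IGNORECASE).strip()


def apply_snippets(text: str, snippets: Dict[str, str]) -> str:
    """Replace each trigger phrase with its expansion (case-insensitive)."""
    for trigger, expansion in snippets.items():
        pattern = r"\b" + re.escape(trigger) + r"\b"
        # A function replacement avoids backslash handling in the expansion.
        text = re.sub(pattern, lambda m: expansion, text, flags=re.IGNORECASE)
    return text


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    snippets: Dict[str, str],
    insert: Callable[[str], None],
    enhance: Optional[Callable[[str], str]] = None,
) -> str:
    raw = transcribe(audio)                               # Transcription
    text = apply_snippets(remove_fillers(raw), snippets)  # Post-Processing
    if enhance is not None:                               # AI Enhancement, only if enabled
        text = enhance(text)
    insert(text)                                          # Text Insertion
    return text
```

For example, with a stand-in transcriber that returns `"um send the uh invoice to my email"` and the snippet `"my email"` mapped to `"sam@example.com"`, the pipeline inserts `send the invoice to sam@example.com`.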

## Text Insertion Methods

Vowen offers two ways to deliver the final transcription into the focused field. The choice matters for clipboard behaviour and for keyboard layouts.

### Paste method (default)

Vowen copies the transcription to your clipboard and simulates a `Cmd+V` (macOS) or `Ctrl+V` (Windows) keystroke to paste it. This is the fastest path and works well on standard QWERTY layouts.

Side effect: your original clipboard content is overwritten by the transcription. Enable **Restore clipboard after paste** in **Settings > General** to have Vowen save your prior clipboard contents and put them back after the paste completes.
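The save-and-restore behaviour can be sketched as follows. This is a minimal illustration using an in-memory stand-in for the clipboard; `FakeClipboard`, `paste_text`, and the `send_paste_keystroke` callback are hypothetical names, not Vowen's API.

```python
class FakeClipboard:
    """In-memory stand-in for the system clipboard (illustration only)."""

    def __init__(self, content=""):
        self.content = content


def paste_text(text, clipboard, send_paste_keystroke, restore=False):
    """Put text on the clipboard, trigger a paste, and optionally restore."""
    previous = clipboard.content   # remembered so it can be put back
    clipboard.content = text       # this overwrites the user's clipboard
    send_paste_keystroke()         # stands in for Cmd+V or Ctrl+V
    if restore:
        clipboard.content = previous
```

A real implementation would presumably need to wait until the target app has read the clipboard before restoring it; the sketch skips that timing concern.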

### Direct insertion method

Vowen types each character of the transcription as if you were pressing the keys yourself. The clipboard is never touched, so whatever you had on it stays exactly as it was.

Use this method when:

* You use a non-QWERTY layout (AZERTY, QWERTZ, Dvorak, and others) where the paste keystroke does not map cleanly to the "V" key
* The target app blocks standard paste (some remote desktops, sandboxed terminals, virtual machines)
* You want clipboard preservation without enabling a separate setting

Switch methods anytime in **Settings > General > Text Insertion Method**. See the [Settings overview](/get-started/settings) for related options.
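The guidance above can be condensed into a small decision helper, shown here next to a sketch of character-by-character insertion. Both functions are hypothetical illustrations: Vowen does not pick a method for you, and the parameter names are assumptions.

```python
def suggest_insertion_method(layout, app_blocks_paste=False, keep_clipboard=False):
    """Return "direct" when any of the listed conditions applies, else "paste"."""
    if layout.lower() != "qwerty":   # AZERTY, QWERTZ, Dvorak, and others
        return "direct"
    if app_blocks_paste:             # e.g. some remote desktops or VMs
        return "direct"
    if keep_clipboard:               # preservation without a separate setting
        return "direct"
    return "paste"


def insert_direct(text, type_char):
    """Send the text one character at a time; the clipboard is never used."""
    for ch in text:
        type_char(ch)
```

Here `type_char` stands in for whatever mechanism emits a single keystroke to the focused field.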

## Voice Activity Detection (VAD)

Vowen uses the Silero VAD model to detect speech in your recording. Running VAD on the captured audio:

* Removes silence before and after speech
* Reduces processing time for local models
* Prevents "hallucinations" on silent recordings (e.g., the model outputting "Thank you" when nothing was said)

VAD runs automatically and needs no configuration.
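To illustrate the idea of trimming silence, the sketch below uses a simple energy threshold. This is deliberately not Silero VAD, which is a neural model; the function name, frame size, and threshold are assumptions chosen to keep the example self-contained.

```python
def trim_silence(samples, frame_size=512, threshold=0.02):
    """Keep the span from the first to the last frame that looks like speech.

    A frame counts as speech when its mean absolute amplitude exceeds
    the threshold. Samples are floats in the range -1.0 to 1.0.
    """
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    voiced = [
        i for i, frame in enumerate(frames)
        if sum(abs(x) for x in frame) / len(frame) > threshold
    ]
    if not voiced:
        return []  # no speech found, so there is nothing to transcribe
    start = voiced[0] * frame_size
    end = (voiced[-1] + 1) * frame_size
    return samples[start:end]
```

Returning an empty result for a silent recording is what lets a caller skip the transcription model entirely, which is how silence-induced "hallucinations" can be avoided in this simplified picture.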

## Sound Effects

By default, Vowen plays a subtle sound when recording starts and stops. Disable this in **Settings > General > Sound Effects**.
