Vowen supports multiple transcription engines, both local (offline) and cloud-based. This guide helps you pick the right one.
Quick Recommendations
| Use Case | Recommended Model | Why |
|---|---|---|
| Quick notes (macOS) | Base.en or Parakeet | Fast, good accuracy, works offline |
| Quick notes (Windows) | Groq Whisper Turbo | Fast cloud model, free tier |
| Professional writing | Large v3 Turbo + AI Enhancement | Best accuracy + polished output |
| Non-English | Large v3 or Groq Large v3 | Best multilingual accuracy |
| Maximum privacy | Any local model | Nothing leaves your device |
| Real-time preview | Parakeet, Deepgram, or Soniox | Shows text as you speak |
| Meetings (diarization) | Deepgram Nova 3 or AssemblyAI | Identifies who said what |
Local Models (Offline)
These run entirely on your machine. No internet required. All local models are downloaded on demand from within the app; nothing is bundled with the installer. The Base model is the default and is offered for download during onboarding.
Whisper Models
Based on OpenAI’s Whisper, supporting 99 languages.
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| Tiny | 78 MB | Fastest | Basic | Quick notes, testing |
| Tiny.en | 78 MB | Fastest | Good (English) | Fast English dictation |
| Base.en | 148 MB | Fast | Good | General English use |
| Base (default) | 148 MB | Fast | Good | Multilingual basics |
| Small | 488 MB | Medium | Great | Professional work |
| Small.en | 488 MB | Medium | Great (English) | Detailed English |
| Medium | 1.5 GB | Slow | Excellent | High-quality output |
| Medium.en | 1.5 GB | Slow | Excellent (English) | Long-form English |
| Large v3 | 3 GB | Slowest | Best | Maximum accuracy |
| Large v3 Turbo | 1.6 GB | Medium-Fast | Excellent | Best balance |
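If disk space or download size is the main constraint, the table above can be read as a simple lookup. The following is a minimal sketch, not part of Vowen: the `WhisperModel` class, the numeric accuracy and speed ranks, and `best_model_within` are hypothetical names that only encode the table's labels, and GB sizes are approximated in MB.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class WhisperModel:
    name: str
    size_mb: int        # download size from the table (GB values approximated)
    accuracy: int       # 1=Basic, 2=Good, 3=Great, 4=Excellent, 5=Best
    speed: int          # 0=Slowest, 1=Slow, 2=Medium, 3=Medium-Fast, 4=Fast, 5=Fastest
    english_only: bool

# Rows transcribed from the table above; the ranks are an assumption
# that simply orders the table's text labels.
MODELS = [
    WhisperModel("Tiny", 78, 1, 5, False),
    WhisperModel("Tiny.en", 78, 2, 5, True),
    WhisperModel("Base.en", 148, 2, 4, True),
    WhisperModel("Base", 148, 2, 4, False),
    WhisperModel("Small", 488, 3, 2, False),
    WhisperModel("Small.en", 488, 3, 2, True),
    WhisperModel("Medium", 1500, 4, 1, False),
    WhisperModel("Medium.en", 1500, 4, 1, True),
    WhisperModel("Large v3", 3000, 5, 0, False),
    WhisperModel("Large v3 Turbo", 1600, 4, 3, False),
]

def best_model_within(budget_mb: int, multilingual: bool = False) -> Optional[WhisperModel]:
    """Pick the most accurate model that fits the size budget.

    Ties are broken by speed, then by preferring an English-only variant.
    Returns None when nothing fits.
    """
    candidates = [
        m for m in MODELS
        if m.size_mb <= budget_mb and not (multilingual and m.english_only)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: (m.accuracy, m.speed, m.english_only))

if __name__ == "__main__":
    choice = best_model_within(2000)
    print(choice.name if choice else "No model fits this budget")
```

The table's labels are coarse, so treat the result as a starting point and try a model on your own voice before settling on it.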
Models with the `.en` suffix are English-only and slightly more accurate for English than their multilingual counterparts.
Parakeet TDT 0.6B
NVIDIA’s streaming-capable model. Supports 25 European languages with auto-detection.
| | macOS | Windows |
|---|---|---|
| Format | CoreML | ONNX (int8) |
| Size | ~1 GB | ~478 MB |
| Streaming | Yes | Yes |
| Languages | 25 | 25 |
Cloud Models
These send audio to a third-party API. They require an internet connection and an API key.
Available Cloud Models
With cloud models, your audio is sent to the provider, processed, and the transcription is returned. Some models stream text back as you speak (Real-time: Yes); others return the full transcript once you stop (Real-time: No).
| Model | Provider | Languages | Real-time | Diarization | Free Tier |
|---|---|---|---|---|---|
| Whisper Large v3 | Groq | 99 | No | No | Yes (generous) |
| Whisper Turbo | Groq | 99 | No | No | Yes (generous) |
| Nova 2/3 | Deepgram | 99 | Yes | Yes | $200 credit |
| Scribe v2 | ElevenLabs | 99 | Yes | Yes | Limited |
| Universal | AssemblyAI | 6 streaming / 99 batch | Yes | Yes | $50 credit |
| Voxtral Mini | Mistral | 13 | Yes | Yes | Free API |
| Saaras v3 | Sarvam AI | 22+ | Yes | No | Limited |
| Soniox | Soniox | Various | Yes | Yes | Paid |
| Aurora | XAI | Various | Yes | Yes | Limited |
| Speechmatics | Speechmatics | 39 | Yes | Yes | Paid |
AssemblyAI Universal supports 6 languages in real-time streaming (English, Spanish, French, German, Italian, Portuguese) and 99 languages when used for batch transcription of pre-recorded files.
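To narrow the list by feature, the Real-time and Diarization columns can be treated as filters. This is an illustrative sketch only: `CloudModel`, `CLOUD_MODELS`, and `matching_models` are hypothetical names, and the data is copied from the table above rather than fetched from Vowen or any provider.

```python
from typing import List, NamedTuple

class CloudModel(NamedTuple):
    name: str
    provider: str
    realtime: bool      # "Real-time" column
    diarization: bool   # "Diarization" column

# Rows transcribed from the table above.
CLOUD_MODELS = [
    CloudModel("Whisper Large v3", "Groq", False, False),
    CloudModel("Whisper Turbo", "Groq", False, False),
    CloudModel("Nova 2/3", "Deepgram", True, True),
    CloudModel("Scribe v2", "ElevenLabs", True, True),
    CloudModel("Universal", "AssemblyAI", True, True),
    CloudModel("Voxtral Mini", "Mistral", True, True),
    CloudModel("Saaras v3", "Sarvam AI", True, False),
    CloudModel("Soniox", "Soniox", True, True),
    CloudModel("Aurora", "XAI", True, True),
    CloudModel("Speechmatics", "Speechmatics", True, True),
]

def matching_models(realtime: bool = False, diarization: bool = False) -> List[str]:
    """Return names of models that offer every requested feature."""
    return [
        m.name for m in CLOUD_MODELS
        if (m.realtime or not realtime) and (m.diarization or not diarization)
    ]

if __name__ == "__main__":
    # Models suitable for meetings: live text plus speaker labels.
    for name in matching_models(realtime=True, diarization=True):
        print(name)
```

Language coverage and free-tier terms still need a look in the table, since they vary by provider and are not captured by these two flags.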
Setting Up Cloud Models
- Go to Settings > Models
- Select a cloud model from the list
- Enter your API key when prompted
- The model is ready to use immediately
Pro tip: Groq is the most popular choice among Vowen users. It’s fast, accurate, and has a generous free tier that covers most daily use.
GPU Acceleration (Windows)
If you have an NVIDIA GPU, you can dramatically speed up local model transcription:
- Go to Settings > Models
- Scroll down to find “GPU Acceleration”
- Download the CUDA acceleration module
- Restart Vowen (or your system if needed)
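If you are not sure whether your machine has a usable NVIDIA GPU, one quick check outside of Vowen is to look for the `nvidia-smi` tool that ships with NVIDIA's drivers. The sketch below assumes that `nvidia-smi` is on your `PATH` when the driver is installed; the function names are hypothetical, and this is not how Vowen itself detects GPUs.

```python
import shutil
import subprocess
from typing import List

def has_nvidia_smi(which=shutil.which) -> bool:
    """True if the nvidia-smi executable can be found on PATH."""
    return which("nvidia-smi") is not None

def list_nvidia_gpus() -> List[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if unavailable."""
    if not has_nvidia_smi():
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Driver missing, tool failed, or it timed out: treat as "no GPU found".
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

if __name__ == "__main__":
    gpus = list_nvidia_gpus()
    if gpus:
        print("NVIDIA GPU(s) found:")
        for gpu in gpus:
            print(" ", gpu)
    else:
        print("No NVIDIA GPU detected; a cloud model may be the faster option.")
```

An empty result only means the tool was not found or did not respond, so it is worth checking your driver installation before ruling out GPU acceleration.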
Choosing Between Local and Cloud
| Factor | Local | Cloud |
|---|---|---|
| Privacy | Data never leaves device | Audio sent to provider |
| Speed (macOS) | Fast for small/medium models | Fast always |
| Speed (Windows) | Slow without GPU | Fast always |
| Accuracy | Good to excellent | Excellent |
| Internet | Not required | Required |
| Cost | Free | Free tier or paid API |
| Languages | 99 (Whisper) / 25 (Parakeet) | Varies by provider |
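The factors above can be condensed into a rough rule of thumb. The following sketch is one possible reading of the table, not an official Vowen algorithm: `choose_engine` and its parameters are hypothetical, and defaulting to local when nothing rules it out is an assumption.

```python
def choose_engine(
    *,
    privacy_required: bool,
    has_internet: bool,
    platform: str,               # assumed values: "macos" or "windows"
    has_nvidia_gpu: bool = False,
) -> str:
    """Return "local" or "cloud" based on the factors in the table above."""
    # Privacy: local models never send audio off the device.
    # Internet: cloud models cannot work without a connection.
    if privacy_required or not has_internet:
        return "local"
    # Speed (Windows): local transcription is slow without a GPU.
    if platform.lower() == "windows" and not has_nvidia_gpu:
        return "cloud"
    # Assumption: otherwise prefer local, since it is free and works offline.
    return "local"

if __name__ == "__main__":
    print(choose_engine(privacy_required=False, has_internet=True, platform="windows"))
```

Needs such as diarization or a specific language can override this result, so check the model tables before committing to one side.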