Vowen supports multiple transcription engines, both local (offline) and cloud-based. This guide helps you pick the right one.

Quick Recommendations

| Use Case | Recommended Model | Why |
| --- | --- | --- |
| Quick notes (macOS) | Base.en or Parakeet | Fast, good accuracy, works offline |
| Quick notes (Windows) | Groq Whisper Turbo | Fast cloud model, free tier |
| Professional writing | Large v3 Turbo + AI Enhancement | Best accuracy + polished output |
| Non-English | Large v3 or Groq Large v3 | Best multilingual accuracy |
| Maximum privacy | Any local model | Nothing leaves your device |
| Real-time preview | Parakeet, Deepgram, or Soniox | Shows text as you speak |
| Meetings (diarization) | Deepgram Nova 3 or AssemblyAI | Identifies who said what |

Local Models (Offline)

These run entirely on your machine. No internet required. All local models are downloaded on demand from within the app; nothing is bundled with the installer. The Base model is the default and is offered for download during onboarding.

Whisper Models

Based on OpenAI’s Whisper, supporting 99 languages.

| Model | Size | Speed | Accuracy | Best For |
| --- | --- | --- | --- | --- |
| Tiny | 78 MB | Fastest | Basic | Quick notes, testing |
| Tiny.en | 78 MB | Fastest | Good (English) | Fast English dictation |
| Base.en | 148 MB | Fast | Good | General English use |
| Base (default) | 148 MB | Fast | Good | Multilingual basics |
| Small | 488 MB | Medium | Great | Professional work |
| Small.en | 488 MB | Medium | Great (English) | Detailed English |
| Medium | 1.5 GB | Slow | Excellent | High-quality output |
| Medium.en | 1.5 GB | Slow | Excellent (English) | Long-form English |
| Large v3 | 3 GB | Slowest | Best | Maximum accuracy |
| Large v3 Turbo | 1.6 GB | Medium-Fast | Excellent | Best balance |

Models with the .en suffix are English-only and slightly more accurate for English than their multilingual counterparts.
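To make the trade-offs in the table concrete, here is a minimal Python sketch that picks a Whisper model for a given download budget. It is not part of Vowen: `WhisperModel`, `pick_model`, and the numeric rank dictionaries are hypothetical names introduced for illustration, the ranks simply order the table's qualitative labels, and sizes given in GB are converted assuming 1 GB = 1000 MB.

```python
from dataclasses import dataclass
from typing import Optional

# Ordinal ranks for the table's qualitative labels (an assumption, used only for sorting).
ACCURACY = {"Basic": 0, "Good": 1, "Great": 2, "Excellent": 3, "Best": 4}
SPEED = {"Slowest": 0, "Slow": 1, "Medium": 2, "Medium-Fast": 3, "Fast": 4, "Fastest": 5}


@dataclass(frozen=True)
class WhisperModel:
    name: str
    size_mb: int   # approximate download size from the table
    speed: str
    accuracy: str

    @property
    def english_only(self) -> bool:
        # Per the guide, the .en suffix marks English-only models.
        return self.name.endswith(".en")


MODELS = [
    WhisperModel("Tiny", 78, "Fastest", "Basic"),
    WhisperModel("Tiny.en", 78, "Fastest", "Good"),
    WhisperModel("Base.en", 148, "Fast", "Good"),
    WhisperModel("Base", 148, "Fast", "Good"),
    WhisperModel("Small", 488, "Medium", "Great"),
    WhisperModel("Small.en", 488, "Medium", "Great"),
    WhisperModel("Medium", 1500, "Slow", "Excellent"),
    WhisperModel("Medium.en", 1500, "Slow", "Excellent"),
    WhisperModel("Large v3", 3000, "Slowest", "Best"),
    WhisperModel("Large v3 Turbo", 1600, "Medium-Fast", "Excellent"),
]


def pick_model(max_size_mb: int, multilingual: bool) -> Optional[WhisperModel]:
    """Return the most accurate model that fits the budget, or None if none fits.

    Ties on accuracy are broken by speed, then by preferring the English-only
    variant (only reachable when multilingual support is not required).
    """
    candidates = [
        m for m in MODELS
        if m.size_mb <= max_size_mb and not (multilingual and m.english_only)
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda m: (ACCURACY[m.accuracy], SPEED[m.speed], m.english_only),
    )
```

Under these assumed rankings, `pick_model(500, multilingual=True)` selects Small, while a 2000 MB budget selects Large v3 Turbo over Medium because both are rated Excellent and Turbo is faster.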

Parakeet TDT 0.6B

NVIDIA’s streaming-capable model. Supports 25 European languages with auto-detection.

| | macOS | Windows |
| --- | --- | --- |
| Format | CoreML | ONNX (int8) |
| Size | ~1 GB | ~478 MB |
| Streaming | Yes | Yes |
| Languages | 25 | 25 |

Parakeet is excellent for real-time transcription preview and European languages. The first 1-2 transcriptions after launch may be slower as the model loads into memory.

Cloud Models

These send audio to a third-party API. They require an internet connection and an API key.

Available Cloud Models

With cloud models, your audio is sent to the provider, processed, and the transcription is returned. Some models stream text back as you speak (Real-time: Yes); others return the full transcript once you stop (Real-time: No).

| Model | Provider | Languages | Real-time | Diarization | Free Tier |
| --- | --- | --- | --- | --- | --- |
| Whisper Large v3 | Groq | 99 | No | No | Yes (generous) |
| Whisper Turbo | Groq | 99 | No | No | Yes (generous) |
| Nova 2/3 | Deepgram | 99 | Yes | Yes | $200 credit |
| Scribe v2 | ElevenLabs | 99 | Yes | Yes | Limited |
| Universal | AssemblyAI | 6 streaming / 99 batch | Yes | Yes | $50 credit |
| Voxtral Mini | Mistral | 13 | Yes | Yes | Free API |
| Saaras v3 | Sarvam AI | 22+ | Yes | No | Limited |
| Soniox | Soniox | Various | Yes | Yes | Paid |
| Aurora | XAI | Various | Yes | Yes | Limited |
| Speechmatics | Speechmatics | 39 | Yes | Yes | Paid |

AssemblyAI Universal supports 6 languages in real-time streaming (English, Spanish, French, German, Italian, Portuguese) and 99 languages when used for batch transcription of pre-recorded files.
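If you want to narrow the cloud options by feature, the following Python sketch filters a subset of the rows above by real-time and diarization support. The names `CloudModel`, `CLOUD_MODELS`, and `find_models` are hypothetical and exist only for this example; they are not a Vowen or provider API, and the data is copied by hand from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CloudModel:
    name: str
    provider: str
    realtime: bool      # streams text back while you speak
    diarization: bool   # identifies who said what


# Hand-copied subset of the cloud models table (illustrative, not exhaustive).
CLOUD_MODELS = [
    CloudModel("Whisper Large v3", "Groq", False, False),
    CloudModel("Whisper Turbo", "Groq", False, False),
    CloudModel("Nova 2/3", "Deepgram", True, True),
    CloudModel("Scribe v2", "ElevenLabs", True, True),
    CloudModel("Universal", "AssemblyAI", True, True),
    CloudModel("Voxtral Mini", "Mistral", True, True),
    CloudModel("Saaras v3", "Sarvam AI", True, False),
    CloudModel("Soniox", "Soniox", True, True),
    CloudModel("Speechmatics", "Speechmatics", True, True),
]


def find_models(
    realtime: Optional[bool] = None,
    diarization: Optional[bool] = None,
) -> List[str]:
    """Return model names matching the requested features.

    Passing None for a feature means "don't care" about that feature.
    """
    return [
        m.name
        for m in CLOUD_MODELS
        if (realtime is None or m.realtime == realtime)
        and (diarization is None or m.diarization == diarization)
    ]
```

For example, `find_models(realtime=False)` returns only the two Groq Whisper models, which matches the table: they return the full transcript once you stop speaking.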

Setting Up Cloud Models

  1. Go to Settings > Models
  2. Select a cloud model from the list
  3. Enter your API key when prompted
  4. The model is ready to use immediately

Pro tip: Groq is the most popular choice among Vowen users. It’s fast, accurate, and has a generous free tier that covers most daily use.

GPU Acceleration (Windows)

If you have an NVIDIA GPU, you can dramatically speed up local model transcription:

  1. Go to Settings > Models
  2. Scroll down to find “GPU Acceleration”
  3. Download the CUDA acceleration module
  4. Restart Vowen (or your system if needed)

With GPU acceleration, even the Large v3 model responds in 1-2 seconds on modern NVIDIA GPUs.

Choosing Between Local and Cloud

| Factor | Local | Cloud |
| --- | --- | --- |
| Privacy | Data never leaves device | Audio sent to provider |
| Speed (macOS) | Fast for small/medium models | Fast always |
| Speed (Windows) | Slow without GPU | Fast always |
| Accuracy | Good to excellent | Excellent |
| Internet | Not required | Required |
| Cost | Free | Free tier or paid API |
| Languages | 99 (Whisper) / 25 (Parakeet) | Varies by provider |
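The comparison can be read as a simple decision rule. The Python sketch below is one possible encoding of it, not how Vowen chooses a model: `Situation` and `choose_engine` are hypothetical names, and the ordering of the rules (privacy and connectivity first, then platform speed) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    platform: str                 # "macos" or "windows"
    needs_privacy: bool           # audio must never leave the device
    has_internet: bool
    has_nvidia_gpu: bool = False  # only relevant on Windows


def choose_engine(s: Situation) -> str:
    """Return "local" or "cloud" following the comparison table."""
    # Privacy and connectivity are hard constraints: cloud models send
    # audio to a provider and require an internet connection.
    if s.needs_privacy or not s.has_internet:
        return "local"
    # Local models are slow on Windows without GPU acceleration,
    # so prefer cloud there when nothing rules it out.
    if s.platform == "windows" and not s.has_nvidia_gpu:
        return "cloud"
    # Otherwise local is free, offline, and fast enough.
    return "local"
```

With this rule, a Windows machine without an NVIDIA GPU and no privacy constraint lands on a cloud model, which agrees with the Quick Recommendations entry for quick notes on Windows.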