VoiceFlow — Voice-to-Text That Never Leaves Your Mac

Features

No cloud required.
No corners cut.

Other dictation tools send your voice to remote servers. VoiceFlow was built from scratch in Rust and Swift to prove you don't have to sacrifice privacy for quality.

Hold-to-Record

Hold ⌥ Space anywhere on your Mac to record. Release to transcribe and paste instantly. No app switching, no extra steps.

Zero Data Leaves Your Mac

Every AI model runs on your hardware. Your voice is processed locally, and the audio is never saved or transmitted. VoiceFlow collects no telemetry, not even anonymized.

370 ms End-to-End

A Rust-powered pipeline transcribes, formats, and pastes in under four-tenths of a second. No network round-trips — just raw local speed.

Smart Formatting

A local LLM cleans up grammar, adds punctuation, and removes filler words. Post-processing normalizes numbers ($50,000), percentages (25%), times (3:30 PM), phone numbers, dates, and abbreviations.
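
As a rough illustration of the number post-processing described above, the sketch below turns spoken currency and percentages into digits. It is not VoiceFlow's actual implementation; the function names (`words_to_int`, `normalize`) and the small word tables are assumptions made for this example, and it only handles whole numbers followed by "dollars" or "percent".

```python
# Hypothetical sketch: spoken number words -> "$50,000" / "95%".
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: (i + 2) * 10 for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}
SCALES = {"thousand": 1_000, "million": 1_000_000}
NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES) | {"hundred"}


def words_to_int(words):
    """Convert a list of English number words to an integer."""
    total, current = 0, 0
    for w in words:
        if w in UNITS:
            current += UNITS[w]
        elif w in TENS:
            current += TENS[w]
        elif w == "hundred":
            current = max(current, 1) * 100
        elif w in SCALES:
            total += max(current, 1) * SCALES[w]
            current = 0
    return total + current


def normalize(text):
    """Replace '<number words> dollars|percent' with formatted digits."""
    tokens, out, i = text.split(), [], 0
    while i < len(tokens):
        j = i
        while j < len(tokens) and tokens[j] in NUMBER_WORDS:
            j += 1
        if j > i and j < len(tokens) and tokens[j] in ("dollars", "percent"):
            value = words_to_int(tokens[i:j])
            out.append(f"${value:,}" if tokens[j] == "dollars" else f"{value}%")
            i = j + 1  # skip the number words and the unit word
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)
```

For example, `normalize("the budget is like fifty thousand dollars")` yields `the budget is like $50,000`. Times, dates, and phone numbers would need their own rules and are left out of this sketch.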

Say It Again to Fix It

Re-say a sentence, or just the word you got wrong, and VoiceFlow replaces your last dictation instead of appending. Correction cues like "scratch that", together with a local LLM, decide whether an utterance redoes the last dictation or starts new text. Hold ⌥⇧ Space to force a replace or to speak an edit.
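
One way to picture the redo-versus-new decision is the sketch below. It is an assumption-laden stand-in, not the product's logic: VoiceFlow uses a local LLM for this judgment, whereas this example substitutes plain string similarity from Python's `difflib`. The cue list (beyond "scratch that"), the `is_redo` name, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Only "scratch that" comes from the page; the other cues are illustrative.
CORRECTION_CUES = ("scratch that", "no wait", "i meant")


def is_redo(previous, utterance, threshold=0.6):
    """Guess whether `utterance` re-says `previous` rather than adding new text."""
    text = utterance.lower().strip()
    # An explicit correction cue always means "replace the last dictation".
    if any(text.startswith(cue) for cue in CORRECTION_CUES):
        return True
    # Otherwise treat a near-duplicate of the last dictation as a redo.
    similarity = SequenceMatcher(None, previous.lower(), text).ratio()
    return similarity >= threshold
```

A character-level ratio like this would miss the single-word case ("or just the word you got wrong"), which is one reason a language model is a better fit for the real decision.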

App-Aware Context

Detects your frontmost app and adjusts tone automatically. Professional for email, casual for Slack, technical for code editors. Custom prompts per app.
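
Conceptually, app-aware context is a lookup from the frontmost app to a formatting prompt, with per-app overrides taking priority. The sketch below assumes apps are identified by macOS bundle identifier; the mapping, the default tone, the prompt wording, and the `formatting_prompt` name are hypothetical and only illustrate the idea.

```python
# Hypothetical tone table keyed by macOS bundle identifier.
TONE_BY_APP = {
    "com.apple.mail": "professional",
    "com.tinyspeck.slackmacgap": "casual",
    "com.microsoft.VSCode": "technical",
}
DEFAULT_TONE = "neutral"  # assumed fallback for unlisted apps


def formatting_prompt(bundle_id, custom_prompts=None):
    """Return the formatting instruction for the frontmost app."""
    custom_prompts = custom_prompts or {}
    # A user-defined prompt for this app overrides the built-in tone.
    if bundle_id in custom_prompts:
        return custom_prompts[bundle_id]
    tone = TONE_BY_APP.get(bundle_id, DEFAULT_TONE)
    return f"Rewrite the dictation in a {tone} tone."
```

Keeping the overrides in a separate dictionary means a custom prompt can be added or removed without touching the defaults.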

Visual Context (VLM)

A local vision model reads your screen to extract names, project terms, and writing context — so proper nouns are spelled correctly without a cloud lookup.

Learns Your Corrections

Fix a word once and VoiceFlow remembers. It monitors your edits and applies learned corrections to future dictations automatically.
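
A minimal sketch of how learning from edits could work, assuming the app can compare the text it pasted with the text after you edited it. The word-level diff uses Python's standard `difflib`; the function names and the dictionary format are assumptions for illustration, not VoiceFlow's internals.

```python
from difflib import SequenceMatcher


def learn_corrections(dictated, edited):
    """Return {original words: replacement words} for each replaced span."""
    a, b = dictated.split(), edited.split()
    learned = {}
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "replace":
            learned[" ".join(a[i1:i2])] = " ".join(b[j1:j2])
    return learned


def apply_corrections(text, learned):
    """Apply learned single-word corrections to a new dictation."""
    # Multi-word keys are ignored here to keep the sketch short.
    return " ".join(learned.get(word, word) for word in text.split())
```

For instance, editing "meet jon at noon" to "meet Jon at noon" yields `{"jon": "Jon"}`, which `apply_corrections` then applies to later text.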

Voice Snippets

Say "my signature" and it expands to your full sign-off. Create unlimited custom trigger phrases that expand into any text you want.
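
Snippet expansion can be thought of as a case-insensitive find-and-replace on whole phrases. The sketch below is one possible shape for it; `expand_snippets` and the sample sign-off are hypothetical.

```python
import re


def expand_snippets(text, snippets):
    """Replace each trigger phrase in `text` with its expansion."""
    for trigger, expansion in snippets.items():
        # \b keeps "my signature" from matching inside a longer word.
        pattern = re.compile(r"\b" + re.escape(trigger) + r"\b", re.IGNORECASE)
        # A function replacement inserts the expansion literally,
        # so backslashes in the expansion are not treated as escapes.
        text = pattern.sub(lambda match: expansion, text)
    return text
```

With `{"my signature": "Best,\nAlex"}`, the input "Thanks, my signature" becomes "Thanks, Best," followed by "Alex" on a new line.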

Summarize This

Say "summarize this" and a local LLM reads the current text field and appends a bullet-point summary. A live status pill shows progress through each step.

Accuracy

Cloud-grade accuracy, without the cloud

Multiple AI models run in tandem on your Mac: speech-to-text captures every word, a language model formats it, and a vision model reads your screen for context. The same quality you'd expect from a cloud API.

Strips filler words, false starts, and verbal corrections
Numbers, currency, times, percentages, and phone numbers formatted automatically
Per-persona vocabulary lists bias the LLM toward your domain terms and proper nouns
Vision model reads your screen to spell names and terms correctly
Learns from your corrections and improves over time

Raw speech

um so the budget is like fifty thousand dollars and we need it done by january fifteenth uh that's about ninety five percent of what we asked for period

✓ Formatted output

The budget is $50,000 and we need it done by January 15. That's about 95% of what we asked for.

Filler words removed · Currency formatted · Date normalized · Percentage converted · Punctuation added

Customizable

Your workflow, your rules

Unlike cloud dictation tools that give you a single mode, VoiceFlow lets you control everything — AI models, visual context, per-app formatting, and more.

Toggle visual context (VLM) for screen-aware dictation
Edit per-persona vocabulary lists with a chip-style tag picker
Swap AI models to balance speed vs. quality
Customize formatting prompts per app
Create voice snippets, manage corrections, and tune spacing

Active Persona: Software Engineer
Vocabulary Bias: 42 terms
Formatting Level: Moderate
Spacing Mode: Context-Aware
Voice Commands
Visual Context (VLM)
Correction Learning
Speech-to-Text: Parakeet 0.6B
Formatting LLM: Bonsai 8B

Performance

Near-instant by design

VoiceFlow runs a 16K-token context window with the persona, vocabulary, on-screen text, and formatting rules permanently warm in the prompt prefix. Every dictation re-uses that cache — only your newly-spoken words get evaluated. The chart below shows real prompt-eval latency from this build: even long utterances stay well under a second.
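
The prefix re-use described above can be sketched as follows. This is a simplified model that assumes the prompt is a list of tokens and that anything matching the previously evaluated prompt is already cached; real inference runtimes do this on token IDs inside the model's key/value cache. `PrefixCache` and its method names are hypothetical.

```python
def common_prefix_len(cached, prompt):
    """Count how many leading tokens the two sequences share."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


class PrefixCache:
    """Tracks the last evaluated prompt so only the new suffix is processed."""

    def __init__(self):
        self.tokens = []

    def tokens_to_evaluate(self, prompt):
        keep = common_prefix_len(self.tokens, prompt)
        self.tokens = list(prompt)  # remember this prompt for the next call
        return prompt[keep:]        # only the part after the shared prefix
```

On the first call the whole prompt is returned, which corresponds to the one-time warm-up after launch. On later calls with the same persona, vocabulary, and rules up front, only the newly spoken words come back.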

~370 ms: Typical end-to-end (speaking → typed text)
96 ms: AI formatting (with memorized settings)
147 ms: Voice recognition (entirely on-device)
~12K: Words of context held in memory (persona, vocabulary, rules)

Processing time grows slowly with dictation length

Real measurements from 31 consecutive dictations on Apple Silicon (v2.0.1). VoiceFlow keeps your persona, vocabulary, and formatting rules memorized in advance — so the only time spent processing is the words you actually said.

The first dictation after launch takes ~5 seconds while VoiceFlow memorizes your settings once. Every dictation after that runs in well under a second, from the moment you stop speaking to formatted text in your app. The AI generates roughly 100 words per second; voice recognition completes in about 150 milliseconds. Everything happens entirely on your Mac.

Time reclaimed

Speak it. Save days.

Words leave your mouth at ~140 wpm. They leave your fingers at ~40 wpm. VoiceFlow adds 370 ms of processing per dictation, and that's it. The gap compounds into days, even weeks, of your life every year, with every word formatted and pasted with startling accuracy.

3.5×: Faster than typing (140 vs 40 wpm)
0.37 s: From voice to typed text (formatted, ready to use)
4.5 days: Saved every year at 1,000 words/day
22 days: Saved every year at 5,000 words/day

Days reclaimed per year, by daily dictation volume

Calculated as (words/day ÷ 40 wpm typing) minus (words/day ÷ 140 wpm speaking), annualized over 365 days. VoiceFlow's 370 ms per-utterance processing is fixed and doesn't materially affect the math.
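
The formula above can be checked with a few lines of arithmetic. The sketch assumes "days" means full 24-hour days and, as the text says, ignores the fixed per-utterance processing time; the function name is made up for this example.

```python
def days_reclaimed(words_per_day, typing_wpm=40, speaking_wpm=140, days=365):
    """Days saved per year by speaking instead of typing."""
    # Minutes spent typing minus minutes spent speaking, per day.
    minutes_saved_per_day = (words_per_day / typing_wpm
                             - words_per_day / speaking_wpm)
    # Annualize, then convert minutes to 24-hour days.
    return minutes_saved_per_day * days / (60 * 24)
```

`days_reclaimed(1000)` comes to about 4.5 days and `days_reclaimed(5000)` to about 22.6 days, in line with the figures shown.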

1,000 words a day is one focused email or a page of notes; dictate that volume and VoiceFlow gives you back about 4.5 full days (roughly 109 hours) every year. Heavy writers reclaim several times that. Every word is transcribed, punctuated, formatted, and pasted on-device with no cloud round-trip in the loop.

Redesigned in v2.0

Every dictation, every day, at a glance

VoiceFlow v2.0 ships a complete settings rebuild on a new "Liquid Glass" design system, paired with the Parakeet + Bonsai model defaults. The new Insights dashboard tracks your dictation pace, the apps you dictate into most, and your daily activity streak. Personas carry editable vocabulary lists that bias the LLM toward your domain terms — everything is computed and stored on-device.

Figure: VoiceFlow Insights dashboard showing words-per-minute, total words dictated, app usage breakdown, and a multi-week activity streak.

Insights, personas, and full control

A smile-shaped speed gauge labels your dictation pace as Steady, Fast, or Top. Per-app bars show where you dictate most. A streak heatmap tracks every active day. Built-in personas come seeded with domain vocabulary — Software Engineer ships with kubectl, Postgres, Terraform, gRPC and more, and you can add your own with one click. Every number is computed locally; nothing syncs anywhere.

Comparison

The dictation tool that doesn't compromise

See how VoiceFlow stacks up against popular cloud-based alternatives.

Feature	VoiceFlow	Wispr Flow	macOS Dictation
Price	Free forever	$12–15/mo	Free (built-in)
Privacy	100% on-device	Cloud-processed	Partial — sends to Apple
Internet required	Never	Always	For enhanced mode
Smart formatting	Local LLM	Cloud AI	No
App-aware context	Email, Slack, code	Email, Slack, code	No
Voice punctuation	Full command set	Yes	Basic
Voice snippets	Yes	Yes	No
Visual context (VLM)	Free — local VLM	Pro plan only	No
Correction learning	Learns from edits	Auto-learns	No
Number formatting	Currency, %, time, dates	Yes	No
Summarize text	Voice-triggered	No	No
Custom AI models	Swap anytime	Locked to vendor	No
Open-source	Fully auditable	Closed-source	Closed-source
Data collection	None	Voice sent to cloud	Voice sent to Apple

Competitor information based on publicly available documentation as of February 2026.

Start dictating within minutes

Download VoiceFlow, run the setup wizard, pick your AI models, and you're ready. No account. No API key. No subscription. Free forever.

Download for macOS

Requires macOS 15 Sequoia or later · Apple Silicon (M1 or newer) · 16 GB RAM recommended

Already have VoiceFlow? This release adds automatic updates — re-download once, and future versions install themselves in the background.

Lightning fast · Startlingly accurate · Fully private · Forever free
