Push-to-talk dictation that types straight into any Mac app — on-device, no account, no subscription.
Hold ⌥Space (Option+Space), speak, release: your words land at the cursor in Notes, Slack, Mail, Notion, ChatGPT, or any other Mac app, transcribed offline on the Apple Neural Engine.
Powered by OpenAI Whisper (via WhisperKit) and Parakeet (via FluidAudio), running on the Apple Neural Engine.
Read the source code
See it work
Speech to text in any Mac app.
Hold the hotkey. Speak a full sentence. Release. Your voice is transcribed and typed at the cursor — in an email, a Slack thread, a Notion page, a Google Doc, ChatGPT, or a code editor — without ever leaving your Mac.
Hold ⌥Space to dictate · Release to finish · Esc cancels
Hold ⌥Space, speak, release: words appear at the cursor. Nothing leaves the Mac.
Three steps
How to use voice to text on Mac and MacBook.
01
Hold ⌥Space
Press and hold the hotkey in any app to start dictating. Nothing leaves the Mac.
02
Speak naturally
Full sentences, paragraphs, casual or technical. Punctuation is inferred; you can add it yourself if you prefer.
03
Release — words are typed
Let go of the keys. Your text is typed straight into the focused app at the cursor.
What you get
Why VoiceToText for Mac speech to text?
Everything dictation should be on a Mac — and nothing it shouldn’t. Free, offline by default, no account, no telemetry.
Your audio never leaves the Mac
Local models run on-device by default. Zero network calls. No accounts, no servers, no telemetry.
Hold to talk. Release to insert.
Hold ⌥Space, speak, let go. Text appears at the cursor. No toggles, no accidental cutoffs, no mode to escape.
Instant start, minimal battery
Pure SwiftUI, menu-bar only. Cold-starts in under a second and barely registers in Activity Monitor — no Electron overhead.
Faster than typing — in any app
A long Slack reply, a thoughtful email, a meeting note, a search query, an AI prompt — talking is faster than typing. Hold, speak, release.
Switch engines without switching apps
Parakeet and Whisper run on-device by default. Add an OpenAI key to unlock cloud models. One setting, no reinstall.
Works in every app that accepts text
Slack, Mail, Notes, browser address bars, terminals, code editors — if macOS puts a cursor there, VoiceToText types into it.
Optional
Need higher accuracy? Add an OpenAI key and pick a cloud model.
Local models run by default — no key required. Paste an OpenAI API key in Settings → Cloud to unlock the models below. Audio only leaves your Mac when a cloud model is active.
GPT-4o Transcribe
Best accuracy available. Use it when correctness matters more than cost — accents, technical jargon, 99+ languages.
GPT-4o Mini Transcribe
The everyday cloud pick. Accuracy close to GPT-4o Transcribe at a fraction of the cost — good default for high-volume use.
Whisper-1
OpenAI’s original hosted model. Lowest cost per minute — fine for straightforward dictation in major languages.
Bring your own OpenAI API key — stored in your Mac’s Keychain, never sent anywhere else.
Audio only leaves your Mac when a cloud model is the active engine. Local models never touch the network.
Swap back to Parakeet or Whisper-large at any time. No account, no lock-in.
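The routing rule described above (local engines stay on-device, cloud engines need your own key, and audio leaves the Mac only in the cloud case) can be modeled as a small lookup. This is an illustrative sketch, not the app's actual Swift code: the `Engine` type, the engine identifiers, and the `audio_leaves_mac` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Engine:
    name: str
    is_cloud: bool  # True if transcription happens on OpenAI's servers


# Hypothetical catalog mirroring the engines named on this page.
ENGINES = {
    "parakeet": Engine("Parakeet", is_cloud=False),
    "whisper-large": Engine("Whisper-large", is_cloud=False),
    "gpt-4o-transcribe": Engine("GPT-4o Transcribe", is_cloud=True),
    "gpt-4o-mini-transcribe": Engine("GPT-4o Mini Transcribe", is_cloud=True),
    "whisper-1": Engine("Whisper-1", is_cloud=True),
}


def audio_leaves_mac(engine_id: str, api_key: Optional[str]) -> bool:
    """Return True only when a cloud engine is active and a key is present."""
    engine = ENGINES[engine_id]
    if not engine.is_cloud:
        return False  # local models never touch the network, key or no key
    if not api_key:
        raise ValueError(f"{engine.name} needs an OpenAI API key")
    return True
```

Under these assumptions, `audio_leaves_mac("parakeet", None)` is `False`, and a saved key changes nothing until a cloud engine is selected.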
Real-time
Watch your words appear as you speak.
Pick a streaming engine — Scribe v2 Realtime (ElevenLabs) or GPT-4o Transcribe Realtime (OpenAI) — and your speech is transcribed live, word by word, then pasted at the cursor when you finish. No waiting for the whole clip.
Press ⌥Space to dictate · Press again to finish · Esc cancels
Words stream in live as you speak, then paste at the cursor in any Mac app. Streaming sends audio to your chosen provider under your own key.
Use cases
Speak in every Mac app — from Notes and Slack to ChatGPT and Cursor.
Whatever app is in focus gets your words at the cursor: a long Slack thread, a meeting note in Apple Notes, a draft in Notion, a Gmail reply, a browser search, a ChatGPT question, a Cursor refactor.
Same hotkey, same gesture every time: hold to talk, release to insert. In local mode, your voice never leaves the Mac.
Apple Notes
Slack
Messages
Mail
Gmail
Notion
Obsidian
Pages
Google Docs
Safari
Chrome
ChatGPT
Claude.ai
Cursor
Claude Code
VS Code
Warp
Terminal
Slack · #design · ⌥Space
"running late to standup, just finishing a customer call. start without me, I'll catch up after."
Spoken into Slack in one push. No window switch, no copy-paste.
VoiceToText vs. Wispr Flow, Superwhisper, MacWhisper, and Apple Dictation.
Same core idea — speech to text into any Mac app. Different terms: permanently free, no subscription, no account, offline by default, and source code you can read and audit.
Feature comparison between VoiceToText, Wispr Flow, Superwhisper, MacWhisper, and Apple Dictation
Voice to text on Mac — frequently asked questions.
Answers to what developers, writers, and privacy-conscious users ask before installing this speech to text Mac app.
Is VoiceToText free?
Yes — completely free, forever. VoiceToText is open source (OSI-approved license) with no paid tiers, no accounts, and no in-app purchases. Download it free.
Does it work offline? Is my voice data sent anywhere?
Yes, it works fully offline by default. Local models (Whisper, Parakeet) run on the Apple Neural Engine — audio never leaves your Mac and the app makes zero network calls. Cloud models (OpenAI GPT-4o Transcribe, etc.) are strictly opt-in: audio is sent directly to OpenAI under your own API key only when you explicitly select a cloud engine. VoiceToText itself never receives your audio.
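For readers who want to see what "sent directly to OpenAI under your own API key" looks like on the wire, here is a minimal sketch that builds (but does not send) a request to OpenAI's audio transcription endpoint. It is not the app's implementation, which is written in Swift. The function name, the `clip.wav` filename, and the WAV content type are assumptions made for illustration.

```python
import urllib.request
import uuid

OPENAI_TRANSCRIBE_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(
    audio_bytes: bytes,
    api_key: str,
    model: str = "gpt-4o-mini-transcribe",
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the audio and model name."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field: which hosted model should transcribe the clip.
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f'{model}\r\n'.encode(),
        # Form field: the recorded audio (filename and type are assumed).
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="clip.wav"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'.encode(),
        audio_bytes,
        f'\r\n--{boundary}--\r\n'.encode(),
    ])
    return urllib.request.Request(
        OPENAI_TRANSCRIBE_URL,
        data=body,
        method="POST",
        headers={
            # The user's own key authenticates the call.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The only host in this request is `api.openai.com`, so a network monitor should show no intermediate server in a setup like this one.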
How do I use voice to text on my Mac?
Install from the DMG, grant Microphone and Accessibility permissions, then hold ⌥Space in any app, speak, and release. Text is typed at the cursor, with no panel and no copy-paste. Works identically on MacBook Air, MacBook Pro, iMac, Mac mini, and Mac Studio (M1 or newer).
Why does it need Accessibility permission?
Accessibility permission is how macOS lets one app type into another. VoiceToText uses it solely to inject transcribed text at the cursor — it does not read your screen, monitor keystrokes, or access any other app’s data. The source code is public if you want to verify exactly how the permission is used.
What are the system requirements?
macOS 14 Sonoma or later, and an Apple Silicon Mac (M1 or newer). Intel Macs are not supported because the local models rely on the Apple Neural Engine. Cloud models work on any supported Mac with an internet connection.
How accurate is it? Which models does it use?
Local accuracy is near OpenAI Whisper quality for English; Parakeet is faster for English and Whisper-large is the best local option for 99 languages. For the highest accuracy across accents and technical jargon, bring your own OpenAI API key (stored in the macOS Keychain) and switch to GPT-4o Transcribe, GPT-4o Mini Transcribe, or Whisper-1 in Settings.
How is VoiceToText different from Apple Dictation, Apple Intelligence, or Wispr Flow?
Apple Dictation is toggle-based and tied to Apple’s servers. Apple Intelligence writing tools rewrite text after the fact — they are not real-time dictation at the cursor. Wispr Flow is a paid subscription that always sends audio to the cloud. VoiceToText is free, open source, push-to-talk, on-device by default, and lets you bring your own OpenAI key when you want maximum accuracy. See the full comparison.
What languages are supported?
WhisperKit supports 99 languages out of the box — the same coverage as OpenAI’s Whisper model. Parakeet (FluidAudio) is English-only but faster for English speakers. If you dictate in a non-English language, select a Whisper model in Settings → Models.
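The rule of thumb in this answer can be written as a tiny helper. This is a sketch under stated assumptions: the function name and the returned engine identifiers are hypothetical, and the app itself leaves the choice to you in Settings rather than picking automatically.

```python
def pick_local_engine(language_code: str) -> str:
    """Choose a local engine from a language code such as 'en' or 'de-DE'."""
    # Reduce 'en-US' or 'pt_BR' to the primary language subtag.
    primary = language_code.strip().lower().replace("_", "-").split("-")[0]
    if not primary:
        raise ValueError("language_code must not be empty")
    # Parakeet is English-only; Whisper covers the other languages.
    return "parakeet" if primary == "en" else "whisper"
```

For example, `pick_local_engine("en-US")` returns `"parakeet"`, while `pick_local_engine("de")` returns `"whisper"`.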
What apps does VoiceToText work in?
Any Mac app with a text field. Apple Notes, Notion, Obsidian, Bear, Pages, Google Docs, Microsoft Word, Slack, Messages, Mail, Gmail, Outlook, WhatsApp, Discord, Safari and Chrome address bars, ChatGPT, Claude.ai — if macOS puts a cursor there, your voice types into it. No app-specific setup.
Can I dictate into Claude Code, Cursor, or other AI coding tools?
Yes. VoiceToText types into whatever app has focus — Claude Code, Codex CLI, Cursor, Copilot Chat, ChatGPT, any terminal, any editor. Hold the hotkey, speak your prompt, release. No switching windows, no copy-paste.
Do you collect any usage data or telemetry?
No. No accounts, no analytics, no first-party servers. The app makes zero network calls with a local model. If you opt in to a cloud model, audio goes directly from your Mac to OpenAI — VoiceToText is never in that path. The repo is public; verify it yourself or watch traffic with Little Snitch.
Ready to dictate
Download VoiceToText — voice to text for macOS.
One DMG. Drag to Applications. Grant Microphone and Accessibility. Hold ⌥Space and speak.
1
Drag to Applications
Drag VoiceToText to /Applications. Takes 5 seconds.
2
Launch the app
It lives in the menu bar.
3
Grant Microphone and Accessibility
One-time prompt — mic hears you, accessibility types into whatever app you’re in. Revoke anytime in System Settings.
4
Hold ⌥Space, speak, release.
Your words are typed at the cursor.
Why two permissions? Microphone lets the app hear you. Accessibility lets it type into whatever app you’re in. Both stay on-device. Revoke anytime in System Settings.
Requirements: macOS 14 Sonoma or later · Apple Silicon (M1 or newer).