100% offline, open-source, Voice Dictation app for macOS
SpeakType
What is SpeakType?
SpeakType is a privacy-first voice dictation tool for macOS. Transcription runs locally using OpenAI's Whisper model via WhisperKit. Optional text enrichment can run through a local Ollama model or through OpenAI. Audio never leaves your Mac; when cloud enrichment is enabled, only the transcript text is sent to OpenAI.
- Privacy First - Audio stays local, with explicit Local vs Cloud enrichment modes
- Lightning Fast - Optimized for Apple Silicon
- Works Everywhere - Any app, any text field
- Open Source - Audit every line of code yourself
Enrichment Modes
- Raw - Paste the transcript as-is
- Clean - Fix punctuation, casing, spacing, and obvious grammar
- Polish - Improve readability while preserving intent
- Email - Rewrite the dictation as a ready-to-send email body
Enrichment Providers
- OpenAI (Cloud) - Uses the OpenAI Responses API with your own API key
- Ollama (Local) - Uses a local Ollama model running at http://localhost:11434
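Before enabling local enrichment, it can help to confirm that an Ollama server is answering at that address. The following is a minimal sketch, assuming curl is installed. The function names and the OLLAMA_URL variable are hypothetical helpers for illustration, not part of SpeakType; /api/tags is the Ollama endpoint that lists locally installed models.

```shell
# Base URL from the README; OLLAMA_URL is a hypothetical override variable.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Print the endpoint that lists locally installed Ollama models.
ollama_tags_url() {
  printf '%s/api/tags\n' "$OLLAMA_URL"
}

# Return 0 if a server answers at the endpoint within 2 seconds, non-zero otherwise.
ollama_is_up() {
  curl --silent --fail --max-time 2 "$(ollama_tags_url)" > /dev/null
}
```

To use it, run ollama_is_up && echo "Ollama is running" in the same shell. A non-zero exit usually means the Ollama server has not been started.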
Ollama Model Manager
- Browse installed local Ollama models from Settings → AI
- Download curated local models in-app, including Gemma 4, Qwen 3, and Llama 3.2 options
- See exact installed disk usage plus family, parameter size, quantization, and context metadata
- Run on-demand speed tests to measure tokens/sec on your Mac
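For reference, Ollama's generate API reports eval_count (tokens produced) and eval_duration (in nanoseconds) in its final response, so a tokens/sec figure can be derived from those two numbers. Whether SpeakType's built-in speed test uses exactly this calculation is an assumption; tokens_per_sec is a hypothetical helper name.

```shell
# Compute tokens per second from Ollama's eval_count and eval_duration.
#   $1 = eval_count     (number of generated tokens)
#   $2 = eval_duration  (generation time in nanoseconds)
# Fails with a non-zero status if the duration is not positive.
tokens_per_sec() {
  awk -v n="$1" -v d="$2" 'BEGIN {
    if (d <= 0) exit 1
    printf "%.1f\n", n * 1e9 / d
  }'
}
```

For example, tokens_per_sec 120 2000000000 describes 120 tokens generated in 2 seconds.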
Installation
Requirements
- macOS 13.0+ (Ventura or newer)
- Apple Silicon (M1+) recommended
- 2GB available storage (for AI models)
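These requirements can be checked from a terminal before installing. The sketch below is only an illustration: major_at_least is a hypothetical helper, while sw_vers and uname are standard macOS commands.

```shell
# Return 0 if the major component of version string $1 is at least $2.
major_at_least() {
  major="${1%%.*}"   # strip everything after the first dot, e.g. 14.5 -> 14
  [ "$major" -ge "$2" ]
}

# On a Mac, the checks would look like this (left commented out so the
# snippet only defines the helper when pasted into a shell):
#   major_at_least "$(sw_vers -productVersion)" 13 && echo "macOS version OK"
#   [ "$(uname -m)" = "arm64" ] && echo "Apple Silicon"
#   df -h /    # shows available disk space
```

Note that uname -m can report x86_64 on Apple Silicon when the terminal itself runs under Rosetta.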
Download
- Download SpeakType.dmg
- Drag SpeakType to Applications
- Grant Microphone + Accessibility + Documents Folder permissions
- Download an AI model from Settings → AI Models
Press fn to start dictating.
Build from Source
git clone https://github.com/karansinghgit/speaktype.git
cd speaktype
make build && make run
Usage
- Press the hotkey (fn by default)
- Speak your text
- Release hotkey
- Text is optionally enriched
- Final text appears!
Tips:
- Speak naturally - Whisper handles accents well
- Say punctuation: "comma", "period", "question mark"
- Best results with 3-10 second clips
- Use Settings → AI to download or select your Ollama model before enabling local enrichment
Development
make build # Build debug
make run # Run app
make clean # Clean build
make test # Run tests
make dmg # Create DMG installer
Current Issues
⚠️ When a model is loaded for the first time, or when you switch to another model, there is a startup delay of 30-60 seconds.
The first transcription will therefore be very slow, but dictation returns to near-instant speed once the model has warmed up.
Project Structure
speaktype/
├── App/ # Entry point
├── Views/ # SwiftUI interface
├── Models/ # Data models
├── Services/ # Core functionality
├── Controllers/ # Window management
└── Resources/ # Assets & config
Tech Stack
- Swift 5.9+ / SwiftUI + AppKit
- WhisperKit - Local Whisper inference
- KeyboardShortcuts - Global hotkeys
- AVFoundation - Audio capture
Contributing
- Fork & clone
- Create a branch: git checkout -b feature/my-feature
- Make changes and run make lint
- Submit a PR
License
MIT License - see LICENSE for details.
Credits
- WhisperKit by Argmax
- OpenAI Whisper
Made with ❤️ for developers
Privacy-first • Open Source
