100% offline, open-source, Voice Dictation app for macOS

Swift 93.5%
Shell 4.6%
Makefile 1.8%
Python 0.1%

Alberto Hernandez 348696c1d3 fix/xcode-compiler-issues		2026-04-23 10:58:24 -04:00
.remember/tmp	fix: ignore synthetic F19 key in modifier combo handler	2026-04-07 21:54:59 -04:00
dmg-assets	add dmg installer with custom background	2026-02-03 16:00:49 +05:30
scripts	Stop installed app when launching dev build	2026-03-24 15:25:03 +05:30
speaktype	fix/xcode-compiler-issues	2026-04-23 10:58:24 -04:00
speaktype.xcodeproj	fix/xcode-compiler-issues	2026-04-23 10:58:24 -04:00
speaktypeTests	Merge text enrichment and Ollama model manager into main	2026-04-23 10:41:18 -04:00
speaktypeUITests	Add comprehensive testing suite and UI improvements	2026-01-12 17:08:34 +05:30
.editorconfig	chore: add editor and git configuration	2026-01-07 03:37:08 +05:30
.gitattributes	chore: add editor and git configuration	2026-01-07 03:37:08 +05:30
.gitignore	remove extra docs	2026-02-03 16:58:26 +05:30
.swiftlint.yml	chore: add SwiftLint configuration	2026-01-07 03:37:10 +05:30
AGENTS.md	AGENTS.md file	2026-04-10 17:17:46 -04:00
CHANGELOG.md	release: v1.0.29	2026-03-24 15:47:41 +05:30
CLAUDE.md	Merge text enrichment and Ollama model manager into main	2026-04-23 10:41:18 -04:00
default.profraw	revamp ui	2026-01-25 19:57:35 +05:30
image.png	update img	2026-03-22 02:50:26 +05:30
LICENSE	docs: add MIT License	2026-02-17 23:10:47 +05:30
Makefile	Add dev app workflow and stabilize recorder pipeline	2026-03-12 02:59:04 +05:30
README.md	Merge text enrichment and Ollama model manager into main	2026-04-23 10:41:18 -04:00
RELEASE.md	fix: doc cleanup	2026-02-15 23:36:46 +05:30

SpeakType

Fast Voice-to-Text for macOS

Press a hotkey, speak, and instantly paste text anywhere on your Mac.

What is SpeakType?

SpeakType is a privacy-first voice dictation tool for macOS. Transcription runs locally on OpenAI's Whisper model via WhisperKit. Optional text enrichment runs either through a local Ollama model or through OpenAI. Audio never leaves your Mac; when cloud enrichment is enabled, only the transcript text is sent to OpenAI.

Privacy First - Audio stays local, with explicit Local vs Cloud enrichment modes
Lightning Fast - Optimized for Apple Silicon
Works Everywhere - Any app, any text field
Open Source - Audit every line of code yourself

Enrichment Modes

Raw - Paste the transcript as-is
Clean - Fix punctuation, casing, spacing, and obvious grammar
Polish - Improve readability while preserving intent
Email - Rewrite the dictation as a ready-to-send email body
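
One way to picture the four modes is as an enum that maps each mode to the instruction given to the language model. This is only a sketch: `EnrichmentMode`, `instruction`, and `makePrompt` are hypothetical names chosen for illustration, and the instruction strings are paraphrased from the list above rather than taken from SpeakType's source.

```swift
import Foundation

// Hypothetical type for illustration; not SpeakType's actual API.
enum EnrichmentMode: String, CaseIterable {
    case raw, clean, polish, email

    /// Instruction handed to the language model; nil means "skip enrichment".
    var instruction: String? {
        switch self {
        case .raw:
            return nil
        case .clean:
            return "Fix punctuation, casing, spacing, and obvious grammar. Change nothing else."
        case .polish:
            return "Improve readability while preserving the speaker's intent."
        case .email:
            return "Rewrite the dictation as a ready-to-send email body."
        }
    }
}

/// Builds the prompt for a transcript, or returns nil when the
/// transcript should be pasted unchanged (Raw mode).
func makePrompt(for transcript: String, mode: EnrichmentMode) -> String? {
    guard let instruction = mode.instruction else { return nil }
    return "\(instruction)\n\nTranscript:\n\(transcript)"
}
```

Returning nil for Raw lets the caller skip the enrichment provider entirely, which matches the "paste the transcript as-is" behavior described above.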

Enrichment Providers

OpenAI (Cloud) - Uses the OpenAI Responses API with your own API key
Ollama (Local) - Uses a local Ollama model running at http://localhost:11434
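
For readers curious what a local enrichment call might look like, here is a minimal sketch that builds a request for Ollama's `/api/generate` endpoint at the address above. The helper name `makeOllamaRequest` and the model name used in the usage note are assumptions for illustration; SpeakType's own networking code may be organized differently.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

// JSON body for Ollama's /api/generate endpoint.
struct OllamaGenerateRequest: Codable {
    let model: String
    let prompt: String
    let stream: Bool  // false asks for one complete response instead of chunks
}

// Hypothetical helper: builds the request but does not send it.
func makeOllamaRequest(
    prompt: String,
    model: String,
    baseURL: URL = URL(string: "http://localhost:11434")!
) throws -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("api/generate"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body = OllamaGenerateRequest(model: model, prompt: prompt, stream: false)
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

A caller would pass the result to `URLSession`, for example `makeOllamaRequest(prompt: text, model: "llama3.2")`, where the model name is whichever model is installed locally. Because the request targets localhost, the transcript stays on the machine.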

Ollama Model Manager

Browse installed local Ollama models from Settings → AI
Download curated local models in-app, including Gemma 4, Qwen 3, and Llama 3.2 options
See exact installed disk usage plus family, parameter size, quantization, and context metadata
Run on-demand speed tests to measure tokens/sec on your Mac
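
A tokens/sec figure like the one the speed test reports can be derived from the timing fields Ollama includes in a completed `/api/generate` response: `eval_count` (tokens generated) and `eval_duration` (nanoseconds). The sketch below assumes that approach; the type name `OllamaTimings` is hypothetical, and the app's actual measurement may differ.

```swift
import Foundation

// Subset of the timing fields in Ollama's final /api/generate response.
struct OllamaTimings: Decodable {
    let evalCount: Int     // number of tokens generated
    let evalDuration: Int  // generation time in nanoseconds

    enum CodingKeys: String, CodingKey {
        case evalCount = "eval_count"
        case evalDuration = "eval_duration"
    }

    /// Generation throughput; 0 when no duration was reported.
    var tokensPerSecond: Double {
        guard evalDuration > 0 else { return 0 }
        return Double(evalCount) * 1_000_000_000 / Double(evalDuration)
    }
}

/// Decodes the timing fields from a raw response body.
/// Other keys in the response are ignored by JSONDecoder.
func parseTimings(from data: Data) throws -> OllamaTimings {
    try JSONDecoder().decode(OllamaTimings.self, from: data)
}
```

For example, a response reporting 100 tokens over 2,000,000,000 ns works out to 50 tokens/sec.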

Installation

Requirements

macOS 13.0+ (Ventura or newer)
Apple Silicon (M1+) recommended
2 GB of available storage (for AI models)

Download

Download Latest Release

Download SpeakType.dmg
Drag SpeakType to Applications
Grant Microphone + Accessibility + Documents Folder permissions
Download an AI model from Settings → AI Models

Press fn to start dictating.

Build from Source

git clone https://github.com/karansinghgit/speaktype.git
cd speaktype
make build && make run

Usage

Press hotkey (fn by default)
Speak your text
Release hotkey
The transcript is enriched if an enrichment mode other than Raw is selected
The final text is pasted into the active text field

Tips:

Speak naturally - Whisper handles accents well
Say punctuation: "comma", "period", "question mark"
Best results with 3-10 second clips
Use Settings → AI to download or select your Ollama model before enabling local enrichment

Development

make build          # Build debug
make run            # Run app
make clean          # Clean build
make test           # Run tests
make dmg            # Create DMG installer

Current Issues

⚠️ Loading a model for the first time, or switching to another model, incurs a startup delay of 30-60 seconds.

As a result, the first transcription after a load is very slow. Once the model is warmed up, dictation returns to near-instant speed.

Project Structure

speaktype/
├── App/           # Entry point
├── Views/         # SwiftUI interface
├── Models/        # Data models
├── Services/      # Core functionality
├── Controllers/   # Window management
└── Resources/     # Assets & config

Tech Stack

Swift 5.9+ / SwiftUI + AppKit
WhisperKit - Local Whisper inference
KeyboardShortcuts - Global hotkeys
AVFoundation - Audio capture

Contributing

Fork & clone
Create a branch: git checkout -b feature/my-feature
Make changes and run make lint
Submit a PR

License

MIT License - see LICENSE for details.

Credits

WhisperKit by Argmax
OpenAI Whisper

Made with ❤️ for developers

Privacy-first • Open Source