100% offline, open-source, Voice Dictation app for macOS

SpeakType


Fast Voice-to-Text for macOS


Press a hotkey, speak, and instantly paste text anywhere on your Mac.


What is SpeakType?

SpeakType is a privacy-first voice dictation tool for macOS. Transcription runs locally with OpenAI's Whisper model via WhisperKit. Optional text enrichment can run through a local Ollama model or through OpenAI. Audio never leaves your Mac; when cloud enrichment is enabled, only the transcript text is sent.

  • Privacy First - Audio stays local, with explicit Local vs Cloud enrichment modes
  • Lightning Fast - Optimized for Apple Silicon
  • Works Everywhere - Any app, any text field
  • Open Source - Audit every line of code yourself

Enrichment Modes

  • Raw - Paste the transcript as-is
  • Clean - Fix punctuation, casing, spacing, and obvious grammar
  • Polish - Improve readability while preserving intent
  • Email - Rewrite the dictation as a ready-to-send email body
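
One way to picture the four modes is as a lookup from mode name to a model instruction, with Raw skipping the model entirely. The sketch below is illustrative only: the instruction strings and the `instructions_for` helper are assumptions made for this example, not SpeakType's actual prompts or code.

```python
from typing import Optional

# Hypothetical instruction text per mode; SpeakType's real prompts may differ.
MODE_INSTRUCTIONS = {
    "clean": "Fix punctuation, casing, spacing, and obvious grammar.",
    "polish": "Improve readability while preserving the speaker's intent.",
    "email": "Rewrite the dictation as a ready-to-send email body.",
}


def instructions_for(mode: str) -> Optional[str]:
    """Return the enrichment instruction for a mode, or None for Raw."""
    key = mode.strip().lower()
    if key == "raw":
        # Raw pastes the transcript as-is, so no model call is needed.
        return None
    if key not in MODE_INSTRUCTIONS:
        raise ValueError(f"unknown enrichment mode: {mode!r}")
    return MODE_INSTRUCTIONS[key]
```

A caller would paste the transcript unchanged when `instructions_for` returns `None`, and otherwise pass the instruction and transcript to the selected provider.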

Enrichment Providers

  • OpenAI (Cloud) - Uses the OpenAI Responses API with your own API key
  • Ollama (Local) - Uses a local Ollama model running at http://localhost:11434
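
To make the difference between the two providers concrete, here is a minimal sketch of how an enrichment request could be built for each. It assumes Ollama's `/api/generate` endpoint and the OpenAI Responses endpoint at `https://api.openai.com/v1/responses`. The `build_request` helper and its parameters are hypothetical and are not taken from the SpeakType source.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # local Ollama server
OPENAI_URL = "https://api.openai.com/v1/responses"  # OpenAI Responses API


def build_request(provider, model, instructions, transcript, api_key=None):
    """Build (but do not send) an enrichment request for the chosen provider."""
    headers = {"Content-Type": "application/json"}
    if provider == "ollama":
        url = OLLAMA_URL
        # stream=False asks Ollama for one JSON object instead of a stream.
        body = {
            "model": model,
            "system": instructions,
            "prompt": transcript,
            "stream": False,
        }
    elif provider == "openai":
        if not api_key:
            raise ValueError("the OpenAI provider needs an API key")
        url = OPENAI_URL
        headers["Authorization"] = f"Bearer {api_key}"
        body = {"model": model, "instructions": instructions, "input": transcript}
    else:
        raise ValueError(f"unknown provider: {provider!r}")
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

Sending the request with `urllib.request.urlopen` and decoding the JSON reply completes the round trip; for Ollama, the generated text is in the reply's `response` field. In both cases only transcript text is placed in the request body, never audio.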

Ollama Model Manager

  • Browse installed local Ollama models from Settings → AI
  • Download curated local models in-app, including Gemma 4, Qwen 3, and Llama 3.2 options
  • See exact installed disk usage plus family, parameter size, quantization, and context metadata
  • Run on-demand speed tests to measure tokens/sec on your Mac

Installation

Requirements

  • macOS 13.0+ (Ventura or newer)
  • Apple Silicon (M1+) recommended
  • 2GB available storage (for AI models)

Download

Download Latest Release

  1. Download SpeakType.dmg
  2. Drag SpeakType to Applications
  3. Grant Microphone + Accessibility + Documents Folder permissions
  4. Download an AI model from Settings → AI Models

Press fn to start dictating.

Build from Source

git clone https://github.com/karansinghgit/speaktype.git
cd speaktype
make build && make run

Usage

  1. Press hotkey (fn by default)
  2. Speak your text
  3. Release hotkey
  4. Text is optionally enriched
  5. Final text appears!

Tips:

  • Speak naturally - Whisper handles accents well
  • Say punctuation: "comma", "period", "question mark"
  • Best results with 3-10 second clips
  • Use Settings → AI to download or select your Ollama model before enabling local enrichment

Development

make build          # Build debug
make run            # Run app
make clean          # Clean build
make test           # Run tests
make dmg            # Create DMG installer

Current Issues

⚠️ Loading a model for the first time, or switching to another model, incurs a startup delay of 30-60 seconds.

As a result, the first transcription is very slow. Once the model is warmed up, dictation returns to near-instant speed.

Project Structure

speaktype/
├── App/           # Entry point
├── Views/         # SwiftUI interface
├── Models/        # Data models
├── Services/      # Core functionality
├── Controllers/   # Window management
└── Resources/     # Assets & config

Tech Stack


Contributing

  1. Fork & clone
  2. Create a branch: git checkout -b feature/my-feature
  3. Make changes and run make lint
  4. Submit a PR

License

MIT License - see LICENSE for details.


Credits


Made with ❤️ for developers

Privacy-first • Open Source