Configuration

Recording

hotkeyMode string

How the hotkey triggers recording. Pick one, it's not a life decision.

"pushToTalk" default

"toggle" — Press ⌥Space to start, press again to stop.

"pushToTalk" — Hold ⌥Space to record, release to stop.

Speech Engine

sttEngine string

Speech-to-text backend. Apple's is good enough for most people. Whisper if you're into that.

"apple" default

"apple" — On-device Apple Speech Recognition.

"whisper" — Runs whisper-cli as a subprocess. Requires brew install whisper-cpp + a GGML model.

whisperModelPath string

Absolute path to a Whisper GGML model file. Only used when sttEngine is "whisper".

"" default (empty)

Example: /opt/homebrew/share/whisper-cpp/models/ggml-base.en.bin

whisperLanguage string

Language for speech recognition. We support Portuguese and English. That's the menu.

"en" default

"auto" "en" "pt"

LLM Post-Processing

llmEnabled bool

Let an LLM clean up your transcriptions. Removes filler words, fixes grammar. Makes you sound smarter than you are.

false default

llmProvider string

Which LLM API to talk to. OpenAI-compatible works with Ollama and LM Studio too, so you probably want that.

"openai" default

"openai" — OpenAI-compatible API (also works with Ollama, LM Studio).

"anthropic" — Anthropic Messages API.

llmEndpoint string

Where to send LLM requests. Defaults to local Ollama because we assumed you're running it.

http://localhost:11434/v1/chat/completions default (Ollama)

llmModel string

Which model to use. Whatever you have installed, put it here.

"llama3.2" default

llmApiKey string

API key. Leave empty if you're running locally (Ollama doesn't care).

"" default (empty)

llmPrompt string

The system prompt that tells the LLM what to do with your transcription. The default one works fine. Change it if you think you know better.

Default: "Clean up the following voice transcription. Remove filler words, fix grammar and punctuation. Output ONLY the cleaned text."

Modes

codeMode bool

For the brave souls who dictate code. Preserves technical terms and formatting instead of "fixing" them.

false default

privacyMode bool

Tinfoil hat mode. Forces everything offline: switches to Whisper and kills the LLM. Nothing leaves your machine.

false default

UI

verboseOverlay bool

How much stuff to show on screen while recording.

false default

true — Full overlay with mic, audio levels, and streaming transcription.

false — Compact pill with mic and audio levels only.

Other

onboardingDone bool

Whether you've been through the onboarding. Set automatically, don't touch it.

false default

githubRepo string

GitHub repo for checking updates. You probably don't need to change this.

"" default (empty)

Format: "owner/repo"

Example config

{
  "hotkeyMode": "pushToTalk",
  "sttEngine": "apple",
  "whisperLanguage": "en",
  "whisperModelPath": "",
  "llmEnabled": true,
  "llmProvider": "openai",
  "llmEndpoint": "https://api.openai.com/v1/chat/completions",
  "llmModel": "gpt-4o-mini",
  "llmApiKey": "",
  "llmPrompt": "Clean up the following voice transcription...",
  "codeMode": false,
  "privacyMode": false,
  "verboseOverlay": false,
  "onboardingDone": true
}

Related files

~/.config/dict/config.json

Main configuration

~/.config/dict/dictionary.json

Custom word replacements (e.g. correct proper nouns)

~/.config/dict/snippets.json

Voice-triggered text snippets (say a phrase, expand to text)

/tmp/dict.log

Runtime log