Speak2 | Moonquake Media

1. Getting Started
2. Dictation
3. Live Transcription
4. Speech Models
5. Personal Dictionary
6. AI Text Refinement
7. Transcription History
8. Hotkey Configuration
9. Troubleshooting

1. Getting Started

Welcome

Speak2 is a macOS dictation app that lives in your menu bar. It uses on-device AI models to convert your speech into text. Once a speech model has been downloaded, no internet connection is required for dictation. Your voice data never leaves your Mac.

The workflow is simple: press a hotkey to start recording, release the hotkey to stop, and Speak2 transcribes your speech and pastes the text into whatever app you are using. It works in any text field across macOS.

System Requirements

Processor: Apple Silicon (M1 or later)
Operating System: macOS
Permissions: Accessibility permission and Microphone permission (both required)

Apple Silicon Required

Speak2 requires a Mac with an Apple Silicon processor (M1, M2, M3, or later). It will not run on Intel-based Macs.

Setup Wizard

When you open Speak2 for the first time, a setup window appears. The setup has three steps.

Step 1 — Permissions

Speak2 requires two macOS permissions to function:

Accessibility: This permission allows Speak2 to detect your global hotkey and paste transcribed text into other apps. A status indicator shows whether the permission is granted. Click Grant to open System Settings and enable access.
Microphone: This permission allows Speak2 to record your voice. A status indicator shows whether the permission is granted. Click Grant to allow microphone access.

Step 2 — Download a Speech Model

Choose a transcription model to download. Speak2 supports several model options:

Model	Size	Notes
Whisper tiny.en	~75 MB	Smallest and fastest. English only.
Whisper base.en	~140 MB	Good balance of speed and accuracy. English only. Recommended starting model.
Whisper small.en	~460 MB	Higher accuracy. English only.
Whisper large-v3	~3 GB	Highest accuracy. Supports many languages.
Whisper large-v3-turbo	~954 MB	Fast and accurate. Supports many languages.
Parakeet v3	~600 MB	Supports 25 languages.

You can download additional models later from the Settings menu.

Step 3 — Model Storage Location

Models are stored in a default location on your Mac. If you prefer a different folder, you can choose a custom location during this step.

Completing Setup

When both permissions are granted and at least one model is downloaded, the setup window displays the message: "Setup complete! Speak2 is ready." Click Close to finish setup.

The Menu Bar Icon

After setup, Speak2 appears as a microphone icon in the macOS menu bar. The icon changes color to indicate the current state:

Icon Color	State
Gray	Ready — Speak2 is idle and waiting for input.
Red	Recording — Speak2 is capturing audio from your microphone.
Cyan	Transcribing — The speech model is converting audio to text.
Purple	Refining — AI is cleaning and improving the transcribed text.
Yellow (animated)	Loading — Speak2 is loading a speech model into memory.

Click the menu bar icon to open the status menu. From this menu, you can select a speech model, configure your hotkey, access the quick dictionary, and open settings.

Your Privacy Is Protected

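Speak2 runs entirely on your Mac. Your voice recordings are processed locally and are never sent to any server. The one case in which text leaves the app is the optional External Server (Ollama) mode for AI refinement, which sends the transcribed text, not the audio, to the Ollama server you configure.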

2. Dictation

How Dictation Works

Speak2 converts your spoken words into written text. The process is simple:

Place your cursor in any text field in any application.
Start recording: press and hold the hotkey (Hold Mode) or press it twice quickly (Toggle Mode).
Speak naturally into your microphone.
Stop recording: release the hotkey (Hold Mode) or press it twice quickly again (Toggle Mode).
Speak2 transcribes your speech and pastes the text at your cursor position.

Speak2 works with any application that accepts text input — email clients, word processors, web browsers, messaging apps, and more.

Hold Mode

Hold Mode is the default recording method. Press and hold the hotkey to record. Release the hotkey to stop recording and begin transcription.

This is the simplest method. Hold the key while you speak. Release the key when you are finished.

Hold Mode Summary

Start recording: Press and hold the hotkey.
Stop recording: Release the hotkey.
No audio feedback is played. The menu bar icon color indicates the current state.

Toggle Mode

Toggle Mode uses a double-press to start and stop recording. Press the hotkey twice quickly (within 0.4 seconds) to start recording. Press the hotkey twice again to stop recording and begin transcription.

Toggle Mode plays audio feedback sounds. You will hear a start tone when recording begins and a stop tone when recording ends. These tones confirm that Speak2 received your input.

This mode is useful when you want to dictate longer passages without holding a key the entire time.

Toggle Mode Summary

Start recording: Press the hotkey twice quickly.
Stop recording: Press the hotkey twice quickly again.
Audio feedback tones confirm the start and stop of each recording.
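
The double-press rule can be illustrated with a short sketch. This is not Speak2's actual code; the class names are hypothetical, and treating a gap of exactly 0.4 seconds as a valid double-press is an assumption.

```python
from typing import Optional

DOUBLE_PRESS_WINDOW = 0.4  # seconds, the window stated in this section


class DoublePressDetector:
    """Reports a double-press when two presses arrive within the window."""

    def __init__(self, window: float = DOUBLE_PRESS_WINDOW) -> None:
        self.window = window
        self._last_press: Optional[float] = None

    def press(self, timestamp: float) -> bool:
        """Register one key press; return True if it completes a double-press."""
        last = self._last_press
        if last is not None and timestamp - last <= self.window:
            self._last_press = None  # both presses are consumed
            return True
        self._last_press = timestamp  # remember this press as a possible first half
        return False


class ToggleRecorder:
    """Flips between idle and recording on each detected double-press."""

    def __init__(self) -> None:
        self.detector = DoublePressDetector()
        self.recording = False

    def on_hotkey(self, timestamp: float) -> bool:
        if self.detector.press(timestamp):
            self.recording = not self.recording
        return self.recording
```

In this sketch a single press, or two presses more than 0.4 seconds apart, leaves the recording state unchanged, which matches the behavior described for Toggle Mode.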

The Dictation Flow

When you dictate, Speak2 performs several steps in sequence:

Recording starts: You hold the hotkey (Hold Mode) or double-press it (Toggle Mode). The menu bar icon turns red.
Audio capture: You speak into your microphone and Speak2 captures the audio.
Recording stops: You release the hotkey (Hold Mode) or double-press it again (Toggle Mode).
Transcription: Speak2 passes the audio to your selected on-device speech model. The menu bar icon turns cyan.
Personal dictionary: If you have configured personal dictionary replacements, they are applied to the transcribed text.
AI refinement: If AI refinement is enabled, Speak2 removes filler words and verbal noise from the text. The menu bar icon turns purple during this step.
Paste: The final text is pasted at your cursor position.
Clipboard restore: Speak2 temporarily uses your system clipboard to paste the text. After pasting, it restores your original clipboard contents, so your previous clipboard data is not lost.
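
The order of these steps, including the clipboard save and restore, can be sketched as follows. Everything here is illustrative: `FakeClipboard`, `run_dictation_flow`, and the callables passed in are hypothetical stand-ins, not Speak2's internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FakeClipboard:
    """In-memory stand-in for the system clipboard (hypothetical)."""
    contents: str = ""


def run_dictation_flow(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    replacements: Dict[str, str],
    refine: Optional[Callable[[str], str]],
    clipboard: FakeClipboard,
    paste: Callable[[str], None],
) -> List[str]:
    """Run the post-recording steps in order and return the states visited."""
    states = ["recording", "transcribing"]  # red, then cyan
    text = transcribe(audio)

    # Personal dictionary replacements come before AI refinement.
    for wrong, right in replacements.items():
        text = text.replace(wrong, right)

    if refine is not None:  # AI refinement is optional
        states.append("refining")  # purple
        text = refine(text)

    saved = clipboard.contents  # remember what the user had copied
    try:
        clipboard.contents = text
        paste(clipboard.contents)
    finally:
        clipboard.contents = saved  # restore even if the paste fails

    states.append("ready")  # gray
    return states
```

The `try`/`finally` is a design choice for the sketch: it guarantees the original clipboard contents come back even when the paste step raises an error.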

Status Indicators

The Speak2 menu bar icon changes color to show the current state:

Color	State
Gray	Idle — Speak2 is ready for input
Red	Recording — Speak2 is capturing audio from your microphone
Cyan	Transcribing — The speech model is converting audio to text
Purple	Refining — AI is cleaning and improving the transcribed text

When the icon returns to gray, dictation is complete and the text has been pasted.

Accessibility Permission Required

Speak2 requires Accessibility permission to detect hotkeys globally and to paste text into other applications. If text is not appearing after dictation, verify that Accessibility permission is granted in System Settings → Privacy & Security → Accessibility.

Speak Naturally

Speak naturally and in complete sentences. The transcription models produce better results with longer, natural speech than with individual words.

3. Live Transcription

Overview

Speak2 can display a floating overlay panel that shows your words as they are transcribed in real time. This panel lets you see what Speak2 is hearing while you speak. The text updates continuously as you talk, so you can confirm that your words are being captured correctly.

Enabling Live Transcription

Open Speak2 and click the menu bar icon.
Go to Settings and then select General.
Find the option labeled "Show live transcription while recording."
Turn this option on.

Live transcription is now enabled. The overlay panel will appear each time you begin a recording.

How It Works

When you start recording, a transparent panel appears near the bottom center of your screen, just above the Dock. This panel displays the following:

Recording indicator — A pulsing red dot on the left side of the panel. This dot confirms that Speak2 is actively recording.
Confirmed text — Words that have been recognized with high confidence. These words appear in regular weight.
Unconfirmed text — Words that are still being processed by the speech engine. These words appear in italic and in a lighter color. As processing completes, unconfirmed words become confirmed.
"Listening..." — If no words have been recognized yet, the panel displays this message. It means Speak2 is active and waiting for speech input.

The panel resizes automatically as more text is transcribed. The maximum width of the panel is 80% of your screen width or 700 pixels, whichever is smaller.
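
The width limit is a simple minimum of two values. A one-function sketch, with a hypothetical function name:

```python
def max_panel_width(screen_width: float) -> float:
    """Return the widest the panel may grow: 80% of the screen, capped at 700."""
    return min(0.8 * screen_width, 700.0)
```

Under this rule the 700-pixel cap applies on any screen wider than 875 pixels, because 80% of 875 is 700. On narrower screens the 80% limit is the smaller value.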

After Recording

When you stop recording, the live transcription panel disappears. Speak2 then performs a final full transcription of the complete audio. This final pass analyzes the entire recording at once, which provides better context and higher accuracy.

Because of this, the final transcription result may differ slightly from what was displayed in the live panel. Small corrections to word choice, punctuation, or phrasing are normal. The final result is the most accurate version of your transcription.

Non-Intrusive Overlay

The live transcription panel does not interfere with your other applications. It appears as a non-interactive overlay and does not steal focus from the application where you are typing.

4. Speech Models

Overview

Speak2 uses on-device AI models to convert your speech into text. All processing happens locally on your Mac. No audio data is sent to the internet. You can download multiple models and switch between them at any time.

Available Models

Speak2 includes two families of speech recognition models. Each model offers a different balance of size, speed, and accuracy.

Whisper Models (based on OpenAI Whisper):

Model	Size	Language	Description
tiny.en	~75 MB	English only	Smallest and fastest. Good for quick notes.
base.en	~140 MB	English only	Recommended balance of speed and accuracy.
small.en	~460 MB	English only	Better accuracy for complex speech.
large-v3	~3 GB	Multilingual	Best accuracy. Supports many languages.
large-v3-turbo	~954 MB	Multilingual	Fast and accurate. Good multilingual option.

Parakeet Model:

Model	Size	Language	Description
Parakeet v3	~600 MB	25 languages	Best option for multilingual users.

4.1 Downloading a Model

Open Settings → Models. If this is your first time using Speak2, the Setup Wizard will guide you through this step.
Click the Download button next to the model you want.
A progress bar shows the download status.
After the download completes, the model loads automatically.

4.2 Switching Models

Click the Speak2 icon in the menu bar.
Open the Model submenu.
Select the model you want to use.
A checkmark appears next to the active model.

Models that have not been downloaded show a download arrow (↓) icon.

4.3 Deleting a Model

Open Settings → Models.
Click the trash icon next to a downloaded model to remove it from your Mac and free disk space.

4.4 Storage Location

By default, models are stored in a standard application support location. To change the storage location:

Open Settings → Models.
Click Choose Folder...
Select a new folder on your Mac.
You will be asked whether to move existing models to the new location or start fresh.

Which Model Should You Choose?

If you primarily dictate in English, start with base.en for the best balance of speed and accuracy. If you need to dictate in multiple languages, choose Parakeet v3 or one of the Whisper multilingual models (large-v3 or large-v3-turbo).

Performance on Low-Memory Systems

Larger models require more memory and take longer to process. On Macs with 8 GB of RAM, the large-v3 model may be slow. The base.en or small.en models are recommended for these systems.
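
The guidance in the two paragraphs above can be written as a small decision function. This is only a sketch of the advice in this section; the function name is hypothetical, and dropping large-v3 at 8 GB or less is an interpretation of "may be slow", not a rule enforced by Speak2.

```python
from typing import List


def recommend_models(multilingual: bool, ram_gb: int) -> List[str]:
    """Return suggested models, most strongly suggested first."""
    if not multilingual:
        # English-only dictation: the recommended starting model.
        return ["base.en"]
    candidates = ["Parakeet v3", "large-v3-turbo", "large-v3"]
    if ram_gb <= 8:
        # large-v3 (~3 GB) may be slow on low-memory Macs.
        candidates.remove("large-v3")
    return candidates
```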

5. Personal Dictionary

Overview

The Personal Dictionary teaches Speak2 to recognize words it frequently gets wrong. Names, technical terms, brand names, and specialized vocabulary are common examples. When Speak2 transcribes your speech, it automatically replaces misheard words with the correct ones from your dictionary.

How It Works

The Personal Dictionary uses a two-stage replacement process to correct transcription errors.

Stage 1 — Direct Alias Matching

If you define aliases (common misheard versions of a word), Speak2 replaces them with the correct word. This matching is exact and case-insensitive. For example, if "Antropik" is an alias for "Anthropic," any occurrence of "Antropik" in the transcription is replaced with "Anthropic."

Stage 2 — Phonetic Matching

For words without an exact alias match, Speak2 uses phonetic algorithms to detect words that sound similar to your dictionary entries. If a word in the transcription sounds like a dictionary entry, Speak2 corrects it automatically. This catches misspellings and variations that aliases do not cover.
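
The two stages can be sketched in a few lines. This manual does not say which phonetic algorithm Speak2 uses, so the sketch substitutes classic Soundex as an assumption; the function names and the shape of the `entries` mapping are also hypothetical.

```python
import re
from typing import Dict, List

# Soundex digit for each consonant; vowels, h, w, and y have no digit.
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """Return the four-character Soundex code of a word ("" if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":  # h and w do not separate letters with equal digits
            prev = digit
    return (code + "000")[:4]


def apply_dictionary(text: str, entries: Dict[str, List[str]]) -> str:
    """entries maps each correct word to its list of aliases."""
    # Stage 1: exact, case-insensitive alias replacement on word boundaries.
    for word, aliases in entries.items():
        for alias in aliases:
            pattern = r"\b" + re.escape(alias) + r"\b"
            text = re.sub(pattern, lambda _m, w=word: w, text, flags=re.IGNORECASE)

    # Stage 2: replace any remaining word that sounds like a dictionary entry.
    by_code = {soundex(word): word for word in entries}

    def fix(match: "re.Match[str]") -> str:
        token = match.group(0)
        return by_code.get(soundex(token), token)

    return re.sub(r"[A-Za-z]+", fix, text)
```

With the entry "Anthropic" and the aliases "Antropik" and "Ann Tropic", stage 1 corrects "antropik" and stage 2 corrects an unlisted variant such as "Anthropick". Soundex is coarse, though: "Android" has the same code (A536) as "Anthropic" and would also be replaced, so a production matcher would need a stricter similarity check than this sketch applies.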

Adding a Word

There are two ways to add a new word:

Open Settings → Dictionary and click the + button.
Click the menu bar icon and select Add Word...

Both methods open the entry form. Fill in the following fields:

Word (required) — The correct spelling of the word. This is the text that Speak2 will use as the replacement.
Aliases (optional) — Common misheard versions of the word, separated by commas. For example, for the word "Anthropic" you might add aliases "Antropik, Ann Tropic."
Pronunciation Hint (optional) — How the word sounds phonetically. This helps the phonetic matching engine identify the word more accurately.
Category — Organize entries by type. Choose from Names, Technical, Brands, Medical, Legal, or Custom.
Language — The language this entry applies to. Speak2 supports 25 languages for dictionary entries.

Dictionary Categories

Names — People's names

Technical — Technical terms and jargon

Brands — Brand and product names

Medical — Medical terminology

Legal — Legal terminology

Custom — Any other specialized words

Managing Entries

Enable or Disable — Click the circle icon next to an entry to enable or disable it. Disabled entries remain in your dictionary but are not used during transcription. This is useful for temporary changes without deleting the entry.
Edit — Hover over an entry and click the pencil icon to modify any field.
Delete — Hover over an entry and click the trash icon to remove it permanently.
Filter — Use the language dropdown and category menu to filter entries. Use the search bar to find specific words.

Vocabulary Biasing

For Whisper models, your dictionary words are sent as vocabulary hints during transcription. This improves the model's ability to recognize these words correctly the first time, before any post-processing replacement occurs. The result is more accurate initial transcription for your specialized vocabulary.

Parakeet models do not support vocabulary biasing. Parakeet uses post-transcription replacement only. Dictionary entries still work with Parakeet, but corrections are applied after the transcription is complete.

Import and Export

Click the more menu (three dots) in the Dictionary settings to access Import and Export options. Dictionary data is saved as JSON files. Use this feature to:

Back up your dictionary.
Transfer your dictionary to another Mac.
Share specialized vocabulary with colleagues or team members.

Getting Started with Your Dictionary

Start by adding the names of people, places, and products you mention frequently. These are the words most likely to be misheard by speech recognition.

6. AI Text Refinement

Overview

After Speak2 transcribes your speech, it can send the text through an AI language model. This step is optional. The AI removes filler words such as "um," "uh," and "like." It also removes false starts, repetitions, and verbal noise. The result is clean, polished text that reads as if it were typed.

Three Modes

Go to Settings → AI Refine to choose a mode.

Off — No AI processing. The raw transcription, with any dictionary replacements applied, is pasted directly.
Built-in (Recommended) — Uses a small language model (Qwen 2.5 1.5B) that runs locally on your Mac. Requires a one-time download of approximately 1.1 GB. All processing happens on your device. No data is sent to the internet.
External Server (Ollama) — Sends text to an Ollama server running on your local network or on the same machine. Requires Ollama to be installed and running separately.

Built-in Mode Setup

Open Settings → AI Refine.
Select Built-in.
Click Download Model to download the refinement model (approximately 1.1 GB).
Once the download is complete, the status shows Ready with a checkmark.
No further configuration is needed. The model runs automatically after each transcription.

Ollama Mode Setup

Install Ollama on your Mac or on another computer on your network (visit ollama.com for installation instructions).
Pull a model in Ollama. For example, run this command in Terminal: ollama pull gemma3:4b
In Speak2, open Settings → AI Refine and select External Server.
Enter the Server URL. The default value is http://localhost:11434.
Enter the Model Name. The default value is gemma3:4b. The model must already be pulled in Ollama.
Click Test Connection to verify the setup. A success message confirms the connection is working.
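
If you want to check the server outside Speak2, the sketch below queries Ollama's `/api/tags` endpoint, which lists the models that have been pulled. It is an independent check, not a description of what the Test Connection button does internally; the function names are hypothetical and the 30-second timeout simply mirrors the default mentioned in Troubleshooting.

```python
import json
import urllib.request
from typing import List


def tags_url(server_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return server_url.rstrip("/") + "/api/tags"


def model_names(tags_response: dict) -> List[str]:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_response.get("models", [])]


def check_ollama(
    server_url: str = "http://localhost:11434",
    model: str = "gemma3:4b",
    timeout: float = 30.0,
) -> bool:
    """Return True if the server answers and the model has been pulled."""
    with urllib.request.urlopen(tags_url(server_url), timeout=timeout) as resp:
        data = json.load(resp)
    return model in model_names(data)
```

Calling `check_ollama()` raises an error if the server cannot be reached and returns False if the server is running but the model name is not in its list.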

Custom Refinement Prompt

Both Built-in and Ollama modes accept a custom prompt. The prompt is added before your transcription text when it is sent to the AI model.

To use the default prompt, leave the prompt field empty. The default prompt instructs the AI to remove filler words, false starts, repetitions, and verbal noise, and return only the final intended message as plain text.
To create a custom prompt, type your instructions in the prompt field.
To restore the original prompt, click Reset to Default.
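
As an illustration of how a prompt is placed before the transcription, the sketch below builds the text and a request body in the shape that Ollama's `/api/generate` endpoint accepts. The default prompt wording is a paraphrase of the description above, not the app's actual text, and the blank-line separator and function names are assumptions.

```python
import json
from typing import Optional

# Paraphrase of the default prompt described above (assumed wording).
DEFAULT_PROMPT = (
    "Remove filler words, false starts, repetitions, and verbal noise. "
    "Return only the final intended message as plain text."
)


def build_refinement_input(transcription: str, custom_prompt: Optional[str] = None) -> str:
    """Place the prompt before the transcription; an empty prompt means default."""
    prompt = (custom_prompt or "").strip() or DEFAULT_PROMPT
    return f"{prompt}\n\n{transcription}"


def build_generate_payload(
    model: str, transcription: str, custom_prompt: Optional[str] = None
) -> bytes:
    """Encode a non-streaming /api/generate request body as JSON bytes."""
    body = {
        "model": model,
        "prompt": build_refinement_input(transcription, custom_prompt),
        "stream": False,  # ask for one complete response instead of chunks
    }
    return json.dumps(body).encode("utf-8")
```

Because the prompt precedes the transcription, a custom prompt works best when it is written as an instruction about the text that follows it.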

Status Indicator

When AI refinement is active, the menu bar icon turns purple and displays a sparkles symbol. This indicates that the AI is processing your text. When refinement is complete, the cleaned text is pasted automatically.

Recommended for Most Users

The built-in mode is recommended for most users. It requires no additional software, runs entirely on your Mac, and produces good results for cleaning up dictated text.

Processing Time

AI refinement adds processing time after each transcription. If you need the fastest possible dictation speed, set AI Refine to Off and rely on the raw transcription with dictionary corrections only.

7. Transcription History

Overview

Speak2 keeps a record of your recent transcriptions. You can review, search, copy, and export your transcription history at any time.

7.1 Viewing History

Open Settings → History to view your transcription history. Entries are grouped by time period:

Today
Yesterday
Last 7 Days
Last 30 Days
Older
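
The grouping can be expressed as a function of an entry's age in days. The exact boundaries are assumptions (for example, whether an entry exactly 7 days old counts as "Last 7 Days"), and the function name is hypothetical.

```python
from datetime import date, datetime


def history_group(entry_time: datetime, today: date) -> str:
    """Return the time-period heading for an entry, by calendar-day age."""
    age_days = (today - entry_time.date()).days
    if age_days <= 0:
        return "Today"
    if age_days == 1:
        return "Yesterday"
    if age_days <= 7:
        return "Last 7 Days"
    if age_days <= 30:
        return "Last 30 Days"
    return "Older"
```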

Each entry shows the following information:

Transcribed text — Long entries are truncated to 3 lines. Click Show More to view the full text.
Metadata — The timestamp, the model used, and the language.
Copy button — Copies the text to your clipboard.
Delete button — Removes the entry from your history.

7.2 Searching and Filtering

Use the search bar at the top of the History screen to find entries that contain specific text. Results update as you type.

Use the model filter menu to show only entries from a specific transcription model. For example, you can filter to show only entries transcribed with the Whisper base.en model.

7.3 Copying Text

Click the copy button (document icon) next to any entry to copy its text to your clipboard. The button changes to a checkmark for 1.5 seconds to confirm the copy was successful.

7.4 Exporting History

Click the more menu (three dots) at the top of the History screen.
Select Export... from the menu.
Choose a location and save the file.

Your entire history is saved as a JSON file. This is useful for record-keeping or for transferring your history to another machine.

7.5 Clearing History

Click Clear All to delete all history entries. A confirmation dialog appears before deletion proceeds. This action cannot be undone.

7.6 Storage Limits

Speak2 stores up to 500 transcription entries. When the limit is reached, the oldest entries are removed automatically to make room for new ones.

Individual entries are limited to 10,000 characters. Text beyond this limit is not stored.
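
Both limits can be modeled with a bounded queue. This is a sketch of the behavior described above, not Speak2's storage code; the `History` class is hypothetical.

```python
from collections import deque
from typing import Deque

MAX_ENTRIES = 500    # oldest entries are dropped beyond this count
MAX_CHARS = 10_000   # text beyond this length is not stored


class History:
    """Keeps the most recent entries, each truncated to the character limit."""

    def __init__(self) -> None:
        # A deque with maxlen discards the oldest item when a new one is added.
        self.entries: Deque[str] = deque(maxlen=MAX_ENTRIES)

    def add(self, text: str) -> None:
        self.entries.append(text[:MAX_CHARS])
```

If you regularly dictate passages longer than 10,000 characters, keep a copy in the destination app, because the history holds only the first 10,000 characters of each entry.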

Recover Text from Previous Sessions

Use the transcription history to review and recover text from previous dictation sessions. If you accidentally close an application before saving, your dictated text is still available in the history.

8. Hotkey Configuration

Overview

Speak2 uses a global hotkey to start and stop dictation. The hotkey works in any application, regardless of which window is focused. You can choose from preset hotkeys or create custom key combinations.

Preset Hotkeys

The following preset hotkeys are available in Settings → General, under Trigger Hotkey:

Hotkey	Description
Fn	The Function key. Simple and easy to reach.
Right Option	The Option (Alt) key on the right side of the keyboard.
Right Command	The Command key on the right side of the keyboard.
Hyper Key	Ctrl + Option + Command + Shift pressed together. Unlikely to conflict with other shortcuts.
Ctrl + Option + Space	A three-key combination using the Space bar.

Custom Hotkey Combos

If none of the presets suit your workflow, you can create a custom combination:

Open Settings → General.
Click Add Custom Combo...
A key capture field appears with the instruction "Press a key or combo (Esc to cancel)..."
Press the key or key combination you want to use.
The custom combo is saved and appears in your hotkey list.

To remove a custom combo, click the minus icon next to it.

Hold vs. Toggle Mode

Speak2 offers two modes for controlling recording with your hotkey:

Hold Mode (default) — Hold the hotkey to record. Release to stop and transcribe.
Toggle Mode — Press the hotkey twice quickly to start recording. Press twice again to stop and transcribe. The double-press window is 0.4 seconds.

To enable Toggle Mode, turn on "Press twice (toggle)" in Settings → General.

You can also switch between modes from the menu bar icon's Hotkey submenu. Each hotkey shows its current mode as a suffix: "(hold)" or "(press twice)."

Changing the Hotkey from the Menu Bar

Click the Speak2 menu bar icon.
Open the Hotkey submenu.
Select any preset or custom hotkey.

The change takes effect immediately.

Recommended Hotkey

The Fn key is the simplest hotkey for most users. It is easy to reach and rarely conflicts with other application shortcuts. If you use a keyboard without an Fn key, try Right Option or a custom combination.

Hotkey Conflicts

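Some hotkeys may conflict with system shortcuts or other applications. If your hotkey does not work as expected, try a different preset or create a custom combination that does not overlap with existing shortcuts. The Fn (Globe) key in particular can be assigned a system action in System Settings → Keyboard, such as showing the emoji picker or starting macOS dictation; if pressing Fn triggers one of these, change that setting or choose another hotkey.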

9. Troubleshooting

This section lists common problems and their solutions. If your issue is not listed here, contact support using the information at the end of this section.

9.1 Hotkey Not Working

Problem: You press the hotkey, but Speak2 does not respond.

Solutions:

Verify that Accessibility permission is granted. Open System Settings → Privacy & Security → Accessibility. Speak2 must appear in the list with its toggle turned on.
If you recently updated macOS, you may need to remove Speak2 from the Accessibility list and add it again. macOS sometimes resets permissions after a system update.
Try selecting a different hotkey preset. Another application may be using the same key combination.
Restart Speak2 after granting or re-granting permissions.

9.2 Text Not Appearing After Dictation

Problem: Speak2 transcribes your speech (the icon cycles through colors), but no text appears in your application.

Solutions:

Verify that Accessibility permission is granted. Speak2 requires this permission to paste text into other applications.
Make sure your cursor is placed inside a text field before you begin dictating.
Some applications use restricted text fields that do not accept programmatic paste. Try dictating into TextEdit to confirm Speak2 is working. If text appears in TextEdit but not in another application, the issue is with that application's text field.
Check that a speech model is downloaded and loaded. The menu bar status should show "Ready."

9.3 Poor Transcription Accuracy

Problem: The transcribed text contains many errors or incorrect words.

Solutions:

Try a larger model. Larger models such as small.en, large-v3, or large-v3-turbo produce more accurate results than tiny.en or base.en.
Speak clearly and at a natural pace. Speaking too quickly or too quietly can reduce accuracy.
Reduce background noise. Use a directional microphone if possible.
Add frequently misheard words to the Personal Dictionary with aliases for common mistakes.
Enable AI Refinement to clean up filler words and false starts automatically.

9.4 Model Download Fails

Problem: A model download does not complete or produces an error.

Solutions:

Check your internet connection. Model downloads require a stable connection.
Verify that you have sufficient free disk space. The largest model (large-v3) requires approximately 3 GB.
Try downloading a smaller model first to confirm the download process works.
If a download stops partway through, try again. Downloads resume from where they stopped.
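
To check free disk space before starting a download, a short script such as the following can help. The sizes are the approximate figures from the model tables in this manual, converted to megabytes; the function name and the optional `free_bytes` parameter are hypothetical conveniences for the sketch.

```python
import shutil
from typing import Optional

# Approximate download sizes in MB, taken from the model tables.
MODEL_SIZES_MB = {
    "tiny.en": 75,
    "base.en": 140,
    "small.en": 460,
    "large-v3": 3000,
    "large-v3-turbo": 954,
    "Parakeet v3": 600,
}


def has_room_for(model: str, folder: str = ".", free_bytes: Optional[int] = None) -> bool:
    """Return True if the folder's volume has enough free space for the model."""
    needed = MODEL_SIZES_MB[model] * 1_000_000
    if free_bytes is None:
        # shutil.disk_usage reports total, used, and free bytes for the volume.
        free_bytes = shutil.disk_usage(folder).free
    return free_bytes >= needed
```

Pass your model storage folder as `folder` so the check runs against the volume where the model will actually be saved.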

9.5 AI Refinement Not Working (Built-in)

Problem: The built-in AI refinement does not produce results or is very slow.

Solutions:

Verify that the refinement model has been downloaded in Settings → AI Refine.
Make sure your Mac has sufficient available memory. The built-in model uses approximately 1–2 GB of RAM.
If refinement is slow on first use, this is expected. The model needs time to initialize. Subsequent refinements will be faster.

9.6 AI Refinement Not Working (Ollama)

Problem: Ollama refinement fails or produces an error.

Solutions:

Verify that Ollama is running on the specified server (default: http://localhost:11434).
Verify that the model name in Speak2 matches a model you have pulled in Ollama. Run ollama list in Terminal to check.
Use the Test Connection button in Settings → AI Refine to diagnose the issue.
Check that your firewall is not blocking the connection.
The default timeout is 30 seconds. Very large transcriptions may require more time.

9.7 Microphone Not Detected

Problem: Speak2 does not detect your microphone or recording produces no audio.

Solutions:

Verify that Microphone permission is granted in System Settings → Privacy & Security → Microphone.
Check that your microphone is selected as the input device in System Settings → Sound → Input.
If you connected a new microphone while Speak2 was running, restart Speak2.

9.8 High Memory Usage

Problem: Speak2 uses a large amount of memory (RAM).

Solutions:

Larger speech models use more memory. Switch to a smaller model (tiny.en or base.en) if your Mac has limited RAM.
If both a speech model and the AI refinement model are loaded simultaneously, memory usage will be higher. Disable AI refinement if memory is a concern.
Close and reopen Speak2 to release memory held by models that are no longer in use.

9.9 Contact Support

If the solutions above do not resolve your issue, contact the Speak2 support team.

Email: support@moonquakemedia.com
Include the following information:
- Your macOS version
- Your Mac model
- A clear description of the problem
Screenshots or screen recordings are helpful.
