30 days · Default Zoom audio retention
4 · Transcription methods on Mac
~5% · WER on clean speech
$0 · Local-only option
Andrew Dyuzhov
CEO & Solo Founder, MetaWhisp · @hypersonq
Most guides on Zoom transcription assume you want it the easy way: turn on cloud recording, get a transcript by email. That works. It also stores your meeting audio on Zoom's servers for 30 days, and the transcript can be reviewed by their employees for "service improvement" unless you have a Business Associate Agreement.

For half my client calls, that's fine. For the other half (calls under NDA, or involving financial data or strategic conversations), it's not. I needed an on-device path, so I built one. Then I built MetaWhisp, partly because I kept hitting the same problem.

This guide is the full map of Zoom transcription on Mac in 2026: four methods, ranked by privacy posture and accuracy. The right choice depends on what's in the meeting.
TL;DR for busy hosts:
  • Built-in cloud transcription works fine for most meetings — fast, accurate, free on paid plans. Audio sits on Zoom's servers 30 days.
  • Zoom AI Companion adds summaries and action items, also cloud-processed.
  • Local recording + on-device transcription is the private path: zero data leaves your Mac. Adds ~3 minutes of post-meeting work.
  • Third-party tools (Otter, Fathom, Fireflies) add speaker diarization but bring back cloud privacy questions.
  • For NDA, medical, legal meetings: switch Zoom Settings → Recording to Local, transcribe with MetaWhisp or Whisper Transcription. Audit with Little Snitch.
---

What Zoom offers natively in 2026

Zoom shipped three transcription products. All three process audio on Zoom's infrastructure.

Live captions

Free on every Zoom plan since 2022. Click CC in any meeting, then choose Show captions. Real-time speech-to-text appears at the bottom of the call. Useful for accessibility, multilingual teams, and noisy environments. Captions are generated in real time on Zoom's servers, and nothing is stored by default unless you also enable recording. Latency is typically 1-2 seconds. Accuracy is decent on clean audio in supported languages; off-script speech, heavy accents, and multi-speaker overlap reduce quality.

Audio transcription of cloud recordings

When you record a meeting to Zoom's cloud, Zoom can also generate a written transcript. Available on paid plans (Pro and higher). The transcript shows up alongside the recording in your Zoom web dashboard, usually within 30 minutes after the meeting ends. Behind the scenes: cloud recording uploads the meeting to Zoom's storage. Their backend transcription service processes the audio, returns a VTT file (subtitles format) and a TXT file (plain text), and stores both alongside the recording. This is the "default" transcription most users encounter. It's good enough for internal team meetings, training videos, and casual recordings.
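If you download the VTT file and only want the spoken text, a few lines of Python can strip the cue numbers and timestamps. This is a minimal sketch, not Zoom tooling: the function name `vtt_to_lines` is hypothetical, and it assumes the usual WebVTT layout of an optional cue id, a timing line, then one or more text lines.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000".
TIMING = re.compile(
    r"^\s*(\d{2}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(\d{2}:)?\d{2}:\d{2}\.\d{3}"
)


def vtt_to_lines(vtt_text: str) -> list[str]:
    """Return the spoken lines of a WebVTT file, dropping header, cue ids and timings."""
    lines = []
    in_cue = False
    for raw in vtt_text.splitlines():
        line = raw.strip()
        if not line:
            in_cue = False      # a blank line ends the current cue
        elif TIMING.match(line):
            in_cue = True       # the lines that follow are cue text
        elif in_cue:
            lines.append(line)
    return lines
```

Reading the file with `Path("meeting.vtt").read_text(encoding="utf-8")` and joining the result with newlines gives roughly what the TXT export contains.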

Zoom AI Companion

Released 2023, expanded heavily in 2024-2025. AI Companion goes beyond raw transcription: it produces a structured meeting summary, identifies action items, breaks the recording into chapters, and lets you query the recording in plain English ("What did we decide about pricing?"). Available on Pro plans and higher (often included at no extra cost in 2026). Per Zoom's documentation, AI Companion data is processed on Zoom's infrastructure and does not contribute to model training without explicit admin opt-in. For team meetings and sales calls where you want notes plus action items, AI Companion saves real time. For meetings where you don't want any AI to summarize the content, you can disable it per-meeting.

---

Method 1: Zoom cloud transcription (built-in)

How it works

  1. Sign in to zoom.us in a browser.
  2. Settings → Recording → enable Cloud recording.
  3. Within Cloud recording settings, enable Audio transcript.
  4. Schedule and host a meeting normally. Click Record → Record to the Cloud.
  5. End the meeting. Zoom processes for 5-30 minutes. You receive an email with the recording link and transcript file.

What you get

Strengths

Weaknesses

When to use it

Internal team meetings. All-hands. Training videos. Anything where the content is internal, low-sensitivity, and you want zero friction.

---

Method 2: Zoom AI Companion

How it works

Pre-meeting: in your Zoom account settings, enable AI Companion (admin level on business plans, account level on Pro). Per-meeting: as host, click the AI Companion icon during the call to start summarization. Or set it to auto-start on every meeting. After the meeting, AI Companion produces a structured summary that includes:

What it's good for

Sales calls. Customer success interviews. Cross-functional planning meetings. Any meeting where you'd write a summary email afterward — AI Companion drafts that for you in 30 seconds.

Privacy posture

Per Zoom's privacy documentation, AI Companion processes audio on Zoom's infrastructure. Data is not used to train Zoom's AI models without explicit account-admin opt-in. However, audio and transcripts still flow through Zoom's pipeline during processing — the same posture as cloud recording transcription. For sensitive meetings (legal, medical, financial, HR), AI Companion adds value but doesn't change the underlying privacy reality: your audio is processed in Zoom's cloud.

Cost

Included with Pro plans and higher in most regions in 2026. Zero incremental cost beyond your Zoom subscription. Compared to standalone meeting AI tools (Otter Pro $17/mo, Fathom $19/mo, Fireflies $20/mo), AI Companion is essentially free if you already pay for Zoom Pro.

---

Method 3: Local recording + on-device transcription (private)

This is the private path. The audio never leaves your Mac.

The architecture

┌─────────────────────────────────────────────────────────┐
│   YOUR MAC                                              │
│                                                         │
│   ┌──────────┐   ┌────────────┐   ┌─────────────┐       │
│   │ Zoom     │──▶│  M4A file  │──▶│  Whisper    │       │
│   │ (local   │   │  saved to  │   │  large-v3   │       │
│   │ record)  │   │  ~/Zoom/   │   │  on ANE     │       │
│   └──────────┘   └────────────┘   └─────────────┘       │
│        │                                 │              │
│        └─ network: only Zoom call        │              │
│                                          ▼              │
│                                   ┌─────────────┐       │
│                                   │ Plain text  │       │
│                                   │ transcript  │       │
│                                   └─────────────┘       │
│                                                         │
│   ┌───────────── NETWORK BOUNDARY ─────────────┐        │
│   │  ❌ no transcript / audio egress           │        │
│   └────────────────────────────────────────────┘        │
└─────────────────────────────────────────────────────────┘
The Zoom call itself uses the network (you can't eliminate that). But the recording and transcription stay local.

Setup

1. Switch Zoom from Cloud to Local recording

In Zoom Settings → Recording: disable Cloud recording (if enabled), enable Local recording. Choose a folder for saved recordings (default: ~/Documents/Zoom).

2. Record the meeting locally

As host, in any meeting click Record → Record on this Computer. Zoom captures both video (.mp4) and audio (.m4a) to your Mac. End the meeting; Zoom processes the file in 1-3 minutes for a 30-minute call.

3. Find the audio-only file

In your Zoom recordings folder, look for audio_only_*.m4a. This is the cleaner input for transcription — no video processing overhead.
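If you record often, locating the newest audio-only file can be scripted. The sketch below is illustrative only: `find_audio_only` is a hypothetical helper, and it assumes recordings live in per-meeting subfolders under the folder you chose in Zoom's settings. It uses the slightly looser pattern `audio_only*.m4a` so it also catches a file without a numeric suffix.

```python
from pathlib import Path


def find_audio_only(recordings_dir: Path) -> list[Path]:
    """Return audio-only M4A files under recordings_dir, newest first."""
    matches = recordings_dir.rglob("audio_only*.m4a")
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, `find_audio_only(Path.home() / "Documents" / "Zoom")[:1]` returns the most recently modified audio file, or an empty list if nothing has been recorded yet.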

4. Transcribe locally

Drag the M4A into MetaWhisp for batch file transcription, or Whisper Transcription by Good Snooze. Both use Whisper large-v3-turbo on Apple Neural Engine. A 30-minute meeting transcribes in 30-90 seconds.
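If you prefer a scriptable route over a GUI app, the same step can be approximated with the open-source `whisper` command-line tool from the openai-whisper package. This is a swapped-in alternative, not how MetaWhisp or Whisper Transcription work internally; it runs through PyTorch rather than the Apple Neural Engine, so expect it to be slower than the figures above. The helper name `whisper_command` and the default model are assumptions for illustration.

```python
from pathlib import Path


def whisper_command(audio: Path, out_dir: Path, model: str = "large-v3") -> list[str]:
    """Build an argv list for the open-source `whisper` CLI (openai-whisper)."""
    return [
        "whisper", str(audio),
        "--model", model,            # model weights are downloaded on first use
        "--output_format", "txt",    # write a plain-text transcript
        "--output_dir", str(out_dir),
    ]
```

Passing the result to `subprocess.run(cmd, check=True)` runs the transcription. It requires `whisper` and `ffmpeg` on your PATH, and only the one-time model download touches the network.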

5. (Optional) Audit with Little Snitch

If you need compliance proof: install Little Snitch, run a transcription, and observe network activity. With an on-device tool, you should see no outbound connections from the transcription app during the run. This is what HIPAA / NDA compliance teams want to see.
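For a quick spot check without Little Snitch, macOS ships `lsof`, which can list the network sockets a process holds open. The sketch below is a rough point-in-time snapshot, not a replacement for continuous monitoring; the function names are hypothetical, and you would need the transcription app's process id (for example from Activity Monitor).

```python
import subprocess


def remote_endpoints(lsof_output: str) -> list[str]:
    """Extract remote 'host:port' endpoints from `lsof -i -n -P` output."""
    endpoints = []
    for line in lsof_output.splitlines():
        for field in line.split():
            # Connected sockets appear as "local->remote" in the NAME column.
            if "->" in field:
                endpoints.append(field.split("->", 1)[1])
    return endpoints


def outbound_connections(pid: int) -> list[str]:
    """List remote endpoints currently held open by the process `pid`."""
    result = subprocess.run(
        ["lsof", "-i", "-n", "-P", "-a", "-p", str(pid)],
        capture_output=True, text=True, check=False,  # lsof exits 1 when nothing matches
    )
    return remote_endpoints(result.stdout)
```

An empty list while a transcription is running is consistent with on-device processing, though a single snapshot can miss short-lived connections.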

Strengths

Weaknesses

When to use it

Client calls under NDA. Medical or legal consultations. HR conversations. Therapy or counseling sessions. Any meeting where the conversation is sensitive enough that "Zoom's privacy policy" isn't sufficient assurance.

---

Method 4: Third-party meeting tools

Several products integrate with Zoom and add features. They share one common cost: your audio passes through their servers too.

Otter.ai

Joins Zoom meetings as a virtual participant (the now-familiar "Otter.ai has joined" message). Provides real-time transcription, speaker labels (who said what), and a searchable archive. Free tier offers 300 minutes/month. Pro is $16.99/month. Strengths: speaker diarization is solid, searchable cross-meeting transcripts, mobile-friendly. Weaknesses: cloud-based, the bot's presence changes meeting dynamics, audio retained on Otter's servers.

Fireflies.ai

Similar approach to Otter — bot joins meeting, transcribes, summarizes. $19/user/month for Pro. Integrates with CRMs (Salesforce, HubSpot).

Fathom

Free for individual users (paid for teams). Records and summarizes Zoom meetings without joining as a separate bot — uses local app integration. Fast adoption among sales teams.

Read AI

Focuses on sentiment analysis and engagement metrics during meetings, in addition to transcription. Useful for managers analyzing meeting effectiveness.

Common privacy issue

All of these tools require uploading audio (or letting their bot capture it) to their cloud. Each has its own privacy policy, retention period, and data-handling pipeline. None are HIPAA-compatible by default; most offer BAAs as paid add-ons. For meetings where speaker diarization or AI features matter more than maximum privacy, these tools fill a real need. For maximum privacy, they don't.

---

The privacy reality of Zoom transcription

Let me show what actually happens to your audio in each method.
| Method | Where audio goes | Retention | Human review? |
| --- | --- | --- | --- |
| Zoom cloud transcription | Zoom servers (US/EU based on region) | 30 days default; up to years on paid retention | Possible per privacy policy |
| Zoom AI Companion | Zoom servers + AI processing pipeline | Until you delete | Possible per privacy policy |
| Local recording + Whisper local | Your Mac only | You decide | Impossible |
| Otter.ai | Otter servers | Until you delete | Sampled per ToS |
| Fireflies.ai | Fireflies servers | Until you delete | Sampled per ToS |
| Fathom | Fathom servers | Until you delete | Per ToS |

What "human review" actually means

Most cloud meeting tools state in their privacy policies that recordings may be reviewed for service improvement, abuse detection, model improvement, or legal compliance. In practice, this is usually a small sample, often anonymized, often just for QA. But it does happen. For a casual team meeting, none of this matters. For a doctor-patient consultation, a lawyer-client privileged conversation, an HR investigation, or a board meeting discussing M&A — it matters a lot.

HIPAA, NDA, and compliance

Zoom offers a HIPAA Business Associate Agreement (BAA) as a paid add-on for healthcare organizations. With a BAA, Zoom commits to specific data-handling practices that align with HIPAA requirements. Without a BAA, recording PHI with Zoom is a HIPAA violation. For NDA-sensitive business meetings, contractual terms vary. The safest path: don't generate a third-party copy of the audio at all. Use local recording + on-device transcription. The legal exposure is limited to your own device.

The on-device alternative for compliance

When transcription happens entirely on your Mac: This is the cleanest legal posture available for voice transcription in 2026. See our deep-dive on private voice-to-text for Mac for the full framework.

---

All four methods compared

| Criteria | Zoom cloud | AI Companion | Local + Whisper | Otter / Fireflies |
| --- | --- | --- | --- | --- |
| Setup time | 2 min | 5 min | 10 min | 10 min |
| Per-meeting effort | 0 | 0 | 3 min | 0 |
| Privacy | Cloud | Cloud | Local-only | Cloud |
| HIPAA without BAA | No | No | Yes (architecture) | No |
| Speaker labels | Limited | Yes | Limited | Yes |
| Action items / summary | No | Yes | No (or via post) | Yes |
| Search across meetings | Yes | Yes | Manual | Yes |
| Offline capable | No | No | Yes (after model d/l) | No |
| Cost (per user/year) | $180+ for Pro plan | Included Pro+ | $0-30 | $200-240 |
| Languages supported | 30+ | 10+ | 30+ (Whisper) | 3-5 |
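The yearly cost figures follow from the monthly prices quoted elsewhere in this guide ($14.99 for Zoom Pro, $16.99 for Otter Pro, $19 for Fireflies Pro). A small worked check, assuming twelve months of monthly billing with no annual discount:

```python
def annual_cost(monthly_price: float) -> float:
    """Convert a per-user monthly price to a per-user yearly price."""
    return round(monthly_price * 12, 2)


zoom_pro = annual_cost(14.99)       # 179.88, the "$180+" Zoom Pro figure
otter_pro = annual_cost(16.99)      # 203.88, low end of "$200-240"
fireflies_pro = annual_cost(19.00)  # 228.0, inside "$200-240"
```

Annual-billing discounts or regional pricing would shift these numbers, so treat them as rough orientation rather than quotes.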

Decision matrix

---

How to set up private Zoom transcription on Mac

The complete recipe for someone who wants the on-device path. Time to first transcript: ~15 minutes including downloads.
1. Update Zoom settings

Open Zoom on Mac. Settings (⌘,) → Recording. Disable Cloud recording. Enable Local recording. Set "Store my recordings at" to ~/Documents/Zoom or another folder.

2. Choose your transcription tool

Two solid options on Mac:

  • MetaWhisp — free, real-time dictation (system-wide), and now batch file transcription. 30+ languages. On-device by default. Optional cloud at $30/year.
  • Whisper Transcription by Good Snooze — paid one-time, dedicated file-based UI, drag-and-drop batch processing.

For mixing real-time dictation with file transcription, MetaWhisp covers both. For pure file batch work, Whisper Transcription is more focused.

3. Wait for the model download

Both tools download Whisper large-v3-turbo on first run (~1.5 GB). One-time download over the internet. After that, all transcription is fully offline.

4. Test on a short recording

Record a 2-minute test meeting (or use any existing M4A file). Drag into your tool of choice. Verify: transcript appears within ~10 seconds, accuracy looks reasonable (4-7% errors on clean speech), no network activity during transcription (Little Snitch test if you want proof).
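To put a number on "accuracy looks reasonable", you can hand-correct a short stretch of the transcript and compute the word error rate (WER) against it: substitutions, insertions and deletions divided by the number of reference words. A minimal sketch, assuming simple lowercase whitespace tokenization with no punctuation normalization (the function name is hypothetical):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edits needed to turn the reference words seen so far into hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

One wrong word in a 20-word reference gives 0.05, which sits inside the 4-7% band quoted for clean speech. Punctuation attached to words counts as a mismatch here, so strip it first if you want a fairer score.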

5. Build a workflow

For most users: end meeting → wait 1-3 min for Zoom local processing → drag M4A into transcription tool → transcript ready. Total time vs. cloud auto-transcription: ~3 extra minutes per meeting. Worth it for sensitive content.
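If the drag-and-drop step tends to get forgotten, a small script can list recordings that still lack a transcript. This sketch assumes a convention that is not part of any tool mentioned here: each transcript is saved next to its audio file with the same name and a `.txt` extension. The helper name is hypothetical.

```python
from pathlib import Path


def pending_recordings(recordings_dir: Path) -> list[Path]:
    """Return M4A files that have no transcript (.txt) saved next to them."""
    return sorted(
        audio for audio in recordings_dir.rglob("*.m4a")
        if not audio.with_suffix(".txt").exists()
    )
```

Running it at the end of the day gives a short to-do list of files to drop into your transcription tool.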

6. Decide retention

You now have local audio (M4A) and a transcript (TXT or DOCX). Decide your retention policy: keep the disk encrypted with FileVault (confirm it is turned on in System Settings), keep files for X days, delete after, etc. This is a real decision your previous setup made for you (Zoom defaulted to 30 days).
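A retention policy is easier to keep when it is automated. The sketch below finds audio files older than a chosen number of days and deletes them; the function names and the choice to purge only `.m4a` files are assumptions for illustration, and a plain delete is not a forensic wipe.

```python
import time
from pathlib import Path
from typing import Optional


def expired_audio(recordings_dir: Path, max_age_days: float,
                  now: Optional[float] = None) -> list[Path]:
    """Return M4A files whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86_400  # seconds per day
    return sorted(
        p for p in recordings_dir.rglob("*.m4a") if p.stat().st_mtime < cutoff
    )


def purge(paths: list[Path]) -> int:
    """Delete the given files and return how many were removed."""
    for path in paths:
        path.unlink()
    return len(paths)
```

Calling `purge(expired_audio(folder, max_age_days=1))` mirrors a "delete within 24 hours" rule; print the list first if you want a dry run before anything is removed.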

7. (Optional) Document compliance posture

If you work in regulated industries: write a one-page memo describing your transcription pipeline. "Audio recorded locally, transcribed via Whisper large-v3-turbo running on Apple Neural Engine, no third-party data processing." Provide network-activity logs if your compliance officer asks. This conversation is much easier with on-device tools than with cloud tools requiring BAAs.

---

Real-world workflows

These are composite profiles drawn from MetaWhisp users and conversations with people who chose specific Zoom transcription approaches. Names changed.
Alex — Solo Consultant
5-8 client calls/week · NDAs on most

Alex bills $250/hour. Most clients sign NDAs. He used to enable Zoom cloud transcription "for convenience" until a client asked if his transcripts were stored anywhere. He couldn't answer with confidence.

His current flow: Local recording on every client call. After the meeting, drags M4A into MetaWhisp. Reviews transcript, copies key quotes into his client doc, deletes the M4A within 24 hours. Total post-meeting work: 5 minutes.

What changed: Compliance conversations with new clients are dramatically simpler. "I record locally, transcribe locally, delete in 24h" beats "Zoom retains it 30 days, here's a BAA."

Maya — Therapist in Private Practice
HIPAA-bound · Telehealth on Zoom

Maya does telehealth therapy via Zoom. She originally avoided ALL recording because she didn't want PHI on Zoom's servers, even with a BAA. But she missed having session notes for review.

Her flow: Zoom Healthcare Plan with BAA for the call itself. Local recording only with patient consent (rare). When recording, transcribes immediately on Mac with MetaWhisp, generates her session notes, deletes the audio. Audio never goes to any third party.

Compliance posture: Documented. Audited by her practice's compliance lead. Approved.

Dani — VP Sales at SaaS Startup
15-20 calls/week · All discovery / closing

Dani uses cloud transcription extensively. Her sales team needs speaker diarization, action items, and integration with HubSpot. Privacy is not the binding constraint — sales velocity is.

Her stack: Zoom Pro + AI Companion for internal recordings. Otter.ai for external customer calls (better speaker labels, CRM integration). Both routed through their respective clouds.

Why not local: "We're moving too fast for the manual workflow. We accept the privacy tradeoff because we're discussing software pricing, not health records."

Boris — Investigative Journalist
Source interviews · Some hostile environments

Boris interviews sources on sensitive topics, sometimes from countries with hostile press environments. Cloud-based recording is a non-starter — anything that touches a third-party server creates discovery exposure.

His flow: Local recording always. Transcription on his offline-capable Mac after each interview. Audio encrypted with FileVault and isolated to a separate user account on his Mac. Source identities never touch any cloud service.

Why this matters: "The moment audio touches a server, that server's jurisdiction becomes part of my source's exposure. On-device removes that variable."

---

Frequently asked questions

Can Zoom transcribe meetings automatically? Yes. Three native features: live captions during calls (free on all plans), audio transcription of cloud recordings (paid plans with cloud recording), and Zoom AI Companion (Pro+ plans). All three process audio on Zoom's servers. Cloud recordings and their transcripts are kept for 30 days by default; live captions are not stored unless you also record.
How do I enable Zoom transcription on Mac? Sign in to zoom.us → Settings → Recording → enable Cloud recording → enable Audio transcript. Your next cloud-recorded meeting will have a transcript emailed to you within ~30 minutes. For live captions: in any meeting click CC → Show captions.
Is Zoom transcription private? Not by default. Zoom processes audio on their servers, stores recordings and transcripts for 30 days (longer with paid retention), and per their privacy policy, transcripts may be reviewed by Zoom employees for service improvement. For NDA, medical, legal, or HR-sensitive meetings, on-device transcription is the only fully private option.
How do I transcribe a Zoom meeting privately on Mac? Switch Zoom Settings → Recording from Cloud to Local. Record meetings to your Mac. Transcribe the saved M4A file with an on-device tool: MetaWhisp (free, system-wide voice-to-text plus file batch) or Whisper Transcription (paid, dedicated file UI). Audio never leaves your Mac.
Does Zoom AI Companion store my data? Yes. AI Companion processes audio on Zoom's infrastructure and stores meeting summaries in your Zoom account. Per Zoom's documentation, AI Companion data is not used to train Zoom's AI models without explicit admin opt-in, but audio and transcripts still flow through Zoom's pipeline.
What's the most accurate Zoom transcription on Mac? For clean English speech, all major tools achieve a 4-7% word error rate. Zoom native and AI Companion are highly accurate on clear audio. On-device tools using Whisper large-v3-turbo (MetaWhisp, Whisper Transcription) match cloud accuracy after a one-time 1.5 GB model download. Otter.ai and Fathom add speaker diarization, which is harder to do on-device.
Can I transcribe an old Zoom meeting recording on Mac? Yes. Download the recording (zoom.us → Recordings → Download) or use a local one. Drop the MP4 or M4A file into a transcription tool. MetaWhisp processes locally; Whisper Transcription handles batch files; Otter and Rev are cloud-based options with speaker labels. For privacy, use local. For speaker labels, use cloud.
How much does Zoom transcription cost? Basic transcription is included with paid Zoom plans starting at $14.99/user/month. Live captions are free. AI Companion is included on Pro+ tiers. Third-party alternatives: Otter Pro $16.99/month with unlimited transcription; MetaWhisp free for unlimited on-device, $30/year optional cloud features.
What's the difference between Zoom cloud recording and local recording? Cloud recording uploads the meeting to Zoom's servers, generates an automatic transcript, and retains both for 30 days by default. Local recording saves files (MP4 + M4A) directly to the host's Mac, with no Zoom-server storage and no automatic transcript. Local is the privacy-first option.
Can Zoom transcription handle multiple languages? Zoom supports 30+ languages for transcription. Live captions auto-detect for English and major European languages. AI Companion summaries are available in fewer languages. For multilingual meetings (multiple languages in one call), on-device tools using Whisper large-v3-turbo handle code-switching well.
---

About the author

Andrew Dyuzhov

CEO & Solo Founder, MetaWhisp

I record client calls weekly. Half are under NDA. The rest are general-purpose. I built MetaWhisp partly because I couldn't find a clean, fast, on-device path for transcribing Zoom recordings without involving a third-party server.

This guide reflects the actual workflow I use, plus conversations with consultants, therapists, journalists, and sales leaders who picked different paths for different reasons. There is no single right answer — privacy posture, speaker diarization, AI summarization, and cost all trade off against each other.

What MetaWhisp adds for Zoom users on Mac:

  • Drag any M4A or MP4 from Zoom local recordings → instant on-device transcription
  • System-wide voice-to-text in any app (not just inside Zoom or Word)
  • 30+ languages with auto-detection
  • Free for unlimited local use; $30/year optional cloud features
  • HIPAA-compatible by architecture (no data transmitted)

If something in this guide is wrong or your workflow looks different, email me. Follow the build journey on X (@hypersonq).
