๐Ÿฅ๐Ÿ”’
HIPAA Speech-to-Text Reality Check
Cloud STT tools needing BAA: Google, AWS, Azure, Otter
Already non-compliant by default: Wispr Flow, OpenAI Whisper API
On-device Whisper (no BAA needed): MetaWhisp, SuperWhisper local
Avg HIPAA breach fine 2024: $156,000
TL;DR: HIPAA compliance for speech-to-text requires a Business Associate Agreement (BAA) with any vendor that processes electronic protected health information (ePHI) in their cloud. Google Speech-to-Text, AWS Transcribe Medical, Microsoft Azure Cognitive Services, and Otter.ai Business tier all offer BAAs. Consumer apps (Wispr Flow, the OpenAI Whisper API, and standard Otter) don't. On-device Whisper apps like MetaWhisp and SuperWhisper's local mode skip the BAA requirement entirely because no ePHI ever leaves the Mac. For solo practitioners and small clinics, on-device transcription is the simplest path to HIPAA-safe dictation: no ongoing vendor agreements, no vendor-side audit logs to maintain, and no third-party breach exposure.
HIPAA speech-to-text compliance flowchart comparing cloud vendor BAA path vs on-device Whisper air-gapped path for Mac healthcare workflows

What Makes Speech-to-Text HIPAA-Compliant?

HIPAA, the Health Insurance Portability and Accountability Act of 1996, defines two technical safeguards that any speech-to-text tool must satisfy when processing electronic protected health information (ePHI): access controls and transmission security, per the HHS Security Rule. In practice, that means three things. The audio stream containing patient names, diagnoses, or treatment details must be encrypted in transit (TLS 1.2+). The transcribed text must be encrypted at rest if it sits on vendor servers. And the vendor processing the audio must sign a Business Associate Agreement (BAA) committing to HIPAA obligations.
The BAA is the load-bearing legal document. Without it, any vendor that touches ePHI is in technical violation, and the covered entity (the clinic, hospital, therapist, or solo practitioner) inherits the breach liability. Average HIPAA breach fines in 2024 reached $156,000 per incident, according to the HHS Office for Civil Rights breach portal. Some of the largest enforcement actions exceeded $5 million.
I'm Andrew Dyuzhov, solo founder of MetaWhisp, the free on-device voice-to-text app for macOS. We get compliance questions weekly from therapists, dental hygienists, primary-care physicians, and physical therapists. This guide explains what HIPAA actually requires of speech-to-text tools, which Mac apps meet that bar, and why on-device transcription sidesteps most of the compliance burden.
HIPAA does not certify or pre-approve specific software. The phrase "HIPAA-compliant" is a vendor claim, not a regulator designation issued by HHS. What HIPAA actually requires, per the consolidated HIPAA regulations, is that any business associate handling ePHI sign a BAA and implement the administrative, physical, and technical safeguards in 45 CFR §164.308-312. For speech-to-text specifically, the relevant safeguards are encryption of ePHI in transit and at rest under 45 CFR §164.312(a)(2)(iv) and §164.312(e)(2)(ii), access controls to restrict who can read transcripts, and audit logs of access events. Cloud STT vendors that offer BAAs (Google Cloud HIPAA tier, AWS Transcribe Medical, Microsoft Azure Speech under EA, and Otter Business) implement these controls in their HIPAA-eligible service tiers but require explicit opt-in. Free or consumer tiers usually do not. On-device tools avoid the question entirely by not transmitting ePHI to any vendor at all, which is why MetaWhisp and similar locally-run apps are the simplest compliance path.
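For readers who do integrate a cloud API, the "TLS 1.2+" transit requirement can be pinned in client code rather than left to defaults. The following is a minimal sketch using Python's standard `ssl` module; the helper name `make_tls_context` is hypothetical, and a real integration would pass this context to whatever HTTP client the vendor SDK uses.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject TLS 1.0 and 1.1 handshakes outright.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls_context()
```

Setting the floor explicitly makes the transit safeguard auditable in code review instead of depending on the runtime's default.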

Why On-Device Speech-to-Text Skips HIPAA's BAA Requirement

When speech-to-text runs locally on a Mac, the audio is processed inside the operating system's memory by a model that lives on the user's disk. No data crosses the network. There is no "business associate" because no third party touches the ePHI. The clinician's Mac is the covered entity's own device, governed by the clinic's existing HIPAA policies (locked-screen requirements, FileVault encryption, MDM enrollment), not by external vendor contracts. This is the same logic that lets a clinician use Microsoft Word locally on their Mac without Microsoft signing a BAA for every patient note. Microsoft Word doesn't transmit the document content. The transcript stored locally is no different from any other clinical note on the device's encrypted disk.
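Pro tip: HIPAA defines a business associate as a person or entity that creates, receives, maintains, or transmits PHI on behalf of a covered entity. A vendor whose software runs entirely on the covered entity's own systems, with no ePHI transmitted to the vendor's cloud, does not fit that definition and does not require a BAA. (This is separate from the narrow "conduit exception," which covers pure transmission services such as couriers and internet service providers.) See the HHS business associate guidance for the regulatory framing.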
The practical implication for solo practitioners: on-device voice-to-text apps like MetaWhisp, SuperWhisper's local mode, and command-line whisper.cpp let you dictate clinical notes without negotiating a single vendor agreement. You configure FileVault, you enable auto-lock, you use a strong password, and your existing device-level HIPAA controls cover the transcription workflow.
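Since the on-device path leans on FileVault, it can help to check its state programmatically, for example in a periodic self-audit. The sketch below is only an illustration: it assumes the macOS `fdesetup status` command, which prints a line such as "FileVault is On.", and both function names are hypothetical.

```python
import subprocess


def filevault_is_on(status_output: str) -> bool:
    """Interpret the text printed by `fdesetup status`."""
    return "FileVault is On" in status_output


def read_filevault_status() -> str:
    """Run `fdesetup status` (macOS only) and return what it printed."""
    result = subprocess.run(
        ["fdesetup", "status"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

On a Mac, `filevault_is_on(read_filevault_status())` returns a boolean that can be logged alongside other device checks; keeping the parser separate from the subprocess call lets it be tested on any platform.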

Which Mac Speech-to-Text Apps Offer HIPAA BAAs?

Among cloud-based speech-to-text vendors that ship Mac clients or APIs accessible from Mac, exactly four offer Business Associate Agreements as of May 2026. The rest do not, and using them with ePHI is a HIPAA violation.
| Vendor | BAA Available? | Tier Required | Cost |
|---|---|---|---|
| Google Cloud Speech-to-Text | Yes | HIPAA-eligible tier (sign BAA) | $0.016/min audio |
| AWS Transcribe Medical | Yes (default) | Medical service tier | $0.075/min audio |
| Microsoft Azure Speech Service | Yes | Enterprise tier (sign BAA) | $0.017/min audio |
| Otter.ai Business | Yes (signed BAA on request) | Business tier ($30/user/mo) | $30/user/mo |
| OpenAI Whisper API | No | - | $0.006/min audio (consumer) |
| Wispr Flow | No | - | $15/mo (consumer) |
| Apple Dictation | N/A (offline mode on M1+) | "Enhanced Dictation" off | $0 (built into macOS) |
Google, AWS, and Microsoft all support BAAs for their speech-to-text services, but only on specific tiers. Google's free tier and any service not explicitly listed as "HIPAA-eligible" cannot legally process ePHI, per Google Cloud's HIPAA documentation. AWS Transcribe Medical is HIPAA-eligible by default at all tiers; the regular Amazon Transcribe is not, per the AWS HIPAA eligible services list. Microsoft Azure's Speech Service is HIPAA-eligible under the Enterprise Agreement, per Microsoft's HIPAA compliance offering. Otter.ai's Business tier offers a signed BAA on request. Otter's free Basic and paid Pro consumer tiers do not. If you're using regular Otter to transcribe client therapy sessions, you are in technical HIPAA violation. OpenAI Whisper API and Wispr Flow do not offer BAAs at any tier. Both are consumer-focused services with terms that explicitly disclaim HIPAA suitability. Using them with ePHI exposes the covered entity to breach liability.
The four BAA-offering vendors split into two cost tiers that drive the practical choice for small clinics. The "cheap-and-eligible" tier (Google Cloud HIPAA Speech-to-Text and Microsoft Azure Speech under EA) runs around $0.016-0.017 per minute of audio with general-purpose vocabulary. For a solo therapist doing 10 hours of session dictation weekly, that's about $40 per month plus the engineering overhead of integrating the cloud API into a workflow. The "medical-specific" tier, AWS Transcribe Medical, costs $0.075 per minute and ships with PHI entity detection, ICD-10 vocabulary tuning, and HealthLake integration; overkill for solo practice but well-positioned for hospitals. Otter.ai Business at $30 flat per user per month is the simplest UX if you already use Otter, but its consumer-style transcript storage means you must verify retention policies match clinic record-keeping requirements before signing the BAA, per Otter's security documentation.
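The "about $40 per month" figure follows from simple arithmetic, sketched here so readers can plug in their own dictation volume. The per-minute rates are the ones quoted in this article; the 52/12 weeks-per-month conversion and the function name are assumptions of this sketch.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length

# Per-minute rates as quoted in this article.
RATES_PER_MINUTE = {
    "Google Cloud Speech-to-Text": 0.016,
    "Microsoft Azure Speech Service": 0.017,
    "AWS Transcribe Medical": 0.075,
}


def monthly_cost(hours_per_week: float, rate_per_minute: float) -> float:
    """Estimate the monthly bill for a given weekly dictation volume."""
    minutes_per_month = hours_per_week * 60 * WEEKS_PER_MONTH
    return round(minutes_per_month * rate_per_minute, 2)


# 10 hours of dictation per week, as in the solo-therapist example.
costs = {name: monthly_cost(10, rate) for name, rate in RATES_PER_MINUTE.items()}
```

At 10 hours per week this works out to roughly 2,600 minutes per month, so about $41.60 on Google, $44.20 on Azure, and $195 on AWS Transcribe Medical, compared with Otter's flat $30 per user.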
HIPAA-eligible speech-to-text vendor comparison table showing Google AWS Azure Otter Business and on-device MetaWhisp Mac compliance options

Is Apple's Built-In Dictation HIPAA-Compliant?

Apple's built-in macOS Dictation runs in two modes. The default "Standard Dictation" sends audio snippets to Apple's servers for processing, then returns text. The "Enhanced Dictation" mode (available on Apple Silicon Macs running macOS 13+) processes audio entirely on-device using Apple's Neural Engine. For HIPAA purposes, Enhanced Dictation is fine; Standard Dictation is not. Apple does not offer a Business Associate Agreement for its consumer services, including iCloud, Siri, and Standard Dictation. When dictation audio leaves the Mac to be processed in Apple's cloud, that constitutes ePHI transmission to a third party without a BAA, which is a HIPAA violation.
To use Apple Dictation HIPAA-compliantly on a Mac, you must explicitly enable on-device processing. Open System Settings, navigate to Keyboard, scroll to Dictation, and toggle "Use Enhanced Dictation" or its equivalent in your macOS version. Apple's official documentation confirms that this mode runs entirely on-device with no audio leaving the Mac. Once enabled, the dictation engine downloads a local language model (around 800 MB) and processes all audio through Apple's Neural Engine. The Apple-server fallback path is disabled. This satisfies HIPAA's transmission security requirement because no ePHI ever crosses the network during normal operation. Note that Enhanced Dictation has lower accuracy than third-party Whisper-based apps: roughly 11-14% word error rate (WER) on accented English versus 3.7% for Whisper large-v3-turbo, per OpenAI's published benchmarks. For clinical accuracy, where misheard medication names or dosages have direct patient-safety implications, a third-party Whisper app delivers materially better transcripts at the cost of a one-time install.
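Word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the reference length. The sketch below computes it with a standard edit-distance table so you can measure any dictation tool on your own sample notes; it is a generic illustration, not the scoring script behind the figures quoted above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "take fifty milligrams twice daily",
    "take fifteen milligrams twice daily",
)
```

Here one of five words is wrong, so `wer` is 0.2, yet the single error changes the dose. That is why a headline WER number understates the stakes for clinical dictation.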

Why Wispr Flow Cannot Be Used for HIPAA Workflows

Wispr Flow is a popular Mac voice-to-text tool that has gained traction in 2025-2026 for its low-friction global hotkey UI. However, Wispr Flow's architecture and business model make it incompatible with HIPAA for clinical workflows. Three specific issues stand out: it does not offer a BAA at any tier, it operates as a consumer service with cloud processing by default, and its terms permit subcontracting transcription to third-party providers without disclosure. For a deeper look at the architectural and pricing tradeoffs, see our MetaWhisp vs Wispr Flow comparison and the M3 battery benchmark. For HIPAA-bound workflows, Wispr Flow is not a viable option.

Does OpenAI Whisper API Support HIPAA?

The OpenAI Whisper API, the cloud endpoint that runs Whisper large-v3 on OpenAI's servers, does not offer a BAA. OpenAI's business terms and enterprise privacy documentation explicitly state that the API is not designed for HIPAA workflows. The Enterprise tier offers a Zero Data Retention policy, but that policy does not constitute a BAA. This is a critical distinction because many third-party Mac apps use the OpenAI Whisper API as their backend. SuperWhisper's cloud-hybrid mode, several Electron-based dictation tools, and various web-based transcription services all route audio to OpenAI behind the scenes. Using any tool that proxies to the OpenAI Whisper API for ePHI is a HIPAA violation unless OpenAI signs a BAA, which it currently does not.
For HIPAA-bound healthcare professionals on Mac, the cleanest path is locally-running Whisper variants: large-v3-turbo via MetaWhisp's Core ML compilation, SuperWhisper's "local mode" with cloud disabled, or raw whisper.cpp run from the command line. All three keep audio on-device with no network transmission, sidestepping the BAA requirement entirely.

How On-Device Whisper Architecture Satisfies HIPAA

MetaWhisp's on-device transcription runs Whisper large-v3-turbo locally via Apple's Core ML framework, dispatching inference to the Neural Engine (ANE). The audio flow:
  1. Microphone captures voice via AVAudioEngine (Apple's local audio framework)
  2. Audio buffers are passed to Core ML's MLModel running Whisper large-v3-turbo as a compiled .mlpackage
  3. The Neural Engine performs inference, producing text tokens
  4. Text is written to a local SQLite database in ~/Library/Application Support/MetaWhisp/
  5. The user can copy, save, or delete the transcript via MetaWhisp's UI
No step in this pipeline transmits data over the network. The Whisper model lives on the user's disk (around 809 MB), the audio buffers stay in process memory, and the transcripts are written to the local file system. macOS's FileVault, if enabled by the clinic's MDM policy, encrypts the entire disk at rest. The clinic's existing HIPAA technical safeguards (auto-lock, strong passwords, MFA on the device) cover the transcription workflow.
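Steps 4 and 5 of the pipeline above, writing a transcript to a local SQLite file and letting the user delete it, can be sketched in a few lines. This is an illustration only: the table name, columns, and function names are hypothetical and are not MetaWhisp's actual schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a local transcript database at the given path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts ("
        "id INTEGER PRIMARY KEY, created_at TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def save_transcript(conn: sqlite3.Connection, body: str) -> int:
    """Insert one transcript and return its row id."""
    cur = conn.execute(
        "INSERT INTO transcripts (created_at, body) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), body),
    )
    conn.commit()
    return cur.lastrowid


def delete_transcript(conn: sqlite3.Connection, transcript_id: int) -> None:
    """Remove a transcript at the user's request."""
    conn.execute("DELETE FROM transcripts WHERE id = ?", (transcript_id,))
    conn.commit()


# ":memory:" keeps this demo off the disk; a real app would pass a file path.
conn = open_store(":memory:")
note_id = save_transcript(conn, "Example dictated note.")
```

Nothing here opens a socket: the database is an ordinary file, so its at-rest protection comes from FileVault rather than from the app.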
Apple's Neural Engine (ANE) is designed for sustained on-device inference with full data isolation, per Apple's Core ML documentation. The model weights stored in the .mlpackage file are never transmitted anywhere; the activations are flushed from ANE memory after each inference batch; and the audio input buffers exist only in the calling app's process address space. This makes ANE-based Whisper inference fundamentally different from cloud APIs that transmit raw audio to remote servers for processing. For HIPAA purposes, on-device ANE inference is equivalent to running any other clinical software locally: the device, not a vendor, processes the ePHI. The same compliance posture applies whether you use MetaWhisp, SuperWhisper local mode, or raw whisper.cpp compiled with Metal acceleration. The architectural property that matters is the absence of network egress for the audio stream; everything else is implementation detail. The clinic's HIPAA risk analysis under 45 CFR §164.308(a)(1) need only verify the chosen app has no cloud upload path.
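One rough way to spot-check the "no cloud upload path" claim during a risk analysis is to list a running app's open network sockets while you dictate. The sketch below assumes the standard `lsof -nP -i -a -p <pid>` invocation and its usual nine-column layout, where NODE holds the protocol and NAME holds the address; the function names are hypothetical, and a clean result at one moment is evidence, not proof.

```python
import subprocess


def parse_lsof_sockets(lsof_output: str) -> list[tuple[str, str]]:
    """Extract (protocol, address) pairs from `lsof -i` output."""
    sockets = []
    for line in lsof_output.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) >= 9:
            # Column 8 is NODE (TCP/UDP), column 9 is NAME (the address).
            sockets.append((fields[7], fields[8]))
    return sockets


def list_sockets(pid: int) -> list[tuple[str, str]]:
    """List open internet sockets for one process (requires lsof)."""
    result = subprocess.run(
        ["lsof", "-nP", "-i", "-a", "-p", str(pid)],
        capture_output=True, text=True, check=False,
    )
    return parse_lsof_sockets(result.stdout)


# Sample output with made-up values, for illustration only.
SAMPLE = (
    "COMMAND  PID USER  FD  TYPE DEVICE SIZE/OFF NODE NAME\n"
    "Example 1234 user 12u  IPv4 0xabc123      0t0  TCP "
    "192.168.1.5:52344->203.0.113.10:443 (ESTABLISHED)\n"
)
found = parse_lsof_sockets(SAMPLE)
```

An empty list from `list_sockets(pid)` during active dictation is consistent with on-device processing; any outbound connection to port 443 deserves a closer look with a proper network monitor.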
On-device Whisper architecture diagram showing local audio processing through Apple Neural Engine without network transmission for HIPAA compliance on Mac

What Happens If a HIPAA Speech-to-Text Tool Is Misconfigured?

Misconfiguration is the most common path to HIPAA violations in transcription workflows. The HHS Office for Civil Rights maintains a public breach portal listing reported incidents; review it for the specific case patterns relevant to your practice. The common failure modes are predictable and share a common thread: vendors offer HIPAA-eligible tiers, but the user must explicitly enable them and sign the BAA. Staying on the free or consumer tier is the default failure mode. On-device tools eliminate this risk because there is no tier configuration to get wrong.
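A clinic that scripts its own cloud integration can guard against the wrong-tier failure with a pre-flight check that refuses to send audio unless both conditions hold. The sketch below encodes only the rule stated in this section; the `SttConfig` type and its fields are hypothetical, and the flags must reflect what your contract actually says.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttConfig:
    """Hypothetical description of one transcription setup."""
    vendor: str
    on_device: bool
    hipaa_eligible_tier: bool = False
    baa_signed: bool = False


def may_process_ephi(cfg: SttConfig) -> bool:
    """Allow ePHI only on-device, or in the cloud with eligible tier and BAA."""
    if cfg.on_device:
        return True  # no vendor receives the audio, so no BAA is involved
    return cfg.hipaa_eligible_tier and cfg.baa_signed


local = SttConfig(vendor="MetaWhisp", on_device=True)
wrong_tier = SttConfig(vendor="Otter.ai", on_device=False, baa_signed=True)
```

Calling `may_process_ephi(wrong_tier)` returns `False` because a signed BAA does not help if the account sits on a tier the BAA does not cover.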

HIPAA Speech-to-Text on Mac: 30-Second Decision Tree

For most therapists, dental hygienists, primary-care physicians, physical therapists, and other solo or small-practice clinicians, the on-device path is the simplest, cheapest, and most compliant option. The four cloud BAA vendors are only worth the overhead if you need multi-user collaboration, medical-vocabulary tuning, or integration with existing EHR systems via cloud APIs. Lawyers handling protected health information for personal-injury or medical-malpractice cases face the same BAA framework via HIPAA's "business associate of a business associate" provisions. The on-device path applies equally there, and therapists handling session notes can review our dedicated mental-health dictation guide for clinical-vocabulary specifics. For an architectural deep-dive on why local AI models on MacBook deliver the privacy properties HIPAA assumes, see our dedicated explainer.
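The decision rule in the paragraph above fits in one small function. This sketch mirrors only the three criteria named there (multi-user collaboration, medical-vocabulary tuning, and EHR integration via cloud APIs); the function and parameter names are hypothetical, and the output is a starting point for discussion with your privacy officer, not a determination.

```python
def recommend_path(
    needs_multi_user: bool,
    needs_medical_vocabulary: bool,
    needs_ehr_cloud_integration: bool,
) -> str:
    """Pick between the on-device path and a cloud vendor with a BAA."""
    if needs_multi_user or needs_medical_vocabulary or needs_ehr_cloud_integration:
        return "cloud vendor with signed BAA"
    return "on-device transcription"


solo_therapist = recommend_path(False, False, False)
hospital_team = recommend_path(True, True, True)
```

A solo therapist with none of the three needs lands on the on-device path, while a hospital team needing any one of them is routed to a BAA-backed cloud vendor.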
HIPAA-compliant speech-to-text decision tree for Mac users showing on-device versus cloud vendor BAA path based on collaboration and vocabulary needs

Frequently Asked Questions About HIPAA Speech-to-Text on Mac

Is on-device Whisper transcription HIPAA-compliant on Mac?

Yes, when run entirely on-device with no network transmission of audio or transcripts. Apps like MetaWhisp that use Apple Neural Engine for local Whisper inference do not transmit ePHI to any vendor, so no Business Associate Agreement is required under HIPAA. The clinic's existing device-level safeguards (FileVault encryption, auto-lock, strong passwords) cover the workflow. This is the simplest HIPAA-safe path for solo practitioners and small clinics.

Does Otter.ai support HIPAA workflows?

Otter.ai Business tier ($30/user/month) offers a signed Business Associate Agreement on request, making it HIPAA-eligible. The free Basic tier and the paid Pro tier do not offer BAAs and are not HIPAA-compliant for ePHI transcription. If you use the wrong tier with patient audio, you are in technical HIPAA violation. Confirm in writing that your account is on Business tier with BAA executed before transcribing any clinical content.

Can I use the OpenAI Whisper API for medical transcription?

No. The OpenAI Whisper API does not offer a Business Associate Agreement at any tier as of 2026, including the Enterprise tier with Zero Data Retention. Any tool that proxies audio to the OpenAI Whisper API, including some third-party Mac dictation apps, is not HIPAA-compliant for ePHI. Use locally-running Whisper variants instead (MetaWhisp, SuperWhisper local mode, or raw whisper.cpp) to keep audio on-device.

Is Wispr Flow HIPAA-compliant?

No. Wispr Flow does not offer a Business Associate Agreement and operates as a consumer service with default cloud processing. Its terms permit subcontracting transcription to third-party providers without disclosure. Using Wispr Flow for any audio containing patient names, diagnoses, or treatment details exposes the covered entity to HIPAA breach liability. For Mac-based HIPAA-bound workflows, choose on-device alternatives like MetaWhisp instead.

What is a Business Associate Agreement (BAA)?

A BAA is a contract required by HIPAA between a covered entity (clinic, hospital, therapist) and any vendor that processes electronic protected health information (ePHI). It commits the vendor to HIPAA's privacy and security obligations (encryption, breach notification, access controls) and establishes liability if the vendor mishandles data. Without a BAA, any vendor touching ePHI puts the covered entity in violation of HIPAA, regardless of the vendor's actual security practices.

Does Apple offer a HIPAA BAA for macOS Dictation?

No. Apple does not offer a Business Associate Agreement for its consumer services including macOS Standard Dictation, Siri, and iCloud. However, Apple Enhanced Dictation runs entirely on-device using the Neural Engine on Apple Silicon Macs (macOS 13+), so no ePHI leaves the device. Enhanced Dictation is HIPAA-safe without a BAA, but only if explicitly enabled in System Settings to disable the cloud fallback path.

How much does HIPAA-compliant speech-to-text cost on Mac?

On-device options are free or have one-time costs: MetaWhisp is free, Apple Enhanced Dictation is free, and raw whisper.cpp is free. Cloud-based HIPAA-eligible options run $0.016-0.075 per minute of audio: Google Cloud Speech-to-Text HIPAA tier ($0.016/min), Microsoft Azure Speech Service ($0.017/min), AWS Transcribe Medical ($0.075/min). Otter.ai Business is $30/user/month flat. For solo practitioners under 10 hours of dictation weekly, on-device is dramatically cheaper.

What's the average HIPAA fine for using non-compliant speech-to-text?

The average HIPAA breach fine in 2024 was $156,000 per incident, per the HHS Office for Civil Rights breach portal. Recent enforcement actions specifically targeting transcription misconfigurations have ranged from $385,000 (small dental practice) to $2.1 million (hospital radiology group). The financial exposure significantly exceeds the cost of using a properly licensed BAA-backed cloud vendor, and on-device tools eliminate the breach surface entirely.

About the Author

Andrew Dyuzhov is the solo founder and CEO of MetaWhisp, a free, open-source, on-device voice-to-text app for macOS that runs Whisper large-v3-turbo locally via WhisperKit. He built MetaWhisp on an on-device architecture specifically because privacy-bound users in healthcare, law, and confidential business need transcription that never sends audio to a third party. This article is for general informational purposes and does not constitute legal advice; consult your privacy officer or attorney for organization-specific HIPAA compliance decisions. Connect on X or GitHub.

Related Reading