
Best Local Whisper Model for 8GB MacBook Air

Published sizes · RAM needs · Real M1 Air test

Short answer: OpenAI's Whisper family ships in five official sizes plus a turbo variant, but on an 8GB MacBook Air only a few fit comfortably once macOS has taken its share of RAM. Whisper large-v3-turbo is the best balance: close to large-v3 accuracy at roughly half the size. Medium is the underrated runner-up. Small is the safety net. Full large-v3 technically loads but starts swapping the moment you open a second app.

This guide is built on published model files from OpenAI, the WhisperKit benchmarks, and my own daily use of an 8GB M1 MacBook Air. If you want a no-setup shortcut, MetaWhisp ships the turbo model out of the box.

[Figure: MacBook Air 8GB unified memory diagram showing how Whisper model size determines whether the system stays in RAM or starts swapping]

Why model size matters on an 8GB MacBook Air

The 8GB MacBook Air is one of Apple's most popular laptops, and it's also the configuration that runs into the most Whisper headaches. Whisper is a memory-hungry model. Unlike apps that stream from disk, an ASR model has to keep its weights in RAM so the CPU, GPU, and Apple Neural Engine can read them fast. On Apple Silicon, all three share the same unified memory pool, which means your model, macOS, your browser tabs, and your chat client are all fighting over the same 8GB.

The practical floor: after macOS takes its share, you've got a few GB left for actual work. WhisperKit and CoreML add their own overhead on top of the model. So if a model weighs a few GB on disk, the working set during a transcription grows well beyond the file size, which is exactly the territory where macOS starts swapping to SSD. Swap on an Air means dropped audio buffers and transcription that feels sluggish instead of instant.

The fix isn't "buy a 16GB Air", though that does help. The fix is picking a Whisper model whose published size leaves real headroom for the rest of your machine.

What published sizes do each Whisper model actually have?

OpenAI published five sizes in the original Whisper repo, then added a turbo variant (a pruned, fine-tuned large-v3) in late 2024. Here is the family, with the numbers straight from the openai/whisper README and the Hugging Face model card for the turbo:
| Model | Parameters | Published size | Repo source |
|---|---|---|---|
| tiny | 39M | ~75 MB | openai/whisper |
| base | 74M | ~142 MB | openai/whisper |
| small | 244M | ~466 MB | openai/whisper |
| medium | 769M | ~1.42 GB | openai/whisper |
| large-v3 | 1.55B | ~2.83 GB | openai/whisper |
| large-v3-turbo | 809M | ~1.5 GB | openai/whisper-large-v3-turbo |

Two things stand out. First, large-v3-turbo keeps the encoder layers from large-v3 and shrinks the decoder (the model card describes it as a pruned large-v3 with the decoder cut from 32 layers to 4). That's why it's roughly half the size of the full large-v3 but stays much closer in accuracy than the parameter count alone suggests. Second, there is a big jump between medium (1.42GB) and large-v3 (2.83GB) that no other model in the family fills. That is the jump that decides whether you swap or not.
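If you want to sanity-check the table yourself, here is a small Python sketch that divides each published size by its parameter count and compares file sizes. Two assumptions are mine, not OpenAI's: I treat 1 GB as 1000 MB, and the `MODELS` dictionary is just the table above typed in by hand. The result works out to a little under two bytes per parameter across the family, which is consistent with 16-bit weights, though that is my inference rather than a number from the README.

```python
# Published figures from the table above: (parameters in millions, size in MB).
# Assumption: 1 GB is treated as 1000 MB for this back-of-envelope check.
MODELS = {
    "tiny": (39, 75),
    "base": (74, 142),
    "small": (244, 466),
    "medium": (769, 1420),
    "large-v3": (1550, 2830),
    "large-v3-turbo": (809, 1500),
}


def bytes_per_param(name: str) -> float:
    """Published size / parameter count (MB per million params = bytes per param)."""
    params_m, size_mb = MODELS[name]
    return size_mb / params_m


def size_ratio(a: str, b: str) -> float:
    """How large model `a` is relative to model `b`, by published file size."""
    return MODELS[a][1] / MODELS[b][1]


if __name__ == "__main__":
    for name in MODELS:
        print(f"{name:15s} {bytes_per_param(name):.2f} bytes/param")
    print(f"turbo vs large-v3: {size_ratio('large-v3-turbo', 'large-v3'):.2f}x")
    print(f"large-v3 vs medium: {size_ratio('large-v3', 'medium'):.2f}x")
```

The two ratios at the end are the "roughly half" and "big jump" claims expressed as numbers: the turbo is about 0.53x the size of large-v3, and large-v3 is about twice the size of medium.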

What published Whisper model sizes should I trust?
The numbers above come directly from OpenAI's whisper repository README and the Hugging Face model card for the turbo. Treat these as the floor, not the ceiling: CoreML-converted versions (which WhisperKit uses) are similar in size, and the working RAM footprint is somewhat larger than the published file size once you add buffers and KV cache. The ~1.5 GB listed for large-v3-turbo is the on-disk (uncompressed) size; the actual download is smaller (roughly 950 MB) because it ships compressed, then unpacks to that on-disk size.

Pro tip: Model size on disk is not the same as RAM footprint. Plan for somewhat more than the .pt size as a working-set estimate on macOS — CoreML graphs, audio buffers, and KV cache all sit next to the weights.
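To make that tip concrete, here is a rough Python sketch of the arithmetic. Both knobs are placeholders I picked for illustration, not measured values: `overhead_factor=1.5` stands in for "somewhat more than the file size", and the 4 GB system baseline stands in for "macOS takes its share". Swap in whatever Activity Monitor shows on your own machine.

```python
def working_set_gb(file_size_gb: float, overhead_factor: float = 1.5) -> float:
    """Estimate RAM used while transcribing.

    overhead_factor is a hypothetical multiplier covering CoreML graphs,
    audio buffers, and KV cache. It is an assumption, not a published figure.
    """
    return file_size_gb * overhead_factor


def headroom_gb(
    total_ram_gb: float,
    system_baseline_gb: float,
    file_size_gb: float,
    overhead_factor: float = 1.5,
) -> float:
    """RAM left over after the system baseline and the model's working set.

    A negative result means the estimate exceeds physical RAM, which is
    where swapping would start.
    """
    return total_ram_gb - system_baseline_gb - working_set_gb(file_size_gb, overhead_factor)


if __name__ == "__main__":
    # Published sizes from the table above; the 4.0 GB baseline is assumed.
    for name, size_gb in [("medium", 1.42), ("large-v3-turbo", 1.5), ("large-v3", 2.83)]:
        left = headroom_gb(total_ram_gb=8.0, system_baseline_gb=4.0, file_size_gb=size_gb)
        print(f"{name:15s} estimated headroom: {left:+.2f} GB")
```

With those placeholder values the turbo and medium keep a positive margin while full large-v3 goes negative. The exact numbers will move with your workload; the point is only that the margin shrinks much faster than the file size suggests.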

How much RAM does each Whisper model really need?

There is no official "minimum RAM" spec for Whisper. What exists instead is community-reported behavior across GitHub issues and the WhisperKit benchmark table. The rule of thumb I have come to trust: budget somewhat more than the published file size for the working set, then pick the largest model that still leaves real headroom for macOS and your other apps.

The reason large-v3 struggles on 8GB Airs isn't really "it doesn't load"; modern macOS will let it load and run. The reason is that with a browser, a chat client, and Mail open, the model starts competing with macOS's compressed-memory pool. SSD swap helps, but a MacBook Air's SSD is slower than a Pro's, and once you start swapping audio buffers, transcription latency rises from "feels instant" to "I'll wait."

[Figure: Whisper model RAM pressure zones for an 8GB MacBook Air, showing comfort, tight, and swap thresholds]

Where does WhisperKit fit in this story?

WhisperKit is argmaxinc's open-source Swift package that wraps Whisper and runs it through Apple's CoreML stack on the Apple Neural Engine. It is what makes local Whisper practical on Apple Silicon in the first place — without it, you would be running PyTorch on CPU, which is substantially slower than the CoreML path on the Apple Neural Engine. Two things matter for an 8GB Air:
  1. WhisperKit ships pre-converted CoreML models. You do not compile the .pt files yourself. You download a folder that is already optimized for ANE, which is faster to load and uses less peak memory than a freshly-converted model.
  2. WhisperKit publishes its own benchmark table for accuracy and speed across model variants. That is the reference I would trust over a one-off blog post, and it is worth reading before you commit to a model.
If you are building your own app, WhisperKit is the right starting point. If you are just trying to dictate, you want something that already uses WhisperKit under the hood so you don't have to manage the CoreML conversion yourself.

Why does WhisperKit matter for an 8GB MacBook Air?
WhisperKit moves Whisper off the CPU and onto the Apple Neural Engine, which is dramatically faster and uses less peak RAM than a PyTorch CPU fallback. For 8GB Air owners specifically, the win is twofold: faster transcription (so the model is in memory for less time) and lower working-set overhead (so macOS has more headroom for everything else).

[Figure: WhisperKit on Apple Neural Engine versus PyTorch CPU path comparison diagram]

Which Whisper models actually run on 8GB of unified memory?

The short answer: tiny, base, small, medium, and large-v3-turbo all run cleanly on an 8GB MacBook Air. Full large-v3 loads but causes swap under normal workloads. distil-large-v3 also works (similar footprint to large-v3-turbo). For most people, the realistic choices are large-v3-turbo, medium, and small.

Here is the breakdown I would actually recommend:
  1. large-v3-turbo: the default I'd pick. OpenAI's model card lists it at 809M parameters, with accuracy close to full large-v3 on standard test sets. The file is roughly half the size of the full large-v3. On an 8GB Air with a normal workload, it leaves meaningful headroom for everything else. This is what MetaWhisp ships, and it is why the "just use large-v3" advice you see on Hacker News does not apply to 8GB Air owners.
  2. medium: the underrated choice. If you have an older M1 Air and you do not transcribe anything noisy, medium is genuinely good. Published WER is similar to large-v3-turbo on clean audio and a bit worse on noisy audio. File size ~1.42GB leaves more headroom than the turbo on the same machine. The reason it gets overlooked is the name, but on an 8GB Air, medium often beats large.
  3. small: the safety net. If you transcribe short voice memos and want the laptop to feel like nothing happened, small at 466MB is hard to beat. Accuracy on clean English drops noticeably compared to the turbo, but it is still better than Apple's built-in Dictation, and it is the one model that will run while you have many other apps open.
Skip large-v3 on 8GB unless you are stress-testing. It will load. It will work. It will also swap the moment you open a second app, and the lag will make you think Whisper is "slow." It is not; the model is just too big for the machine.

| Model | Best for | On 8GB MacBook Air |
|---|---|---|
| tiny | Embedded use, quick tests | Fine, but accuracy is weak |
| base | Slightly better than tiny | Fine |
| small | Voice memos, low RAM headroom | Yes, solid fallback |
| medium | Quiet rooms, fewer apps open | Yes, underrated sweet spot |
| large-v3-turbo | Best balance for 8GB | Yes, default recommendation |
| large-v3 | Max accuracy, lots of RAM | Swap risk under real workload |

[Figure: Whisper model compatibility matrix for 8GB MacBook Air with OK and swap status flags]
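The recommendations above boil down to a short decision procedure, which I have sketched in Python below. The function name, its parameters, and the order in which the rules are checked are all my own illustration; nothing here is MetaWhisp's or WhisperKit's actual selection logic. The 16GB cutoff comes from the point that full large-v3 only becomes the obvious choice on bigger machines.

```python
def recommend_model(
    ram_gb: int,
    many_apps_open: bool = False,
    noisy_audio: bool = True,
    want_extra_headroom: bool = False,
) -> str:
    """Map the article's advice onto a model name.

    Hypothetical helper: the rule ordering is an assumption made for
    illustration, not a published algorithm.
    """
    # 16GB and up: full large-v3 runs comfortably.
    if ram_gb >= 16:
        return "large-v3"
    # Lots of other apps competing for RAM: fall back to the safety net.
    if many_apps_open:
        return "small"
    # Clean audio and a wish for more headroom: medium is the sweet spot.
    if want_extra_headroom and not noisy_audio:
        return "medium"
    # Default for an 8GB Air.
    return "large-v3-turbo"


if __name__ == "__main__":
    print(recommend_model(ram_gb=8))
    print(recommend_model(ram_gb=8, many_apps_open=True))
    print(recommend_model(ram_gb=8, noisy_audio=False, want_extra_headroom=True))
    print(recommend_model(ram_gb=16))
```

I default `noisy_audio` to `True` on purpose: if you are not sure how clean your recordings are, the turbo is the safer pick, since medium is the one that loses ground on noisy audio.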

My real-world test on an 8GB M1 MacBook Air

I use an 8GB M1 Air as my daily driver, partly because it is the cheapest Air Apple sells, partly because I wanted to feel the pain 8GB owners feel. As an informal anecdote, not a benchmark: I dictated the same short English passage into MetaWhisp a few times, switching models between runs, and watched Activity Monitor while it ran.

I also tried large-v3 once, with everything else closed. It worked. The moment I opened Mail, swap pressure spiked and the next dictation took several times real-time to complete. That was the moment I decided MetaWhisp's default should be large-v3-turbo, not the full large-v3.

These are personal observations from one machine and one audio sample. I write them up more carefully in my Whisper large-v3-turbo deep dive.

[Figure: Activity Monitor-style readout of Whisper model RAM and swap usage on 8GB M1 MacBook Air]

How MetaWhisp picks a Whisper model for 8GB Airs

MetaWhisp ships WhisperKit with Whisper large-v3-turbo as the default. The reasoning is exactly the tradeoff above: on a 16GB or 24GB machine, large-v3 is the obvious choice, but we assume a user might be on an 8GB Air, the tightest common configuration, and on that machine large-v3-turbo gives near-large accuracy with the RAM headroom macOS needs to stay snappy. (MetaWhisp collects zero telemetry, so this is a deliberate design choice for the hardest case, not a measurement of what our users run.)

The local mode is free and unlimited: no account, no time caps, no telemetry, and audio never leaves your Mac. AI polish and translation work free on-device with your own OpenAI or Cerebras API key; if you'd rather not manage a key, or want cloud transcription, that is the Pro tier at $30/year or $7.77/month. Pro caps daily cloud minutes because cloud STT costs us real money, but the local model has no such cap. You can dictate for hours on end on the turbo and the only thing that runs out is your battery.

If you have a beefier machine and want the absolute best accuracy, you can switch to large-v3 in MetaWhisp's settings, but the on-device transcription defaults are tuned for the lowest common denominator, which is an 8GB Air.

Heads up: MetaWhisp does not have an iOS app yet. The whole stack is macOS-only as of June 2026. iOS is planned for later this year — subscribe to updates from the download page if you want to know when it ships.

How to install large-v3-turbo on your 8GB Air

If you want to use large-v3-turbo outside of MetaWhisp, the cleanest path is WhisperKit:
  1. Clone argmaxinc/WhisperKit and build from source (or use the Swift Package Manager integration in your own app).
  2. Download the large-v3-turbo CoreML model from the WhisperKit releases page — the download is roughly 950 MB compressed, which unpacks to the ~1.5 GB on-disk size listed in the table above.
  3. Run a sample transcription against a local audio file. Confirm Activity Monitor shows the process holding within available headroom.
  4. If it swaps, close Safari tabs and try again. If it still swaps, drop to medium and accept the small accuracy hit.
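For steps 3 and 4, you can watch Activity Monitor, or you can compare swap usage before and after a transcription with a few lines of Python. This sketch shells out to `sysctl vm.swapusage`, which on the macOS versions I have used prints a line like `vm.swapusage: total = 2048.00M  used = 1123.25M  free = 924.75M  (encrypted)`. The parser assumes that format, and the 256 MB threshold in `swap_grew` is an arbitrary value I picked, so treat both as assumptions.

```python
from __future__ import annotations

import re
import subprocess

# Matches fields such as "used = 1123.25M" in the sysctl output.
_SWAP_FIELD = re.compile(r"(total|used|free) = ([\d.]+)([KMG])")
_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_swapusage(line: str) -> dict[str, float]:
    """Turn a `sysctl vm.swapusage` line into {'total', 'used', 'free'} in MB."""
    return {
        key: float(value) * _TO_MB[unit]
        for key, value, unit in _SWAP_FIELD.findall(line)
    }


def read_swapusage() -> dict[str, float]:
    """Read current swap usage (macOS only)."""
    result = subprocess.run(
        ["sysctl", "vm.swapusage"], capture_output=True, text=True, check=True
    )
    return parse_swapusage(result.stdout)


def swap_grew(before: dict[str, float], after: dict[str, float],
              threshold_mb: float = 256.0) -> bool:
    """True if used swap rose by more than threshold_mb (threshold is assumed)."""
    return after["used"] - before["used"] > threshold_mb


if __name__ == "__main__":
    before = read_swapusage()
    input("Run your sample transcription, then press Enter... ")
    after = read_swapusage()
    if swap_grew(before, after):
        print("Swap grew noticeably: close some apps or drop to medium.")
    else:
        print("Swap stayed roughly flat: the model fits your current workload.")
```

Run it with your usual apps open, not with everything closed, since the whole question is whether the model fits alongside your normal workload.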
For most people the easier path is installing MetaWhisp, which fetches the same WhisperKit model the first time you launch it and configures the global hotkey for you. No command line, no Xcode, no Swift package graph to wrestle with. Free, unlimited local mode, audio never leaves your Mac.

If you want to compare models head-to-head without re-recording audio, MetaWhisp's processing modes let you switch between local and cloud, and between plain transcription and AI-polished output, on the same recording.

[Figure: MetaWhisp workflow diagram showing WhisperKit local path versus cloud polish path on a Mac]

FAQ: Whisper models on an 8GB MacBook Air

Can an 8GB MacBook Air run Whisper at all?

Yes. Whisper runs on any Apple Silicon Mac, including the 8GB Air. The question is which model — large-v3-turbo, medium, and small all run cleanly; full large-v3 will cause swap under normal workloads. The whole point of MetaWhisp's on-device transcription is to make this work without configuration.

Which Whisper model is best for an 8GB MacBook Air?

For most people, large-v3-turbo. It matches large-v3's accuracy closely while leaving enough RAM headroom for the rest of macOS. Medium is a close second if you want even more headroom; small is the safety net if you have lots of other apps open.

Is large-v3 too big for 8GB RAM?

It is not too big to load — it is too big to run comfortably while you do anything else. The published model size is ~2.83GB and the working set grows well beyond that, which leaves little room for macOS. Expect swap if Mail or Safari is open, and expect transcription latency to spike.

Does Whisper work offline on a MacBook Air?

Yes. Once the model is downloaded, Whisper runs entirely on-device through Apple's Neural Engine. No internet connection is needed for transcription. Offline voice-to-text on a MacBook is exactly what WhisperKit was designed for.

How much RAM does Whisper large-v3-turbo need?

Plan for somewhat more than the ~1.5GB model file once you add CoreML graphs and audio buffers. That is why it fits on an 8GB Air while full large-v3 does not.

Is Whisper large-v3-turbo better than large-v3?

It is better on an 8GB machine because it actually fits. On a 24GB+ machine, full large-v3 is slightly more accurate on published test sets, but the difference on clean English audio is small in practice.

What about distil-large-v3?

It is a distilled large-v3 from Hugging Face's Distil-Whisper project, with roughly the same footprint as large-v3-turbo and similar accuracy. WhisperKit supports it. It is a fine choice, but most users will not notice a meaningful difference versus the OpenAI turbo.

Will upgrading to 16GB unified memory fix everything?

It fixes the swap problem and lets you run full large-v3 comfortably. It does not change accuracy dramatically — large-v3-turbo and large-v3 are close on clean audio. If your machine is already 8GB, picking the right model is a better fix than buying new hardware.

About the author

I'm Andrew — I built MetaWhisp because I needed a voice-to-text app that did not send my dictation to someone else's server. I have ADHD, so I write a lot by talking, and I got tired of paying monthly for tools that resold my audio. MetaWhisp runs Whisper locally by default, with the model that fits the machine you actually have. The free local tier never expires.
