How to transcribe audio using OpenAI's Whisper locally on your Apple Silicon Mac

As an investigator, you probably don't want to send your source audio across the internet to organizations you don't control, even if they tout private endpoints and the like.

The great thing about modern Apple Silicon Macs is that you can use Apple's MLX framework to take advantage of their unified memory architecture, and so transcribe very quickly on your own machine.

And you can use the uv package manager to make running your script super simple.

Here's the Python script you'll want. Let's save it as whisper.py somewhere obvious, perhaps in the same directory as the file(s) you want to transcribe.

import mlx_whisper
import sys
import os.path

if len(sys.argv) < 2:
    print("Usage: python whisper.py <audio-file>")
    sys.exit(1)

filename = sys.argv[1]

if not os.path.exists(filename):
    print(f"Error: File '{filename}' does not exist")
    sys.exit(1)

result = mlx_whisper.transcribe(
    filename,
    path_or_hf_repo='mlx-community/whisper-large-v3-turbo',
    language='en',
    word_timestamps=False
)

transcription = result["text"]

# Write the transcription next to the audio file, swapping the extension for .txt
output_file = filename.rsplit('.', 1)[0] + '.txt'

with open(output_file, 'w', encoding='utf-8') as f:
    f.write(transcription)
print(f"Transcription saved to {output_file}")

You can then run it by entering the following into your Terminal.

uv run --with mlx_whisper whisper.py filename.m4a
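If you have a whole folder of recordings, a simple shell loop will transcribe them one after another. This is a sketch that assumes your files use the .m4a extension and that the whisper.py script above lives in the same directory:

```shell
# Transcribe every .m4a file in the current directory in turn.
# Assumes whisper.py (from above) is saved in this directory.
for f in *.m4a; do
  uv run --with mlx_whisper whisper.py "$f"
done
```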

And you'll get a text file filename.txt with the transcription. (You can change parameters such as language in the script, and timestamps are available too: the result dictionary also contains a "segments" list with start and end times for each chunk of speech.)
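If you do want timestamps, here's a sketch of a small helper that formats Whisper-style segments (each a dict with "start", "end" and "text" keys, as found in result["segments"]) into readable lines. The function name and the output format are my own choices, not part of mlx_whisper:

```python
def format_segments(segments):
    """Format Whisper-style segments as '[HH:MM:SS -> HH:MM:SS] text' lines.

    Hypothetical helper: assumes each segment is a dict with float
    "start"/"end" times in seconds and a "text" string, as returned
    in result["segments"] by Whisper-style transcribers.
    """
    def hms(seconds):
        # Convert seconds to a zero-padded HH:MM:SS string.
        s = int(seconds)
        return f"{s // 3600:02d}:{(s % 3600) // 60:02d}:{s % 60:02d}"

    lines = []
    for seg in segments:
        lines.append(f"[{hms(seg['start'])} -> {hms(seg['end'])}] {seg['text'].strip()}")
    return "\n".join(lines)
```

In the script above you could then write format_segments(result["segments"]) to the output file instead of the plain transcription text.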
