The Lens

By Erik Loyd, SaaS CEO and former COO/CFO of an AWS Premier Partner.

Updated Jun 2026

mlx-lm runs and fine-tunes large language models directly on a Mac. Point it at a model on Hugging Face and one command pulls it down and runs it locally, using Apple's own MLX engine instead of a cloud API or a separate GPU rig. MIT licensed, free, and built by Apple's own ml-explore team, the same group behind MLX itself.

It does more than run models. You can quantize them down to 4-bit, fine-tune with LoRA or full-model training, serve with streaming and prompt caching, and even split work across multiple machines. Setup is close to trivial: pip install mlx-lm, then a single command chats with a model. The real constraint is memory. MLX uses the Mac's unified memory, so the model has to roughly fit in RAM, and pushing past that needs macOS 15 or newer plus some system tuning. And it's Apple Silicon only. No M-series chip, no mlx-lm.

The honest framing on competition: this is a building block, not a finished app. llama.cpp is the closest peer and runs on more hardware; Ollama and LM Studio are more packaged and app-like, and increasingly use MLX under the hood anyway; vLLM is for datacenter GPUs, a different world. mlx-lm's edge is being the MLX-native option, which means the best raw performance on a Mac and the cleanest fine-tuning story. Solo developers and researchers on Apple Silicon: this is the fast path. Small teams can build on it; larger production serving will want something server-side.

The catch is that you're trading convenience and reach for Mac-native speed. It's lower-level than Ollama, locked to Apple hardware, and capped by how much RAM you bought. Within those lines, nothing runs models on a Mac better.

Explore Further

GitHub Repository

Source code, issues, README

Reddit Discussions

Community opinions and use cases

Hacker News

HN threads and discussions

Dev.to Articles

Tutorials and write-ups

Tutorials & Guides

Getting started resources

Official Website

Docs, blog, and more

mlx-lm

The Lens

Free vs Self-Hosted vs Paid

Trust Signals

License: MIT License

About

Also by ml-explore

Explore Further

More tools in the directory

QwenPaw

phoenix

lance