Will it run it?

Pick your GPU or type your VRAM, and instantly see which open AI models your machine can actually run — how fast they'll generate, the best quantization, and a one-click Hugging Face download for each.

Physics engine: VRAM · KV-cache · bandwidth

Your hardware

Select GPU

— OR —

VRAM / usable memory (GB)

Memory bandwidth drives the tokens/sec estimate. It auto-fills when you pick a GPU.
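As a rough illustration of why bandwidth matters: during generation, each new token typically requires reading roughly all of the model's weights from memory once, so bandwidth divided by model size gives an upper bound on speed. The sketch below is not this tool's actual formula; the function name and the `efficiency` factor are assumptions made for illustration.

```python
def estimate_tokens_per_sec(bandwidth_gb_s: float,
                            model_size_gb: float,
                            efficiency: float = 0.6) -> float:
    """Bandwidth-bound estimate of decode speed.

    bandwidth_gb_s: memory bandwidth in GB/s.
    model_size_gb:  size of the quantized weights in GB.
    efficiency:     assumed fraction of peak bandwidth actually achieved
                    (a guess, not a measured value).
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    # One full pass over the weights per generated token.
    return efficiency * bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Hypothetical numbers: 1000 GB/s of bandwidth, 5 GB of weights.
    print(round(estimate_tokens_per_sec(1000.0, 5.0), 1))
```

Under this model, halving the model size (for example, with a smaller quantization) roughly doubles the estimated speed.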

System RAM for offload (GB, optional)

Lets models spill into system memory, so bigger models can run, but more slowly. Unified-memory chips pay no penalty.
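One way to picture the offload penalty: the time per token is the time to read the part of the model held in VRAM plus the time to read the part spilled into system RAM, each at its own bandwidth. The sketch below assumes that simple additive model; the function name, parameters, and the treatment of unified memory as a single pool at full bandwidth are illustrative assumptions, not the tool's documented behavior.

```python
from typing import Optional


def offload_tokens_per_sec(model_gb: float, vram_gb: float, ram_gb: float,
                           gpu_bw_gb_s: float, ram_bw_gb_s: float,
                           unified: bool = False) -> Optional[float]:
    """Estimate decode speed when weights may spill into system RAM.

    Returns None when the model does not fit in VRAM plus RAM.
    """
    if model_gb > vram_gb + ram_gb:
        return None
    if unified:
        # Assumption: unified memory is one pool, read at full bandwidth.
        return gpu_bw_gb_s / model_gb
    on_gpu = min(model_gb, vram_gb)   # portion that stays in VRAM
    spilled = model_gb - on_gpu       # portion offloaded to system RAM
    seconds_per_token = on_gpu / gpu_bw_gb_s + spilled / ram_bw_gb_s
    return 1.0 / seconds_per_token


if __name__ == "__main__":
    # Hypothetical: 20 GB model, 10 GB VRAM at 1000 GB/s, 32 GB RAM at 50 GB/s.
    print(offload_tokens_per_sec(20.0, 10.0, 32.0, 1000.0, 50.0))
```

With these made-up numbers the spilled half dominates the time per token, which is why offloaded models run but run slowly.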

Test any model

Hugging Face repo or model name

Paste any repo — we'll read its size (and config when available) and estimate the fit on your hardware.
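A fit estimate of this kind can be sketched as weights plus KV-cache plus some runtime overhead, compared against available memory. The example below is a minimal sketch, not the tool's actual method: it assumes a config dictionary using the field names commonly found in Hugging Face `config.json` files (`num_hidden_layers`, `num_attention_heads`, `num_key_value_heads`, `hidden_size`, `head_dim`), a 16-bit KV-cache, and a hypothetical fixed `overhead_gb`.

```python
def kv_cache_gb(config: dict, context_tokens: int,
                bytes_per_value: int = 2) -> float:
    """KV-cache size in GB (1 GB = 1e9 bytes) for one sequence."""
    layers = config["num_hidden_layers"]
    heads = config["num_attention_heads"]
    # Grouped-query models store fewer KV heads than attention heads.
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = config.get("head_dim") or config["hidden_size"] // heads
    # Factor 2: one key tensor and one value tensor per layer.
    total = 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value
    return total / 1e9


def estimate_fit(weights_gb: float, config: dict, context_tokens: int,
                 vram_gb: float, overhead_gb: float = 1.0):
    """Return (fits, needed_gb). overhead_gb is an assumed allowance."""
    needed = weights_gb + kv_cache_gb(config, context_tokens) + overhead_gb
    return needed <= vram_gb, needed


if __name__ == "__main__":
    # Illustrative config values, not read from any real repo.
    cfg = {"num_hidden_layers": 32, "num_attention_heads": 32,
           "num_key_value_heads": 8, "hidden_size": 4096}
    print(estimate_fit(5.0, cfg, 8192, 8.0))
```

When no config is available, only the weights term is known, so an estimate would have to fall back on repo size alone.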

Filters

Minimum context

Minimum speed (tokens/sec)

Required capabilities

Results

Select a GPU or enter your VRAM to begin.