Which GPU is best for AI on a small PC?

Question

I’m putting together a small form factor PC mainly for AI stuff (local LLMs + some Stable Diffusion), and I’m stuck on which GPU makes the most sense in a tight case. I have room for a 2-slot card and I’m trying to keep power/heat under control (SFX PSU, limited airflow). VRAM seems like the biggest limiter, but I’m not sure how much performance I’d be giving up if I prioritize VRAM over raw GPU speed. Budget is roughly $600–$900, and I’d prefer something that doesn’t sound like a jet engine. Given these constraints, which GPU would you recommend for AI on a small PC, and why?

qexrgjskjo · Accepted Answer

Oh man, I feel you. SFF + AI is basically “VRAM Tetris” with heat as the timer lol. For your situation, I’d prioritize VRAM + efficiency over peak speed. For local LLMs and SD, running out of VRAM is the hard stop long before raw shader speed matters.

Here’s what I’d do (years of trying to make quiet small boxes not melt…):
- **Best all-around pick:** NVIDIA GeForce RTX 4070 Ti SUPER 16GB. 16GB is the sweet spot for comfy SD and decent-sized LLM quant runs, and it’s usually way easier to undervolt than people think. True 2-slot models aren’t always available, but if you can find one it’s kinda amazing.
- **Quieter / easier SFF pick:** NVIDIA GeForce RTX 4070 SUPER 12GB. Less VRAM, yeah, but it’s efficient and tends to behave in tight cases. If you mostly do 7B–13B quants and SD at sane resolutions, it’s fine.
- **If you can snag a deal used:** NVIDIA GeForce RTX 3090 24GB. VRAM monster, but tbh it’s a heat/power diva in SFF. I’d only recommend it if you’re ok with undervolting + accepting more fan noise.

Performance hit vs VRAM? Honestly, a slower GPU with enough VRAM to keep the whole model on the card usually feels faster in real use than a faster GPU that has to offload part of it to system RAM. Undervolt + set a power limit and you’ll mostly dodge the jet engine problem. Good luck, cheers.
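
P.S. on the power limit part: here’s a rough Python sketch of how I’d work out the number. It only builds the `nvidia-smi` command and prints it, it doesn’t run anything. The helper name `power_limit_command`, the 285 W stock figure and the 80% fraction are just my assumptions for illustration; check your own card’s allowed range first, since the driver rejects values outside it and changing the limit needs admin rights. Note this is a power cap, not an undervolt (that’s done with a voltage/frequency curve tool).

```python
import shlex


def power_limit_command(gpu_index: int, stock_watts: int, fraction: float = 0.8) -> list[str]:
    """Build (but do not run) an nvidia-smi command that caps board power.

    stock_watts and fraction are illustrative assumptions, not recommendations.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    target_watts = int(round(stock_watts * fraction))
    # -i selects the GPU, -pl sets the power limit in watts.
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(target_watts)]


# Example: cap an assumed 285 W card at 80% of stock.
cmd = power_limit_command(gpu_index=0, stock_watts=285, fraction=0.8)
print(shlex.join(cmd))
```

Start conservative, run your actual SD/LLM workload, and only go lower if temps and noise are still annoying.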

rgvlxrdfok · Answer

+1 to reply #2. BIG warning: don’t chase a “faster core” and end up VRAM-starved; you’ll hit OOM and everything tanks. In SFF, also avoid triple-fan monsters + high-TDP cards: heat/noise spikes hard, and you’ll throttle anyway.

yqzqnvplhh · Answer

Pro tip: before you buy, run your exact LLM/SD targets through a VRAM estimator + benchmark lists. For SD/LLMs in SFF, it’s basically VRAM > peak speed.
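
If you want a quick back-of-envelope version of that estimate, something like this Python sketch works. It’s a rule of thumb, not what any particular calculator does: weights at the quantized bit width, plus an fp16 KV cache, plus a flat overhead. The function name, the 1 GB overhead and the 13B-class model shape (40 layers, 40 KV heads, head dim 128) are my assumptions for illustration; plug in the numbers from your model’s config.

```python
def estimate_llm_vram_gb(
    params_billion: float,
    bits_per_weight: float,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    kv_bytes_per_value: int = 2,   # assumption: fp16 KV cache
    overhead_gb: float = 1.0,      # assumption: runtime buffers, CUDA context, etc.
) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes) for running an LLM fully on the GPU."""
    weights_bytes = params_billion * 1e9 * bits_per_weight / 8
    # KV cache: one K and one V tensor per layer,
    # each of size context_tokens x n_kv_heads x head_dim.
    kv_cache_bytes = (
        2 * n_layers * n_kv_heads * head_dim * context_tokens * kv_bytes_per_value
    )
    return (weights_bytes + kv_cache_bytes) / 1e9 + overhead_gb


# Example: a 13B-class model at ~4.5 bits per weight with a 4096-token context.
need = estimate_llm_vram_gb(
    params_billion=13,
    bits_per_weight=4.5,
    n_layers=40,
    n_kv_heads=40,
    head_dim=128,
    context_tokens=4096,
)
print(f"{need:.1f} GB")
```

With those inputs it lands a bit under 12 GB, which is why a 13B quant with a full context is already tight on a 12GB card.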

Option A: NVIDIA GeForce RTX 4060 Ti 16GB — coolest/quietest-ish in 2-slot, 16GB is actually usable for 7B/13B + SD, but perf per $ is… meh.

Option B: NVIDIA GeForce RTX 4070 SUPER 12GB — way faster, but 12GB is the hard wall (bigger SD models / higher res / bigger context). Also many cards are chunky.

Option C: used NVIDIA GeForce RTX 3090 24GB — VRAM king, but heat/noise in SFF is rough.
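
To make the A/B/C tradeoff concrete, here’s a tiny Python sketch that shortlists those three cards against a VRAM need and a power budget. The VRAM sizes are the ones above; the board power numbers are approximate reference figures as I remember them (partner cards differ, so check the TechPowerUp entry for the exact model), and the 1 GB headroom is just an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    vram_gb: int
    board_power_w: int  # approximate reference figure (assumption, verify per model)


CARDS = [
    Card("RTX 4060 Ti 16GB", 16, 165),
    Card("RTX 4070 SUPER 12GB", 12, 220),
    Card("RTX 3090 24GB", 24, 350),
]


def shortlist(cards, needed_vram_gb, max_board_power_w, headroom_gb=1.0):
    """Return cards that fit the VRAM need (plus headroom) and the power budget,
    sorted from lowest to highest board power."""
    fits = [
        c for c in cards
        if c.vram_gb >= needed_vram_gb + headroom_gb
        and c.board_power_w <= max_board_power_w
    ]
    return sorted(fits, key=lambda c: c.board_power_w)


# Example: ~11.7 GB workload, 250 W ceiling for a small case.
for card in shortlist(CARDS, needed_vram_gb=11.7, max_board_power_w=250):
    print(card.name)
```

Raise the power ceiling and the 3090 comes back into play; drop the VRAM need to around 8 GB and the 4070 SUPER does too.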

Resources: check Hugging Face model cards for VRAM notes, and use the TechPowerUp GPU database + a local LLM VRAM calculator (search for “LLM VRAM calculator”) to sanity-check. gl!