
Which GPU provides the best performance for local AI model training?

3 Posts · 4 Users · 0 Reactions · 126 Views
0
Topic starter

I've been staring at benchmarks for three hours and my head is spinning lol. I really need to get this local training rig built by next week for a freelance project here in California, but the VRAM situation is driving me nuts. I'm torn between grabbing a used RTX 3090 off eBay for the 24GB of VRAM or getting a new 4080 Super for the clock speeds, but 16GB feels like such a trap for actual model training. My budget is capped at $1200, so a 4090 is basically out of reach unless I find a miracle. Is the older 3090 actually the move here just for the extra memory, or am I gonna regret the slower speeds compared to the 40 series?


3 Answers
12

Saw this today and honestly the 16GB on the NVIDIA GeForce RTX 4080 Super is a trap for training. You want the 384-bit memory bus on the NVIDIA GeForce RTX 3090 24GB GDDR6X, because memory throughput matters way more than clock speed when you're moving weights around every step.

  • 3090 handles larger batch sizes easily
  • 4080 Super lacks headroom for LLM fine-tuning, though
  • 3090 has 936 GB/s of memory bandwidth, much better for data-heavy tasks
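If you want to sanity-check the bandwidth point, here's a rough back-of-envelope in Python. The bandwidth figures are the published specs (936 GB/s for the 3090, ~736 GB/s for the 4080 Super), and treating a training step as purely bandwidth-bound is a simplification, but it shows the scale of the gap:

```python
# Back-of-envelope: time to stream a model's weights once,
# assuming the step is memory-bandwidth bound (a simplification).

def stream_time_ms(model_gb: float, bandwidth_gbs: float) -> float:
    """Milliseconds to read model_gb of weights at bandwidth_gbs GB/s."""
    return model_gb / bandwidth_gbs * 1000

model_gb = 14.0  # e.g. a 7B-parameter model in fp16 (~2 bytes/param)

for name, bw in [("RTX 3090 (936 GB/s)", 936.0),
                 ("RTX 4080 Super (~736 GB/s)", 736.0)]:
    print(f"{name}: {stream_time_ms(model_gb, bw):.2f} ms per full weight pass")
```

Per-pass the difference looks small, but it compounds over thousands of steps, and real training reads a lot more than just the weights.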


11

I've been running local models for a while now and honestly, I'm extremely satisfied with sticking to the 24GB cards. When you're training, you aren't just looking at how fast the cores are; you're looking at what actually fits in the memory buffer. If your model or your batch size doesn't fit in VRAM, you're basically dead in the water, or stuck with CPU offloading, which is painfully slow.

I'd definitely go for an NVIDIA GeForce RTX 3090 24GB GDDR6X over any 16GB card right now. Even though the NVIDIA GeForce RTX 4080 Super 16GB GDDR6X has the newer architecture and better power efficiency, that 8GB difference is massive. It's basically the difference between running a 13B parameter model with a decent context window and crashing every time you try to start a training run. The memory bandwidth on the 3090 is also nearly 1 TB/s, which keeps data moving smoothly during heavy tensor operations. I've had no complaints with my training times.

Plus, if you find a good deal on something like an ASUS TUF Gaming GeForce RTX 3090 24GB, you can probably save enough of that $1200 budget to upgrade your power supply. Those 3090s are super power hungry, so make sure you have a solid unit like the EVGA SuperNOVA 1000 G6 1000W 80 Plus Gold.

One quick tip: if you buy used, check the thermal pads, because they can get pretty gross on older 30-series cards. Replacing them is an easy way to keep your VRAM temps from throttling while you're pushing the card for hours.
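To put numbers on the "does it fit" question, here's a rough VRAM estimate for a *full* fine-tune with Adam in mixed precision. The ~16 bytes/parameter figure is a common rule of thumb (fp16 weights + fp16 grads + fp32 master weights + two fp32 Adam moments), not a guarantee, and it ignores activations entirely:

```python
# Rough rule of thumb for full fine-tuning with Adam in mixed precision:
#   fp16 weights (2 B) + fp16 grads (2 B) + fp32 master copy (4 B)
#   + Adam moments (4 B + 4 B) = ~16 bytes per parameter,
# before activations, which grow with batch size and sequence length.

def training_vram_gb(params_billions: float, bytes_per_param: float = 16.0) -> float:
    """Approximate GB needed for weights + gradients + optimizer state."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for size in (7, 13):
    print(f"{size}B model: ~{training_vram_gb(size):.0f} GB for weights + optimizer")
```

A 13B full fine-tune blows way past 24GB by this math, which is why people reach for LoRA/QLoRA or quantization even on a 3090. The extra 8GB over a 16GB card still buys you a lot of headroom for adapters, batch size, and context.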





1

Honestly, 16GB is gonna bottleneck you fast. I would suggest going with a used NVIDIA GeForce RTX 3090 24GB GDDR6X for the extra VRAM. Just be careful buying used, though.

  • check seller ratings closely
  • ask about previous mining use
  • look for an EVGA GeForce RTX 3090 FTW3 Ultra 24GB for reliability

VRAM is king here, so the risk is probably worth it.


PCTalkTalk.COM is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Amazon.com. As an Amazon Associate, I earn from qualifying purchases.
