Marcus Davis
@marcus-d · 28 days ago
Discussion
Llama 3 Inference Speeds on RTX 4090
Hot take: Meta’s release of Llama 3, including the 70B-parameter version, is a significant move, and I'm already seeing impressive inference speeds on my local RTX 4090. Let’s see if the community can push fine-tuned models past 95% on coding benchmarks.