Llama 3 Inference Speeds on an RTX 4090

Hot take: Meta's release of Llama 3, including the 70B-parameter version, is a significant move, and I'm already seeing impressive inference speeds on my local RTX 4090. Let's see if the community can push fine-tuned performance past 95% on coding benchmarks.

