Fireworks AI

Llama 3.1 405B (Fireworks)

Specs, pricing, and benchmark data for Llama 3.1 405B as hosted by Fireworks AI. Last verified 2026-04-20.

JSON Mode, Functions, Streaming

Pricing

Input / 1M tokens: $3.00

Output / 1M tokens: $3.00
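
As a rough illustration of how the listed rates translate into per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, and the prices are simply the figures listed above; actual billing may round or meter tokens differently.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 3.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost from token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0360
```

Because input and output are priced identically here, cost depends only on the total token count.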

Context & Output

Context Window: 131,072 tokens

Max Output: 4,096 tokens

TTFT: 150 ms

Speed: 100 tok/s
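
The limits and speed figures above can be combined into a quick feasibility and latency check. This is a sketch under two stated assumptions: that prompt and output tokens share the context window, and that decoding proceeds at a constant rate after the first token. The helper names are hypothetical, and real latency varies with load and prompt length.

```python
# Figures listed on this page.
CONTEXT_WINDOW = 131_072   # tokens
MAX_OUTPUT = 4_096         # tokens
TTFT_S = 0.150             # time to first token, seconds
SPEED_TOK_PER_S = 100.0    # decode speed, tokens per second


def fits_in_context(prompt_tokens: int, output_tokens: int) -> bool:
    """Check a request against both limits.

    Assumes prompt and output share the context window.
    """
    return (
        output_tokens <= MAX_OUTPUT
        and prompt_tokens + output_tokens <= CONTEXT_WINDOW
    )


def estimate_latency_s(output_tokens: int) -> float:
    """Estimate total response time, assuming a constant decode speed."""
    return TTFT_S + output_tokens / SPEED_TOK_PER_S


if __name__ == "__main__":
    print(fits_in_context(126_976, 4_096))        # True: exactly 131,072 tokens
    print(fits_in_context(127_000, 4_096))        # False: exceeds the window
    print(f"{estimate_latency_s(4_096):.2f} s")   # prints 41.11 s
```

Under these assumptions, a maximum-length response takes about 41 seconds, almost all of it spent decoding rather than waiting for the first token.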

Benchmarks

Arena ELO: 1240

Coding ELO: 1200

Reasoning ELO: 1250

HumanEval: 89.5

MMLU: 85.9

MATH: 112

GPQA: 45

Price History (Input $/M tokens)

Strengths
  • Largest model in the Llama 3.1 family
  • High quality
  • Large context (131,072 tokens)
Limitations
  • Slow inference
  • Expensive for throughput

Best For

General Purpose, Reasoning
