What's the Cheapest LLM for Coding?

Finding the cheapest LLM for coding means balancing price against coding ability. Here are the most affordable coding-capable models, ranked from cheapest by a weighted cost metric (60% input price, 40% output price, both in dollars per million tokens):

Gemini Experimental 1206 (Google): $0.00/M input, $0.00/M output. Coding ELO: 1280. Speed: 100 tok/s.

Gemini 2.0 Flash Thinking (Google): $0.00/M input, $0.00/M output. Coding ELO: 1270. Speed: 100 tok/s.

Gemma 2 9B (Google): $0.030/M input, $0.090/M output. Coding ELO: 1155. Speed: 100 tok/s.

Amazon Nova Micro (Amazon): $0.035/M input, $0.140/M output. Coding ELO: 1100. Speed: 100 tok/s.

Command R7B (Cohere): $0.038/M input, $0.150/M output. Coding ELO: 1100. Speed: 100 tok/s.
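
The ranking above can be reproduced in a few lines of Python. This is a minimal sketch that assumes the metric is a simple linear blend of the two listed prices (0.6 × input + 0.4 × output); the exact formula is not spelled out here, and the names `PRICES_PER_MILLION` and `weighted_cost` are illustrative only.

```python
# Prices in USD per million tokens, copied from the list above:
# (input price, output price)
PRICES_PER_MILLION = {
    "Gemini Experimental 1206": (0.00, 0.00),
    "Gemini 2.0 Flash Thinking": (0.00, 0.00),
    "Gemma 2 9B": (0.030, 0.090),
    "Amazon Nova Micro": (0.035, 0.140),
    "Command R7B": (0.038, 0.150),
}


def weighted_cost(input_price, output_price, input_weight=0.6):
    """Blend input and output prices (assumed: a plain linear weighting)."""
    return input_weight * input_price + (1 - input_weight) * output_price


# sorted() is stable, so models with equal cost keep their listed order.
ranked = sorted(
    PRICES_PER_MILLION,
    key=lambda name: weighted_cost(*PRICES_PER_MILLION[name]),
)

for name in ranked:
    cost = weighted_cost(*PRICES_PER_MILLION[name])
    print(f"{name}: ${cost:.4f}/M weighted")
```

Under that assumption, Gemma 2 9B works out to about $0.054/M, Amazon Nova Micro to about $0.077/M, and Command R7B to about $0.083/M, which matches the order shown. If your workload produces far more output than input (long generated files from short prompts, for example), pass a lower `input_weight` and the ranking may shift.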

For simple code generation and autocomplete, smaller models like GPT-4.1 Nano or Gemini 2.0 Flash Lite are extremely affordable. For complex multi-file refactoring and architecture decisions, investing in Claude Sonnet 4 or GPT-4.1 pays off in fewer iterations and better results.

Cost-saving tips for coding workloads: Use prompt caching for system prompts that carry your coding instructions, so the repeated prefix is not billed at the full rate on every request. Send non-urgent code reviews through batch APIs. Start with a cheaper model and escalate to a premium model only when the task turns out to need it.
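
The "start cheap, escalate when needed" tip can be sketched as a small routing loop. This is an illustration under stated assumptions, not a real provider integration: `call_model` and `is_acceptable` are hypothetical callables you would supply yourself (for example, a wrapper around your provider's SDK and a check that runs your test suite), and the model names in the usage example are plain labels, not API identifiers.

```python
from typing import Callable, Sequence, Tuple


def generate_with_escalation(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    is_acceptable: Callable[[str], bool],
) -> Tuple[str, str]:
    """Try models in the given order (cheapest first).

    Returns (model, output) for the first output that passes
    is_acceptable. If none passes, returns the last model's output.
    """
    if not models:
        raise ValueError("models must contain at least one entry")

    output = ""
    for model in models:
        output = call_model(model, prompt)  # hypothetical provider call
        if is_acceptable(output):
            return model, output
    return models[-1], output


# Usage with stand-in functions (no network calls are made).
def fake_call_model(model: str, prompt: str) -> str:
    # Pretend only the premium model returns usable code.
    return "def add(a, b): return a + b" if model == "premium" else ""


def looks_like_code(output: str) -> bool:
    # Placeholder check; in practice, compile or run tests here.
    return output.strip().startswith("def ")


model_used, code = generate_with_escalation(
    "Write an add function.",
    ["cheap", "premium"],
    fake_call_model,
    looks_like_code,
)
print(model_used)
```

The savings depend on how often the cheap model's output is accepted and how cheap the acceptance check is. If most requests escalate anyway, you pay for both calls, so it is worth measuring the acceptance rate before adopting this pattern.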
