What's the Cheapest LLM for Coding?

Finding the cheapest LLM for coding means balancing price against coding ability. The models below are the most affordable coding-capable options, ranked by a blended price that weights the input rate at 60% and the output rate at 40%. Prices are in USD per million tokens (/M):


  • Gemini Experimental 1206 (Google): $0.00/M input, $0.00/M output. Coding ELO: 1280. Speed: 100 tok/s.
  • Gemini 2.0 Flash Thinking (Google): $0.00/M input, $0.00/M output. Coding ELO: 1270. Speed: 100 tok/s.
  • Gemma 2 9B (Google): $0.030/M input, $0.090/M output. Coding ELO: 1155. Speed: 100 tok/s.
  • Amazon Nova Micro (Amazon): $0.035/M input, $0.140/M output. Coding ELO: 1100. Speed: 100 tok/s.
  • Command R7B (Cohere): $0.038/M input, $0.150/M output. Coding ELO: 1100. Speed: 100 tok/s.
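
The ranking above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the weighted cost metric is a plain weighted sum, 0.6 × input price + 0.4 × output price; the `ModelPrice` class and `blended_cost` function are illustrative names, not part of any provider's SDK, and the prices are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


def blended_cost(price: ModelPrice,
                 input_weight: float = 0.6,
                 output_weight: float = 0.4) -> float:
    """Weighted sum of input and output price (assumed form of the metric)."""
    return input_weight * price.input_per_m + output_weight * price.output_per_m


# Prices taken from the list above.
MODELS = [
    ModelPrice("Gemini Experimental 1206", 0.00, 0.00),
    ModelPrice("Gemini 2.0 Flash Thinking", 0.00, 0.00),
    ModelPrice("Gemma 2 9B", 0.030, 0.090),
    ModelPrice("Amazon Nova Micro", 0.035, 0.140),
    ModelPrice("Command R7B", 0.038, 0.150),
]

# sorted() is stable, so models with equal cost keep their listed order.
ranked = sorted(MODELS, key=blended_cost)

for model in ranked:
    print(f"{model.name}: ${blended_cost(model):.4f}/M blended")
```

If your workload is output-heavy (for example, generating whole files from short prompts), pass different weights to `blended_cost`; the order of the paid models may change.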

For simple code generation and autocomplete, smaller models such as GPT-4.1 Nano or Gemini 2.0 Flash Lite are extremely affordable. For complex multi-file refactoring and architecture decisions, investing in Claude Sonnet 4 or GPT-4.1 pays off in fewer iterations and better results.


Cost-saving tips for coding workloads:

  • Use prompt caching for system prompts with coding instructions.
  • Batch non-urgent code reviews through batch APIs.
  • Start with a cheaper model and only escalate to premium models for complex tasks.
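
The tip about starting with a cheaper model and escalating only when needed can be sketched as a small routing function. This is an illustration under stated assumptions, not a provider API: `generate_with_escalation`, `compiles`, and the two stub functions are hypothetical, the stubs stand in for real model calls, and "the draft parses as Python" is only one possible acceptance check (running a test suite would be a stricter one).

```python
from typing import Callable, Tuple


def compiles(source: str) -> bool:
    """Acceptance check: does the generated text parse as Python?"""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def generate_with_escalation(
    prompt: str,
    cheap: Callable[[str], str],
    premium: Callable[[str], str],
    accept: Callable[[str], bool] = compiles,
) -> Tuple[str, str]:
    """Try the cheap model first; call the premium model only on rejection."""
    draft = cheap(prompt)
    if accept(draft):
        return "cheap", draft
    return "premium", premium(prompt)


# Hypothetical stand-ins for real model calls.
def cheap_stub(prompt: str) -> str:
    if "refactor" in prompt:
        return "def add(a, b) return a + b"  # missing colon: fails the check
    return "def add(a, b):\n    return a + b\n"


def premium_stub(prompt: str) -> str:
    return "def add(a: int, b: int) -> int:\n    return a + b\n"


for task in ("write an add function", "refactor the add function"):
    tier, code = generate_with_escalation(task, cheap_stub, premium_stub)
    print(f"{task!r} handled by the {tier} model")
```

With this pattern the premium model is billed only for requests the cheap model fails, so the saving depends on how often the acceptance check passes on the first try.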
