Gemini 2.5 Flash-Lite API Pricing

Gemini 2.5 Flash-Lite costs $0.10 per 1M input tokens and $0.40 per 1M output tokens, with a 1M-token context window. It is the cheapest Gemini 2.5 model for high-volume workloads. Compare it with Gemini 2.0 Flash-Lite and other budget LLM APIs.

Gemini 2.5 Flash-Lite is Google's budget model, best suited to high-volume tasks where the lowest cost matters most. It costs $0.10 per 1M input tokens and $0.40 per 1M output tokens, with a 1M-token context window. At typical usage (10K input tokens and 2K output tokens per call, 1,000 calls per day), that works out to about $1.80 per day, or approximately $54 per 30-day month. With prompt caching enabled at a 90% discount, cached input drops to $0.01 per 1M tokens, a significant saving for applications with repeated system prompts.

For lower costs, Gemini 2.0 Flash-Lite (Google) offers input at $0.075/1M. For higher capability, DeepSeek V4 Flash (DeepSeek) costs $0.14/1M input. At this price point, Gemini 2.5 Flash-Lite is ideal for high-volume production workloads such as classification, extraction, summarization, and chatbots, where cost per query matters more than peak intelligence.

On a per-request basis, sending 1,000 input tokens to Gemini 2.5 Flash-Lite costs $0.0001, and generating 1,000 output tokens costs $0.0004. A typical chatbot exchange (500 tokens in, 300 tokens out) runs about $0.00017 per message. Cost scales linearly with volume: a model that costs twice as much per token costs twice as much at any volume, so small per-request differences become large absolute sums at scale.

All pricing shown here is sourced from Google's official pricing page and verified regularly. LLM providers may change pricing without notice, so always confirm current rates on the provider's website before making purchasing decisions. The cost calculator on this page lets you estimate monthly spending based on your actual token usage and call volume.

Google applies standard rate limits to Gemini 2.5 Flash-Lite API keys. Check the provider dashboard for your current tier and request higher limits if needed. To get started, create an API key in Google AI Studio, install Google's SDK (for example, the @google/genai package from npm or the google-genai package from PyPI), and make your first API call with a small prompt to verify connectivity and measure actual latency. Most providers offer a free tier or credits for new accounts; use these to benchmark Gemini 2.5 Flash-Lite against your specific workload before committing to a paid plan.
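
The arithmetic behind these figures is easy to reproduce. The following is a minimal sketch, assuming the prices quoted above and a 30-day month; the function names are illustrative and not part of any Google SDK.

```python
# Prices in USD per 1M tokens, taken from the figures quoted on this page.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single request."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


def monthly_cost(input_tokens: int, output_tokens: int,
                 calls_per_day: int, days: int = 30) -> float:
    """Return the estimated monthly cost in USD (30-day month assumed)."""
    return cost_per_call(input_tokens, output_tokens) * calls_per_day * days


if __name__ == "__main__":
    # Chatbot exchange: 500 tokens in, 300 tokens out.
    print(f"Per message: ${cost_per_call(500, 300):.5f}")                  # $0.00017
    # Typical usage: 10K in, 2K out, 1,000 calls per day.
    print(f"Per month:   ${monthly_cost(10_000, 2_000, 1_000):.2f}")       # $54.00
```

Swapping in your own token counts and call volume gives a first estimate before you measure real usage.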

FAQ

  • How much does Gemini 2.5 Flash-Lite cost per 1M tokens?

    Gemini 2.5 Flash-Lite costs $0.10 per 1M input tokens and $0.40 per 1M output tokens. Cached input tokens are available at $0.01 per 1M, a 90% discount.

  • Is Gemini 2.5 Flash-Lite cheap or expensive?

    Gemini 2.5 Flash-Lite is one of the more affordable LLM APIs at $0.10/1M input tokens. It competes with other budget models for high-volume workloads.

  • What is the context window of Gemini 2.5 Flash-Lite?

    Gemini 2.5 Flash-Lite supports a context window of 1M tokens. This determines how much text you can send in a single API call — including system prompts, conversation history, and the actual query.
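
A rough way to check whether a request fits in the context window is sketched below. It assumes about 4 characters per token, which is only a rule of thumb for English text and not the model's real tokenizer; the helper names are hypothetical, and the provider's own token counter should be used for exact numbers.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # the "1M tokens" figure quoted above
CHARS_PER_TOKEN = 4                # ASSUMPTION: rough rule of thumb, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate using ceiling division on the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def trim_history(system_prompt: str, history: list[str], query: str,
                 limit: int = CONTEXT_WINDOW_TOKENS) -> list[str]:
    """Drop the oldest history messages until the whole request fits the limit."""
    fixed = estimate_tokens(system_prompt) + estimate_tokens(query)
    if fixed > limit:
        raise ValueError("system prompt and query alone exceed the limit")
    kept = list(history)  # copy so the caller's list is not modified
    while kept and fixed + sum(estimate_tokens(m) for m in kept) > limit:
        kept.pop(0)  # discard the oldest message first
    return kept
```

Dropping the oldest messages first is only one possible policy; summarizing old turns is another option.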

  • Does Gemini 2.5 Flash-Lite support prompt caching?

    Yes. Google offers cached input at $0.01/1M tokens, a 90% discount off the base input price. This helps with repeated system prompts and few-shot examples.

  • How can I reduce Gemini 2.5 Flash-Lite API costs?

    Three strategies: (1) Enable prompt caching, which cuts the price of repeated input by up to 90%. (2) Route simple queries to cheaper models. (3) Reduce output tokens by asking for concise answers, since output tokens cost four times as much as input tokens.
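
The effect of strategies (1) and (3) can be compared numerically. This is a minimal sketch using the prices quoted on this page; the 8,000 cached tokens and the 1,000-token shorter reply are hypothetical workload numbers, and any cache storage fees are ignored because they are not covered here.

```python
INPUT_PRICE = 0.10         # USD per 1M input tokens
CACHED_INPUT_PRICE = 0.01  # USD per 1M cached input tokens (90% discount)
OUTPUT_PRICE = 0.40        # USD per 1M output tokens


def call_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Cost in USD of one call; cached_tokens is the part of the input billed at the cached rate."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * INPUT_PRICE
             + cached_tokens * CACHED_INPUT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


if __name__ == "__main__":
    baseline = call_cost(10_000, 2_000)
    with_cache = call_cost(10_000, 2_000, cached_tokens=8_000)  # hypothetical cache hit
    shorter_output = call_cost(10_000, 1_000)                   # hypothetical concise reply
    print(f"baseline:       ${baseline:.5f}")        # $0.00180
    print(f"with caching:   ${with_cache:.5f}")      # $0.00108
    print(f"shorter output: ${shorter_output:.5f}")  # $0.00140
```

Which strategy saves more depends on how much of your input repeats and how long your responses are.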

  • How much does one Gemini 2.5 Flash-Lite API call cost?

    A typical request with 500 input tokens and 300 output tokens costs approximately $0.00017 ($0.00005 for input plus $0.00012 for output). The exact cost depends on your prompt length and response length. Use the cost calculator above to estimate for your specific usage pattern.