Blog | Comparisons | AI · February 10, 2026

The 5 Best AI Cost Tracking Tools (February 2026)

If you are running AI in production, you already know that costs can spiral fast. A single model change, a batch job gone wrong, or one power user can double your monthly bill overnight. The fix is not spending less on AI. It is knowing exactly where every dollar goes, in real time, so you can make informed decisions.

The problem is that most teams are still tracking costs in spreadsheets, guessing at per-user margins, or waiting for end-of-month invoices to figure out what happened. A new category of tools has emerged to solve this. But they are not all built the same way, and the differences matter more than most teams realize.

Here are the five best AI cost tracking tools in 2026, what each one actually does with your cost data, and how to pick the right one for your stack.

What Is AI Cost Tracking?

AI cost tracking is the process of measuring, attributing, and reporting the real-time costs of LLM API calls across providers, models, and users so teams can optimize spend and price their products accurately.

How We Evaluated

We compared each tool on seven dimensions: integration method, model coverage, cost attribution, caching, spend controls, whether the tool connects to billing, and overall fit for different team profiles.

| Tool | Integration | Models Tracked | Free Tier | Caching | Open Source | Spend Controls | Margin Analysis | Built-in Billing | Best For |
|---|---|---|---|---|---|---|---|---|---|
| Lava | Proxy | 600+ | Unlimited (free) | No | No | Yes | Yes | Yes | Track costs and bill for them |
| Helicone | Proxy + Async | 300+ | 10K req/mo | Yes | Yes | No | No | No | Observability + caching |
| LangSmith | SDK wrapping | Major only | 5K traces/mo | No | No | No | No | No | Agent trace debugging |
| Portkey | Proxy | 2,300+ | 10K logs/mo | Yes | Partial | Enterprise only | No | No | Enterprise governance |
| Keywords AI | Proxy + Async | 250+ | 2K logs/mo | No | No | No | No | No | Quick unified API |

Lava

Lava tracks AI costs the same way the other proxy-based tools on this list do: you route requests through the Lava Gateway, and every request is logged with provider, model, token counts (input, output, and cache tokens separately), and cost. You get a cost-by-provider breakdown over time, a cost-by-model table with cost per 1K tokens, and per-customer cost attribution through wallet associations.
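To make the accounting concrete, here is a minimal sketch of what a cost-tracking gateway does per request: price each call from its token counts, then roll costs up by the wallet that funded it. The model name, prices, and field names below are illustrative, not Lava's actual schema or rates.

```python
# Illustrative per-1K-token prices; a real gateway pulls these from a live registry.
PRICE_PER_1K = {"gpt-x": (0.005, 0.015)}  # (input, output) USD for a hypothetical model

def request_cost(model, input_tokens, output_tokens):
    """Price one request from its token counts."""
    in_price, out_price = PRICE_PER_1K[model]
    return (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price

def attribute(requests):
    """Roll per-request costs up by the wallet that funded each request."""
    totals = {}
    for req in requests:
        cost = request_cost(req["model"], req["input_tokens"], req["output_tokens"])
        totals[req["wallet"]] = totals.get(req["wallet"], 0.0) + cost
    return totals
```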

Where Lava differs from pure observability tools is what happens next. The cost data feeds directly into Lava Monetize, which turns those costs into revenue. Merchants see a margin dashboard that shows revenue minus AI costs with a percentage indicator. No other tool on this list gives you that view, because no other tool knows both what you are spending and what you are earning.

Lava also gives you internal spend controls through spend keys: API keys with per-key spending limits and model restrictions for your agents and services, enforced at the gateway layer. If you are running multiple internal services that call AI providers, spend keys let you cap what each one can spend.
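The enforcement logic behind spend keys can be sketched in a few lines: each key carries a dollar cap and an allow-list of models, and the gateway rejects any request that would breach either. The structure below is inferred from the article's description, not Lava's actual API.

```python
# Hypothetical spend-key registry: per-key dollar limits plus model restrictions,
# checked at the gateway layer before a request reaches any provider.
SPEND_KEYS = {
    "sk-support-bot": {"limit_usd": 50.0, "spent_usd": 0.0, "models": {"gpt-x-mini"}},
}

def authorize(key, model, estimated_cost):
    policy = SPEND_KEYS.get(key)
    if policy is None or model not in policy["models"]:
        return False  # unknown key or disallowed model
    if policy["spent_usd"] + estimated_cost > policy["limit_usd"]:
        return False  # would exceed this key's spending cap
    policy["spent_usd"] += estimated_cost
    return True
```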

The margin view no one else has

Helicone, Portkey, and LangSmith show you what you are spending on AI. Lava shows you what you are spending AND what you are earning from it. That margin view, revenue minus AI costs, is what you actually need to run a profitable AI product. It only exists when cost tracking and billing live in the same system.
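The margin math itself is simple arithmetic; what is rare is having both inputs in one system. A sketch of the calculation behind such a dashboard:

```python
# Gross margin on an AI product: revenue minus AI cost, plus the percentage
# indicator a margin dashboard would display. Pure arithmetic, no vendor API.
def gross_margin(revenue_usd, ai_cost_usd):
    gross = revenue_usd - ai_cost_usd
    pct = (gross / revenue_usd * 100.0) if revenue_usd else 0.0
    return gross, pct
```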

For cost attribution, Lava automatically associates every request with the customer whose wallet funded it. You do not need to pass user ID headers or metadata tags. The attribution happens because the request is authenticated against a specific wallet.

What Lava does not do: There is no response caching (Helicone and Portkey both offer this). It is not open source or self-hostable. And the cost tracking dashboards are practical but not as deep as Helicone's observability suite or LangSmith's trace-level debugging. If your primary need is optimizing prompt costs or debugging agent behavior, those tools give you more specialized depth.

Pricing: The gateway and all cost tracking are free. No per-request fees. Lava charges a service fee only when you use Monetize to bill your end users.

Helicone

Helicone is an open-source LLM observability platform (YC W23) that provides the deepest cost monitoring experience on this list. The primary integration is proxy-based: swap your base URL to ai-gateway.helicone.ai, add an auth header, and Helicone starts logging every request with latency, cost, and token counts. They also offer an async logging path where you fire-and-forget log events via their SDK, which keeps the proxy off your critical path.

The distinction between proxy and async matters for cost accuracy. Through the proxy, Helicone has full visibility into the request and response, and calculates costs precisely using their Model Registry. Through async logging, costs are best-effort estimates based on token counts and model detection.

Helicone maintains pricing for 300+ models in an open-source community repository on GitHub. This is transparent and auditable, but it means pricing updates for new models can lag until someone contributes an update.

For attribution, Helicone uses Custom Properties via headers. You pass Helicone-User-Id or Helicone-Property-[Name] to tag requests with user IDs, feature names, plan tiers, or any other dimension. This gives you flexible, multi-dimensional cost slicing that you configure yourself.
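In practice this tagging is just extra HTTP headers on each request. A small helper that builds them, following the `Helicone-Auth`, `Helicone-User-Id`, and `Helicone-Property-[Name]` conventions the article names (treat exact header semantics as something to verify against Helicone's docs):

```python
# Build Helicone attribution headers for a proxied request.
def helicone_headers(helicone_key, user_id=None, properties=None):
    headers = {"Helicone-Auth": f"Bearer {helicone_key}"}
    if user_id:
        headers["Helicone-User-Id"] = user_id
    for name, value in (properties or {}).items():
        headers[f"Helicone-Property-{name}"] = value  # e.g. feature, plan tier
    return headers
```

You would pass these alongside your normal provider auth when calling through the gateway, then filter dashboards by any property.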

Where Helicone genuinely shines is caching. Its Redis-based caching layer supports exact match and semantic caching at the gateway. Cache hits cost $0 because no request reaches the provider. Teams report cache hit rates of 70%+ on repetitive queries, which translates to real dollar savings that show up directly in the dashboards. No other tool on this list offers this, and for workloads with repetitive prompts it is a significant advantage.

$0 per cached response: Helicone's caching layer can eliminate redundant API calls entirely.
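To see why a cache hit costs exactly $0, here is a minimal exact-match cache sketch. Helicone does this (plus semantic matching) server-side at the gateway; this local version only illustrates the accounting.

```python
# Exact-match response cache: on a hit, the provider is never called,
# so the marginal cost of the response is zero.
def cached_call(cache, prompt, call_provider, cost_per_call):
    if prompt in cache:
        return cache[prompt], 0.0        # hit: no provider request, no cost
    response = call_provider(prompt)
    cache[prompt] = response
    return response, cost_per_call       # miss: full provider cost
```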

Where it falls short: Helicone tracks costs for your internal visibility. It does not help you bill your customers, enforce spending limits in dollar terms, or show you margins. If you need to pass AI costs through to end users, you need a separate billing system.

Pricing: Free tier with 10K requests/month. Growth plan is usage-based. Open source (Apache 2.0) and self-hostable.

LangSmith

LangSmith is LangChain's observability and evaluation platform. It is fundamentally different from the gateway-based tools because it collects data via SDK instrumentation, not by proxying requests. You wrap your LLM calls with their tracing SDK using wrap_openai, wrap_anthropic, or the @traceable decorator. For LangChain users, tracing is auto-instrumented. For everyone else, you manually wrap each call.

This means LangSmith does not sit in the request path. It collects telemetry asynchronously via a callback handler. Your application performance is not impacted, but every LLM call you do not wrap is invisible to LangSmith.

The strength of LangSmith is something no other tool on this list can do: trace-level cost breakdown for multi-step agents. Because it captures the full execution tree of an agent (every LLM call, tool call, retrieval step), you can see exactly how much each step costs. "My agent spent $0.12 on the planning step, $0.03 on tool calls, and $0.45 on the final synthesis." That level of granularity is genuinely useful for optimizing agent architectures and prompt chains.
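Conceptually, a trace is a tree of runs, and the per-step view is just a cost rollup over that tree. A sketch using an illustrative trace shape (not LangSmith's actual schema):

```python
# Sum costs by step name across a trace tree to reproduce the per-step
# breakdown quoted above (planning vs. tool calls vs. synthesis).
def step_costs(run, totals=None):
    totals = {} if totals is None else totals
    totals[run["name"]] = totals.get(run["name"], 0.0) + run.get("cost", 0.0)
    for child in run.get("children", []):
        step_costs(child, totals)
    return totals
```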

Cost calculation uses a model pricing map that ships with prices for OpenAI, Anthropic, and Gemini models. For anything else, you manually configure custom pricing by specifying regex match patterns and per-token prices. When providers change their rates, you update the map yourself.
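A custom pricing map of this kind amounts to regex patterns mapped to per-token prices. A sketch with made-up model names and rates (the mechanism mirrors what the article describes; the exact configuration lives in LangSmith's settings, not code like this):

```python
import re

# Regex-matched custom pricing: (pattern, $/input token, $/output token).
# All names and prices here are hypothetical.
CUSTOM_PRICING = [
    (re.compile(r"^my-provider/large-.*"), 0.000003, 0.000015),
]

def run_cost(model, input_tokens, output_tokens):
    for pattern, in_price, out_price in CUSTOM_PRICING:
        if pattern.match(model):
            return input_tokens * in_price + output_tokens * out_price
    raise KeyError(f"no pricing rule for {model}")
```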

Where LangSmith wins

If you are building agents with multi-step execution (tool calls, chain-of-thought, retrieval augmented generation), LangSmith's trace-level cost breakdown is unmatched. No proxy-based tool can show you which step of your agent pipeline is eating your budget.

Where it falls short: Costs are organized by project, trace, and run, not by end user. Per-customer attribution requires manual metadata tagging. Users report accuracy issues with AWS Bedrock models, reasoning tokens, and agent tool calls. There are no spend controls, no budget enforcement, and no billing capabilities. And the SDK wrapping requirement means any call you miss goes unmetered.

Pricing: Free plan with 5K traces/month and 1 seat. Plus plan at $39/seat/month with extra traces at $0.50 per 1K. Extended retention (400-day) at $5 per 1K traces.

Portkey

Portkey has the most comprehensive model pricing database of any tool in this comparison: 2,300+ models across 40+ providers, maintained in a centralized open-source JSON repository. The pricing data tracks multiple billing dimensions including input tokens, output tokens, cached reads, thinking tokens, audio, images, and web search. If you are using niche models or specialized providers, Portkey is most likely to have accurate pricing out of the box.

Like Helicone and Lava, Portkey is proxy-based. You swap your base URL and route requests through their gateway. Every request is logged with full cost and token breakdowns. For attribution, you pass metadata key-value pairs ("_user": "user_123", "team": "engineering", "feature": "chat") and filter analytics dashboards by any metadata dimension.

Portkey's differentiator from other observability tools is real spend enforcement. On the enterprise tier, you can set USD budget limits on virtual keys. When the limit is hit, the key expires and all requests are blocked. Budget Policies let you define per-user or per-team spend limits with periodic resets ("each user can spend $50/month"). When exceeded, Portkey returns 412 Precondition Failed and blocks the request.
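The behavior of a budget policy can be sketched as a small state machine: track spend per user, block with 412 once the limit would be exceeded, and clear the counters on each periodic reset. This is an illustration of the semantics the article describes, not Portkey's implementation.

```python
# Per-user USD budget with periodic resets; exceeding the limit blocks the
# request with HTTP 412 Precondition Failed, per the article's description.
class BudgetPolicy:
    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = {}

    def check(self, user, request_cost):
        spent = self.spent.get(user, 0.0)
        if spent + request_cost > self.limit:
            return 412                      # blocked: budget exceeded
        self.spent[user] = spent + request_cost
        return 200                          # allowed: spend recorded

    def reset_period(self):
        self.spent.clear()                  # e.g. at the start of each month
```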

This makes Portkey the strongest tool on this list for internal cost governance across large organizations. If you have multiple teams, multiple products, and need to prevent any single team from blowing the budget, Portkey's budget policies handle that.

Where it falls short: Budget enforcement requires enterprise pricing. The business plan ($99/month) gives you observability but not hard spend caps. No billing, no customer-facing cost visibility, no margin analysis.

Pricing: Free tier with limited features. Pro plan at $9 per 100K additional logs. Business at $99/month. Enterprise (required for budget enforcement) is custom.

Keywords AI

Keywords AI (YC W24, now rebranded as Respan) is a unified LLM API with built-in monitoring. Like the other proxy tools, you swap your base URL and get access to 250+ models through a single endpoint with automatic fallbacks and load balancing.

Cost tracking comes standard with every request. You tag requests with a customer_identifier parameter to attribute costs to individual end users. Analytics dashboards show per-user cost, token usage, and activity. The platform also includes a prompt playground, dataset collection, and model routing.
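Attribution here is parameter-based rather than header-based: you include the identifier in the request body. A minimal payload sketch (the surrounding payload shape is an assumption; only the `customer_identifier` parameter comes from the article):

```python
# Tag a request with Keywords AI's customer_identifier so costs can be
# attributed to an individual end user in the analytics dashboards.
def build_payload(model, messages, customer_id):
    return {
        "model": model,
        "messages": messages,
        "customer_identifier": customer_id,  # drives per-user cost reporting
    }
```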

Good fit for early-stage teams

Keywords AI is a practical choice for startups that want a unified LLM API with basic cost visibility included. The monitoring is a feature of the API product, not the primary product itself.

Where it falls short: The product has been through multiple pivots and the rebrand to Respan makes the current trajectory harder to assess. Documentation is less comprehensive than Helicone or Portkey. No evidence of budget enforcement or spend caps. No billing capabilities for end users.

Pricing: Pricing has changed with the rebrand. Previously offered a free tier with 2,000 logs/month and paid plans starting at $9/seat/month.

How to Choose

The right tool depends on what you need to do with your cost data.

If you need to understand and optimize your AI costs, the pure observability tools are strong. Helicone gives you the best combination of cost dashboards, caching (which actively reduces costs, not just tracks them), and open-source flexibility. Portkey has the widest model coverage and the strongest enterprise governance. LangSmith is the only option for trace-level agent cost debugging.

If you need to control internal spending, the field narrows. Portkey offers hard budget enforcement on the enterprise tier with per-team and per-key dollar caps. Lava gives you spend keys with per-key limits and model restrictions, available on every plan. Helicone has rate limiting but not dollar-denominated caps.

If you need to bill your customers for AI usage, Lava is the only tool where cost tracking connects to billing. The margin dashboard (revenue minus AI costs) only exists when tracking and billing live in the same system. Every other tool on this list gives you half the picture: what you are spending. Lava gives you the other half: what you are earning. For AI platforms that resell AI capabilities, that is the number that actually matters.

If caching would meaningfully reduce your costs, look at Helicone first. If you have repetitive prompts (customer support, FAQ generation, standardized analysis), caching can cut costs dramatically. No other tool offers this.

These tools are not always mutually exclusive. Some teams run Lava for billing and spend controls, and layer Helicone on top for caching and deeper observability. The key is understanding which problem you are solving first. For more on the hidden costs of running AI in production, we wrote a full breakdown of what most teams miss.

How Lava Helps

Most cost tracking tools answer one question: "How much am I spending on AI?" That is useful, but it is half the picture. If you are building a product that charges users for AI, the question that matters is: "Am I making money?"

Lava Gateway gives you the spending side for free. Route requests to 600+ models across 30+ AI providers, and every request is logged with provider, model, token breakdown, and cost. No per-request fees. Spend keys let you set per-key budgets and model restrictions for your internal agents and services.

Lava Monetize gives you the revenue side. Your end users fund prepaid wallets, you set your markup, and Lava handles metering, deductions, and checkout. The margin dashboard shows revenue minus AI costs, so you always know whether your AI product is profitable, per customer and in aggregate.

Payments is historically difficult, detail-oriented work. It is not just about building it once. It is about maintaining it forever: reconciling ledgers, handling payment failures, managing refunds, staying PCI compliant, and adapting to new edge cases every month. That ongoing burden is best left to a company that specializes in payments, not bolted onto your engineering team's backlog. You build the product. Lava handles the money.


Ready to simplify your AI billing?

Lava handles metering, billing, and payouts so you can focus on building your AI product.