What is Costrace?
Costrace is an LLM observability SDK that automatically tracks cost, token usage, and latency for every API call you make to OpenAI, Anthropic, or Google Gemini. No code changes are required: just call init() and use your LLM SDKs as normal.
Key Features
Zero Configuration
One line to initialize. Works by monkey-patching your existing LLM client libraries.
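To illustrate the monkey-patching idea in general terms, the sketch below wraps a client method so every call is observed without the caller changing anything. The class and method names here (FakeClient, create) are stand-ins for demonstration, not Costrace's real internals.

```python
import functools

# Stand-in for an LLM client library's class; not a real SDK.
class FakeClient:
    def create(self, prompt):
        return {"text": "hello", "usage": {"total_tokens": 7}}

observed_tokens = []

def patch(cls, method_name):
    """Replace a method on the class with a wrapper that records usage."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        # Observe token usage from the response, then pass it through.
        observed_tokens.append(result["usage"]["total_tokens"])
        return result

    setattr(cls, method_name, wrapper)

patch(FakeClient, "create")
response = FakeClient().create("hi")  # caller code is unchanged
```

Because the patch replaces the method on the class itself, every instance created afterward is tracked automatically, which is what lets a single init() call cover all subsequent SDK usage.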
Real-time Cost Tracking
See exactly how much each API call costs, broken down by model and provider.
Latency Monitoring
Track response times to identify slow endpoints and optimize performance.
Multi-Provider Support
OpenAI, Anthropic, and Gemini all tracked automatically.
How It Works
Costrace works by patching the client libraries for OpenAI, Anthropic, and Gemini. When you make an API call, Costrace:
- Records the start time
- Lets your call proceed normally
- Captures token usage and response time
- Calculates the cost based on current pricing
- Sends a trace to the Costrace backend (fire-and-forget, non-blocking)
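The steps above can be sketched as a wrapper around any call, with a background thread draining a queue so the send never blocks the caller. Everything here (the queue, the placeholder price, the fake call) is an illustrative assumption, not Costrace's actual implementation.

```python
import queue
import threading
import time

trace_queue = queue.Queue()

def send_traces():
    """Background sender: in a real SDK this would POST to the backend."""
    while True:
        trace_queue.get()   # here we simply drain the queue
        trace_queue.task_done()

threading.Thread(target=send_traces, daemon=True).start()

def traced_call(fn, *args, **kwargs):
    start = time.monotonic()                    # 1. record the start time
    result = fn(*args, **kwargs)                # 2. call proceeds normally
    latency = time.monotonic() - start          # 3. capture response time
    tokens = result["usage"]["total_tokens"]    # 3. capture token usage
    cost = tokens * 0.15 / 1_000_000            # 4. placeholder pricing
    trace_queue.put(                            # 5. fire-and-forget enqueue
        {"latency_s": latency, "tokens": tokens, "cost_usd": cost}
    )
    return result

# A fake LLM call standing in for a patched client method.
result = traced_call(lambda prompt: {"usage": {"total_tokens": 12}}, "hi")
```

Using a daemon thread plus a queue keeps the hot path to a single non-blocking put(), so a slow or unreachable backend cannot add latency to your LLM calls.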
Supported Providers
OpenAI
- GPT-5 family (5.2, 5, 5-mini, 5-nano)
- GPT-4 family (4o, 4o-mini, 4.1, 4-turbo, 4)
- GPT-3.5-turbo
- o3, o4-mini
Anthropic
- Claude Opus (4-6, 4-1, 4)
- Claude Sonnet (4-6, 4-5, 4, 3-7, 3.5)
- Claude Haiku (4-5, 3)
Google Gemini
- Gemini 2.0 (flash, flash-lite)
- Gemini 1.5 (pro, flash, flash-8b)