What is Costrace?

Costrace is an LLM observability SDK that automatically tracks cost, token usage, and latency for every API call you make to OpenAI, Anthropic, or Google Gemini. No code changes are required: call init() once and use your LLM SDKs as normal.

Key Features

Zero Configuration

One line to initialize. Works by monkey-patching your existing LLM client libraries.

Real-time Cost Tracking

See exactly how much each API call costs, broken down by model and provider.
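The per-call breakdown comes down to simple arithmetic: token counts multiplied by per-token prices for the model used. A minimal sketch of that calculation is below; the pricing table, model names, and function name are illustrative assumptions, not Costrace's actual pricing data or API.

```python
# Illustrative per-call cost computation. Prices here are placeholder
# assumptions, keyed by (provider, model), in USD per 1M tokens.
PRICING_USD_PER_MILLION = {
    ("openai", "gpt-4o-mini"): (0.15, 0.60),        # (input, output)
    ("anthropic", "claude-3-5-haiku"): (0.80, 4.00),
}

def call_cost(provider: str, model: str,
              input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD for a single API call."""
    price_in, price_out = PRICING_USD_PER_MILLION[(provider, model)]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# 1,000 input tokens and 500 output tokens on the toy gpt-4o-mini entry:
cost = call_cost("openai", "gpt-4o-mini", 1_000, 500)
```

With the placeholder prices above, that call works out to $0.00045, which is why per-call costs are typically reported in fractions of a cent.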

Latency Monitoring

Track response times to identify slow endpoints and optimize performance.

Multi-Provider Support

OpenAI, Anthropic, and Gemini all tracked automatically.

How It Works

Costrace works by patching the client libraries for OpenAI, Anthropic, and Gemini. When you make an API call, Costrace:
  1. Records the start time
  2. Lets your call proceed normally
  3. Captures token usage and response time
  4. Calculates the cost based on current pricing
  5. Sends a trace to the Costrace backend (fire-and-forget, non-blocking)
All of this happens transparently! Your code doesn’t change.
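The patching flow above can be sketched in a few lines. This is an illustrative toy, not Costrace's actual internals: `FakeClient` stands in for a real provider SDK, and `patch`/`TRACES` are hypothetical names for the wrapper and trace sink.

```python
import functools
import time

class FakeClient:
    """Stand-in for an LLM SDK client; a real call would hit the provider."""
    def complete(self, prompt):
        return {"text": "ok", "usage": {"input_tokens": 3, "output_tokens": 1}}

TRACES = []  # the real SDK would send these to a backend without blocking

def patch(cls):
    original = cls.complete

    @functools.wraps(original)
    def traced(self, prompt):
        start = time.perf_counter()            # step 1: record start time
        response = original(self, prompt)      # step 2: call proceeds normally
        latency = time.perf_counter() - start  # step 3: capture response time
        TRACES.append({                        # steps 4-5: cost lookup and the
            "usage": response["usage"],        # fire-and-forget backend send
            "latency_s": latency,              # would happen here
        })
        return response

    cls.complete = traced

patch(FakeClient)
resp = FakeClient().complete("Hello")  # caller code is unchanged
```

Because the wrapper preserves the original method's signature and return value, callers never see the instrumentation; that is what makes the approach zero-configuration.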

Supported Providers

  • OpenAI: GPT-5 family (5.2, 5, 5-mini, 5-nano); GPT-4 family (4o, 4o-mini, 4.1, 4-turbo, 4); GPT-3.5-turbo; o3, o4-mini
  • Anthropic: Claude Opus (4-6, 4-1, 4); Claude Sonnet (4-6, 4-5, 4, 3-7, 3.5); Claude Haiku (4-5, 3)
  • Google: Gemini 2.0 (flash, flash-lite); Gemini 1.5 (pro, flash, flash-8b)

Next Steps