Complete reference for the CostLens SDK - Save 20-40% on AI costs with smart routing.
```bash
npm install costlens openai
```

```bash
# .env
COSTLENS_API_KEY=cl_your_api_key_here
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```

```typescript
import OpenAI from 'openai';
import CostLens from 'costlens';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, enableCache: true });
const tracked = costlens.wrapOpenAI(openai);
const res = await tracked.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
```

```bash
npm install costlens
```

The SDK also works without an API key, providing cost optimization and smart routing for development environments.

```typescript
import { CostLens } from 'costlens';
import OpenAI from 'openai';
const costlens = new CostLens();
const openai = new OpenAI({ apiKey: 'your-openai-key' });
const ai = costlens.wrapOpenAI(openai);
const response = await ai.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }]
});
// Check potential savings
const savings = await costlens.calculateSavings('gpt-4', [
{ role: 'user', content: 'What is 2+2?' }
]);
console.log(`Potential savings: ${savings.savingsPercentage}% with ${savings.recommendedModel}`);
```

```typescript
new CostLens(config?: CostLensConfig)
```

| Name | Type | Required | Description |
|---|---|---|---|
| `apiKey` | `string` | Yes | Your CostLens API key |
| `autoOptimize` | `boolean` | No | Automatic prompt optimization and compression (feature in development) |
| `smartRouting` | `boolean` | No | Route requests to the cheapest capable model (up to 20x savings) |
| `enableCache` | `boolean` | No | Cache responses to save on repeated requests |
| `costLimit` | `number` | No | Maximum cost per request, in dollars (prevents overruns) |
| `autoFallback` | `boolean` | No | Automatically retry or fall back on rate limits |
| `maxRetries` | `number` | No | Maximum retry attempts (default: `3`) |
| `baseUrl` | `string` | No | Custom base URL (default: `https://api.costlens.dev`) |
| `routingPolicy` | `function` | No | Custom routing decisions |
| `qualityValidator` | `function` | No | Custom quality scoring |
| `requestId` | `string` | No | Request tracking ID |
| `correlationId` | `string` | No | Correlation tracking ID |

```typescript
const costlens = new CostLens({
apiKey: 'cl_your_api_key_here',
autoOptimize: true,
smartRouting: true,
enableCache: true,
costLimit: 0.10,
autoFallback: true, // Auto-retry on failures
maxRetries: 3, // Retry up to 3 times
// New SDK Features
routingPolicy: (requestedModel, messages) => {
// Custom routing logic
if (messages.length > 10) return 'gpt-4o-mini';
return requestedModel;
},
qualityValidator: (responseText, messagesJson) => {
// Custom quality scoring (1-5)
return responseText.length > 100 ? 5 : 3;
},
requestId: 'req_' + Date.now(),
correlationId: 'session_abc123'
});
```
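With `costLimit` set, the SDK can stop a request whose estimated cost would exceed the limit. The exact error shape isn't documented here, so this sketch assumes a plain `Error` is thrown, and reuses the `costlens` instance above with an OpenAI client like the one from the quick-start:

```typescript
const guarded = costlens.wrapOpenAI(openai);

try {
  // A long prompt against a premium model may exceed costLimit: 0.10
  const res = await guarded.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Write a very long report...' }]
  });
  console.log(res.choices[0].message.content);
} catch (err) {
  // Assumption: cost-limit violations surface as thrown errors
  console.error('Request blocked or failed:', (err as Error).message);
}
```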
Wrap your provider clients so CostLens can route, cache, and track usage automatically.

```typescript
import OpenAI from 'openai';
import CostLens from 'costlens';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const tracked = costlens.wrapOpenAI(openai);
const res = await tracked.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
```

```typescript
import Anthropic from '@anthropic-ai/sdk';
import CostLens from 'costlens';
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY, smartRouting: true });
const trackedClaude = costlens.wrapAnthropic(anthropic);
const res = await trackedClaude.messages.create({
model: 'claude-3-haiku-20240307',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello' }]
});
```

When a call fails, record the failure for visibility and surface the error to the caller:

```typescript
try {
const start = Date.now();
const result = await tracked.chat.completions.create(params);
await costlens.trackOpenAI(params, result, Date.now() - start, 'prompt-42');
return result;
} catch (err) {
await costlens.trackError('openai', params.model as string, JSON.stringify(params.messages), err as Error, 0);
throw err; // surface to caller
}
```

Track an OpenAI API call.

```typescript
trackOpenAI(
params: OpenAI.Chat.ChatCompletionCreateParams,
result: OpenAI.Chat.ChatCompletion,
latency: number,
promptId?: string
): Promise<void>
```

- `params` - The parameters passed to OpenAI
- `result` - The response from OpenAI
- `latency` - Time taken in milliseconds
- `promptId` - Optional tag to group related prompts

```typescript
const start = Date.now();
const result = await openai.chat.completions.create(params);
await costlens.trackOpenAI(
params,
result,
Date.now() - start,
'my-prompt-v1' // optional
);
```

Track an Anthropic (Claude) API call.

```typescript
trackAnthropic(
params: Anthropic.MessageCreateParams,
result: Anthropic.Message,
latency: number,
promptId?: string
): Promise<void>
```

- `params` - The parameters passed to Anthropic
- `result` - The response from Anthropic
- `latency` - Time taken in milliseconds
- `promptId` - Optional tag to group related prompts

```typescript
const start = Date.now();
const result = await anthropic.messages.create(params);
await costlens.trackAnthropic(
params,
result,
Date.now() - start,
'my-prompt-v1' // optional
);
```

Track a failed API call.

```typescript
trackError(
provider: string,
model: string,
input: string,
error: Error,
latency: number
): Promise<void>
```

- `provider` - The provider (`openai`, `anthropic`)
- `model` - The model that was attempted
- `input` - The input that was sent
- `error` - The error object
- `latency` - Time taken before failure

```typescript
try {
const result = await openai.chat.completions.create(params);
await costlens.trackOpenAI(params, result, latency);
} catch (error) {
await costlens.trackError(
'openai',
params.model,
JSON.stringify(params.messages),
error,
latency
);
throw error;
}
```

Process multiple AI requests in a single call for 3-5x better performance.

```typescript
trackBatch(
calls: Array<{ provider: string; model: string; tokens: number; latency: number }>
): Promise<void>
```

- `calls` - Array of request data to process in batch; each entry contains:
  - `provider` - The AI provider (`openai`, `anthropic`, etc.)
  - `model` - The model used
  - `tokens` - Number of tokens used
  - `latency` - Request latency in milliseconds

```typescript
// Process multiple requests efficiently
const requests = [
{ provider: 'openai', model: 'gpt-4', tokens: 150, latency: 1200 },
{ provider: 'anthropic', model: 'claude-3', tokens: 200, latency: 1000 },
{ provider: 'openai', model: 'gpt-3.5-turbo', tokens: 100, latency: 800 }
];
// Single batch call - 3-5x faster than individual requests
await costlens.trackBatch(requests);
// Automatic queue management for optimal performance
// SDK automatically batches requests when possible
```
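If you collect usage records yourself (for example, from several workers), one pattern is to buffer them and flush on an interval. The queue and flush interval below are illustrative, not part of the SDK:

```typescript
type BatchCall = { provider: string; model: string; tokens: number; latency: number };

const queue: BatchCall[] = [];

// Workers push records as requests complete
function record(call: BatchCall) {
  queue.push(call);
}

// Flush up to 50 buffered records every 5 seconds
setInterval(async () => {
  if (queue.length === 0) return;
  await costlens.trackBatch(queue.splice(0, 50));
}, 5000);
```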
Get real-time performance metrics and savings data.

```typescript
getCostAnalytics(): {
cacheHitRate: number; // Cache hit rate (0-1)
totalSavings: number; // Total money saved
averageLatency: number; // Average request latency
errorRate: number; // Error rate (0-1)
}
```

```typescript
const analytics = costlens.getCostAnalytics();
console.log('Cache Hit Rate:', analytics.cacheHitRate * 100 + '%');
console.log('Total Savings: $' + analytics.totalSavings);
console.log('Average Latency:', analytics.averageLatency + 'ms');
console.log('Error Rate:', analytics.errorRate * 100 + '%');
```

Calculate potential savings before making a request.

```typescript
calculateSavings(
requestedModel: string,
messages: any[]
): Promise<{
currentCost: number;
optimizedCost: number;
savings: number;
savingsPercentage: number;
recommendedModel: string;
}>
```

```typescript
const savings = await costlens.calculateSavings('gpt-4', messages);
console.log('Current Cost: $' + savings.currentCost);
console.log('Optimized Cost: $' + savings.optimizedCost);
console.log('Savings: $' + savings.savings);
console.log('Savings %: ' + savings.savingsPercentage + '%');
console.log('Recommended Model: ' + savings.recommendedModel);
```
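A common pattern is to use this estimate as a pre-flight gate and only downgrade when the projected savings clear a threshold. A sketch (the `pickModel` helper and the 20% threshold are illustrative, not part of the SDK):

```typescript
// Downgrade only when the estimated savings are meaningful
async function pickModel(requestedModel: string, messages: any[]): Promise<string> {
  const estimate = await costlens.calculateSavings(requestedModel, messages);
  return estimate.savingsPercentage > 20
    ? estimate.recommendedModel
    : requestedModel;
}

const model = await pickModel('gpt-4', messages);
const result = await openai.chat.completions.create({ model, messages });
```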
CostLens currently supports the OpenAI and Anthropic APIs, with smart routing between models. The examples below reuse the wrapped clients (`tracked`, `trackedClaude`) created earlier.

```typescript
// OpenAI routing: GPT-4 → GPT-3.5 for simple tasks
const openaiResult = await tracked.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Simple task' }]
});
// Automatically routed to GPT-3.5-turbo (20x cheaper)

// Anthropic routing: Claude Opus → Haiku for simple tasks
const anthropicResult = await trackedClaude.messages.create({
  model: 'claude-3-opus-20240229',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Simple task' }]
});
// Automatically routed to Claude Haiku (60x cheaper)
```

CostLens provides ML-powered cost forecasting and routing analytics through the dashboard.

```typescript
// All analytics available through dashboard API
const stats = await fetch('/api/dashboard/stats', {
headers: { 'Authorization': 'Bearer ' + apiKey }
});
const data = await stats.json();
console.log('Cost forecast:', data.costForecast);
console.log('Routing decisions:', data.routingDecisions);
console.log('Provider stats:', data.providerStats);
```

Get a 30-day cost forecast, alerts, and optimization tips.

```typescript
const forecast = await costlens.getCostForecast({ windowDays: 30 });
console.log(forecast.projectedMonthlyCost, forecast.trend, forecast.confidence);
const alerts = await costlens.checkCostAlerts();
console.log(alerts);
const recs = await costlens.getOptimizationRecommendations();
console.log(recs);
```

Override default routing decisions with custom logic based on request context.

```typescript
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
routingPolicy: (requestedModel, messages) => {
// Route complex queries to better models
const complexity = messages.reduce((acc, msg) => acc + msg.content.length, 0);
if (complexity > 1000) {
return 'gpt-4o'; // Use premium model for complex tasks
}
if (requestedModel === 'gpt-4' && complexity < 100) {
return 'gpt-4o-mini'; // Downgrade simple tasks
}
return requestedModel; // Keep original choice
}
});
```

Implement custom quality scoring to improve routing decisions over time.

```typescript
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
qualityValidator: (responseText, messagesJson) => {
const messages = JSON.parse(messagesJson);
// Score based on response completeness
let score = 3; // baseline
if (responseText.length > 200) score += 1;
if (responseText.includes('```')) score += 1; // code examples
if (messages.some(m => m.content.includes('?')) &&
responseText.includes('?')) score -= 1; // answered with question
return Math.max(1, Math.min(5, score)); // clamp 1-5
}
});
```

Track related requests across your application with correlation IDs.

```typescript
// Track user session
const sessionId = 'session_' + userId;
const requestId = 'req_' + Date.now();
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
requestId: requestId,
correlationId: sessionId
});
const tracked = costlens.wrapOpenAI(openai);
// All requests will be tagged with these IDs
await tracked.chat.completions.create({
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello' }]
});
// Get analytics using SDK method
const analytics = costlens.getCostAnalytics();
console.log('Cache Hit Rate:', analytics.cacheHitRate * 100 + '%');
console.log('Total Savings: $' + analytics.totalSavings);
```
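In a web server, a natural way to apply these IDs is to construct a scoped client per request in middleware. A sketch using Express (the middleware, the `x-session-id` header, and the UUID tagging are illustrative, not part of the SDK):

```typescript
import { randomUUID } from 'node:crypto';
import express from 'express';
import CostLens from 'costlens';

const app = express();

// One CostLens instance per request, tagged for end-to-end tracing
app.use((req, _res, next) => {
  (req as any).costlens = new CostLens({
    apiKey: process.env.COSTLENS_API_KEY!,
    requestId: 'req_' + randomUUID(),
    correlationId: req.header('x-session-id') ?? 'anonymous'
  });
  next();
});
```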
Automatically cache responses to save money on repeated requests; caching achieves strong hit rates in production.

```typescript
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
enableCache: true, // Enable Redis caching
});
const tracked = costlens.wrapOpenAI(openai);
// First call - cache miss, costs $0.05
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }],
});
// Second call - cache hit, costs $0.00!
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'What is 2+2?' }],
});
```
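Cache hits require an identical request: the same model and the same messages. The SDK's internal key format isn't shown here, but conceptually it behaves like hashing the request, as in this sketch:

```typescript
import { createHash } from 'node:crypto';

// Identical model + messages → identical key → cache hit
function cacheKey(model: string, messages: { role: string; content: string }[]): string {
  return createHash('sha256')
    .update(model)
    .update(JSON.stringify(messages))
    .digest('hex');
}
```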
Smart routing automatically disables itself if response quality drops below 3.5/5 stars.

```typescript
// SDK checks quality status before routing
const tracked = costlens.wrapOpenAI(openai);
// If quality is good: GPT-4 → GPT-3.5 (saves money)
// If quality dropped: Uses GPT-4 (protects quality)
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{ role: 'user', content: 'Complex task...' }],
});
// Note: Quality feedback is handled automatically by the SDK
// The SDK tracks routing decisions and learns from them internally
```
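If you supply the `qualityValidator` option shown earlier, your own scores can feed this gate (assuming the validator's 1-5 scale maps onto the 3.5-star threshold):

```typescript
const costlens = new CostLens({
  apiKey: process.env.COSTLENS_API_KEY!,
  smartRouting: true,
  // 1-5 score; sustained low scores pause smart routing
  qualityValidator: (responseText) => (responseText.trim().length > 50 ? 4 : 2)
});
```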
Automatically compress prompts by 30-50% while preserving meaning.

```typescript
const costlens = new CostLens({
apiKey: process.env.COSTLENS_API_KEY,
autoOptimize: true, // Enable AI compression
});
const tracked = costlens.wrapOpenAI(openai);
// Original: 200 tokens
// Optimized: 100 tokens (50% reduction)
// Savings: $0.009 per request
await tracked.chat.completions.create({
model: 'gpt-4',
messages: [{
role: 'user',
content: 'Please kindly help me understand what the weather will be like tomorrow in San Francisco, California, USA'
}],
});
// Compressed to: "Weather forecast for San Francisco tomorrow?"
```

Track exactly how much money you're saving with baseline cost comparison.

```typescript
// Calculate potential savings using SDK method
const savings = await costlens.calculateSavings('gpt-4', messages);
console.log(`Current Cost: $${savings.currentCost}`);
console.log(`Optimized Cost: $${savings.optimizedCost}`);
console.log(`Savings: $${savings.savings} (${savings.savingsPercentage.toFixed(1)}%)`);
console.log(`Recommended Model: ${savings.recommendedModel}`);
// Example output:
// Current Cost: $0.15
// Optimized Cost: $0.03
// Savings: $0.12 (80.0%)
// Recommended Model: gpt-3.5-turbo
```

```typescript
interface CostLensConfig {
  apiKey: string;
  baseUrl?: string;
  // ...plus the optional fields documented in the constructor table above
}
```

```bash
# .env
COSTLENS_API_KEY=cl_your_api_key_here
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key
```

If requests fail to authenticate, verify your `COSTLENS_API_KEY` and the request header formatting.
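To make sure these variables are loaded before the SDK initializes, you can load the file with `dotenv` at your entry point; a minimal sketch:

```typescript
import 'dotenv/config'; // populates process.env from .env
import CostLens from 'costlens';

const costlens = new CostLens({ apiKey: process.env.COSTLENS_API_KEY! });
```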