AI Chat API — LLM Proxy

About this tool

A lightweight LLM proxy that forwards your message to the model you select. The default is GPT-4o-mini via OpenAI; set the model field to switch to Claude Haiku (Anthropic) or Gemini Flash (Google). It accepts an optional system prompt, max_tokens, and temperature, and returns the response text plus token usage.

Quick Start

curl -X POST https://api.iteratools.com/ai/chat \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the capital of France?", "model": "gpt-4o-mini"}'

Response

{
  "ok": true,
  "response": "The capital of France is Paris.",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "tokens": {
    "input": 14,
    "output": 9,
    "total": 23
  }
}
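
A minimal Python sketch of reading a response shaped like the one above. The ChatResult class and parse_chat_response function are hypothetical names introduced for illustration, and the handling of "ok": false is an assumption, since the error format is not documented here.

```python
import json
from dataclasses import dataclass


@dataclass
class ChatResult:
    """Fields taken from the example response above."""
    text: str
    model: str
    provider: str
    input_tokens: int
    output_tokens: int
    total_tokens: int


def parse_chat_response(body: str) -> ChatResult:
    """Parse a JSON response body into a ChatResult."""
    data = json.loads(body)
    # Assumption: a failed call sets "ok" to false; the error shape is not documented.
    if not data.get("ok"):
        raise RuntimeError(f"request failed: {data!r}")
    tokens = data["tokens"]
    return ChatResult(
        text=data["response"],
        model=data["model"],
        provider=data["provider"],
        input_tokens=tokens["input"],
        output_tokens=tokens["output"],
        total_tokens=tokens["total"],
    )


# The sample body is copied from the Response section.
sample = """
{
  "ok": true,
  "response": "The capital of France is Paris.",
  "model": "gpt-4o-mini",
  "provider": "openai",
  "tokens": {"input": 14, "output": 9, "total": 23}
}
"""
result = parse_chat_response(sample)
print(result.text)
print(result.input_tokens, result.output_tokens, result.total_tokens)
```

In the sample, total equals input plus output (14 + 9 = 23), so the total field can be checked against the other two when logging usage.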

Request Parameters

message (required) — User message to send to the LLM
model — Model to use (default: gpt-4o-mini). See supported models below.
system — Optional system prompt for role/context
max_tokens — Max tokens in response (default: 512, max: 2048)
temperature — Sampling temperature 0–2 (default: 0.7)
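
The parameter list above can be turned into a small client-side payload builder. This is a sketch, not part of the API: build_payload is a hypothetical helper, the lower bound of 1 for max_tokens is an assumption, and how the server treats out-of-range values is not stated here. Optional fields are left out when unset so the documented server defaults apply.

```python
import json

DEFAULT_MODEL = "gpt-4o-mini"   # documented default model
MAX_TOKENS_LIMIT = 2048         # documented upper bound for max_tokens


def build_payload(message, model=DEFAULT_MODEL, system=None,
                  max_tokens=None, temperature=None):
    """Build the JSON body for a chat request, checking documented ranges."""
    if not message:
        raise ValueError("message is required")
    payload = {"message": message, "model": model}
    if system is not None:
        payload["system"] = system
    if max_tokens is not None:
        # Assumption: at least 1 token; only the maximum is documented.
        if not 1 <= max_tokens <= MAX_TOKENS_LIMIT:
            raise ValueError(f"max_tokens must be between 1 and {MAX_TOKENS_LIMIT}")
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature
    return payload


body = json.dumps(build_payload("What is the capital of France?", temperature=0.2))
print(body)
```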

Supported Models

gpt-4o-mini — OpenAI GPT-4o mini (default, fast, cheap)
gpt-4o — OpenAI GPT-4o (more capable)
gpt-4-turbo — OpenAI GPT-4 Turbo
claude-haiku — Anthropic Claude 3 Haiku (fast)
claude-sonnet — Anthropic Claude 3.5 Sonnet
gemini-flash — Google Gemini 1.5 Flash
gemini-pro — Google Gemini 1.5 Pro
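
One way to catch a mistyped model name before sending a request is a local lookup table built from the list above. This is a sketch: SUPPORTED_MODELS and vendor_for are hypothetical names, and the vendor strings are the names as printed in the list, not necessarily the values the API returns in its provider field (only "openai" is confirmed by the example response).

```python
# Model identifiers and vendors copied from the Supported Models list.
SUPPORTED_MODELS = {
    "gpt-4o-mini": "OpenAI",
    "gpt-4o": "OpenAI",
    "gpt-4-turbo": "OpenAI",
    "claude-haiku": "Anthropic",
    "claude-sonnet": "Anthropic",
    "gemini-flash": "Google",
    "gemini-pro": "Google",
}


def vendor_for(model: str) -> str:
    """Return the vendor behind a model identifier, or raise on unknown names."""
    try:
        return SUPPORTED_MODELS[model]
    except KeyError:
        known = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"unknown model {model!r}; expected one of: {known}") from None


print(vendor_for("claude-haiku"))
```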

With System Prompt

curl -X POST https://api.iteratools.com/ai/chat \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain recursion in one sentence",
    "model": "gpt-4o-mini",
    "system": "You are a concise technical teacher.",
    "max_tokens": 100,
    "temperature": 0.5
  }'
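
The same request can be sketched in Python using only the standard library. The make_request and send helpers are hypothetical names; the URL, headers, and body fields mirror the curl command above. The network call is left commented out so the block runs without a key.

```python
import json
import urllib.request

API_URL = "https://api.iteratools.com/ai/chat"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with the same headers as the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


payload = {
    "message": "Explain recursion in one sentence",
    "model": "gpt-4o-mini",
    "system": "You are a concise technical teacher.",
    "max_tokens": 100,
    "temperature": 0.5,
}
request = make_request("YOUR_KEY", payload)
# reply = send(request)      # uncomment once YOUR_KEY is replaced with a real key
# print(reply["response"])
```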

Pricing

$0.005 per message, paid via x402 micropayment on Base (USDC). The price is the same for every model. Responses are capped at 2048 tokens, the max_tokens limit.
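
Because the price is flat per message, estimating spend is a single multiplication. The sketch below assumes every message is billed at the listed rate regardless of model or token count; estimate_cost is a hypothetical helper, and Decimal is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_MESSAGE_USD = Decimal("0.005")  # listed price per message


def estimate_cost(message_count: int) -> Decimal:
    """Return the estimated cost in USD for a number of messages."""
    if message_count < 0:
        raise ValueError("message_count must not be negative")
    return PRICE_PER_MESSAGE_USD * message_count


# 1,000 messages at $0.005 each comes to 5 USD.
print(estimate_cost(1000))
```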
