A cost-efficient version of GPT Realtime that responds to audio and text inputs in real time over WebRTC, WebSocket, or SIP connections.
Specifications
Context: 32K
Maximum Output: 4.1K
Input: text, audio, image
Output: text, audio
Pricing
Input: $0.66/MTokens
Cached Input: $0.07/MTokens
Output: $2.64/MTokens
Input Audio: $11.00/MTokens
Cached Input Audio: $0.33/MTokens
Output Audio: $22.00/MTokens
Input Image: $0.88/MTokens
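The per-million-token rates above translate into a session cost by multiplying each token count by its rate and dividing by 1,000,000. The following is a minimal sketch of that arithmetic, not a billing implementation: the `estimate_cost` helper, the category key names, and the example token counts are hypothetical, and it assumes each category is counted separately (how a provider splits cached from uncached tokens may differ).

```python
# Prices in USD per million tokens, copied from the Pricing list.
PRICES_PER_MTOK = {
    "input": 0.66,
    "cached_input": 0.07,
    "output": 2.64,
    "input_audio": 11.00,
    "cached_input_audio": 0.33,
    "output_audio": 22.00,
    "input_image": 0.88,
}


def estimate_cost(usage: dict[str, int]) -> float:
    """Return the estimated cost in USD for token counts keyed by category.

    Assumption: the counts are disjoint, e.g. cached input tokens are not
    also included in the plain "input" count.
    """
    unknown = set(usage) - set(PRICES_PER_MTOK)
    if unknown:
        raise KeyError(f"unknown token categories: {sorted(unknown)}")
    return sum(PRICES_PER_MTOK[k] * n / 1_000_000 for k, n in usage.items())


# Hypothetical voice session: mostly audio in and out, plus a short text prompt.
example = {"input_audio": 20_000, "output_audio": 10_000, "input": 2_000}
print(f"${estimate_cost(example):.4f}")  # prints $0.4413
```

In this example audio dominates the total: the 2,000 text input tokens add about a tenth of a cent, while the audio tokens account for $0.44.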
Similar Models
Price (input/output): $0.55/$1.65 per M tokens
Context: 16K, Max output: 4K, Availability: —, TPS: —
A fast, cost-effective text generation model for simple tasks and high-volume applications.
Price (input/output): $1.10/$2.20 per M tokens
Context: 16K, Max output: 4K, Availability: —, TPS: —
November 2023 snapshot of GPT-3.5 Turbo with improved instruction following and JSON mode support.
Price (input/output): $0.55/$1.65 per M tokens
Context: 16K, Max output: 4K, Availability: —, TPS: —
January 2024 snapshot of GPT-3.5 Turbo with various improvements and bug fixes.
Price (input/output): $0.66/$2.64 per M tokens
Context: 32K, Max output: 4K, Availability: —, TPS: —
A cost-efficient version of GPT Realtime that responds to audio and text inputs in real time over WebRTC, WebSocket, or SIP connections.