OpenAI · 21:05 · Feature Updates · Official Blog
OpenAI Launches WebSocket Mode in Responses API for Faster Agents
Cuts agent latency by up to 40%, making real-time business tools easier to build.
Key Points
1. Caches conversation state to minimize API overhead.
2. Up to 4K tokens/sec with GPT-5.3-Codex-Spark.
3. Keeps the familiar API surface via previous_response_id.
4. 39% faster in alpha tests.
OpenAI introduced persistent WebSocket connections in the Responses API, which cache conversation state to cut redundant processing across tool calls. Agent workflows speed up by up to 40%, letting developers build faster applications. The feature was validated in alpha tests with Vercel and Cursor.
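To illustrate why server-side state caching cuts overhead, here is a minimal sketch, not OpenAI's actual implementation: a stateless client must resend the full conversation history on every turn, while a hypothetical session-style client (mirroring the `previous_response_id` chaining described above) sends only the new message and lets the server look up cached history by response id. All class and variable names here are illustrative assumptions.

```python
import uuid

class StatelessAPI:
    """Illustrative: every request resends the entire conversation history."""
    def __init__(self):
        self.tokens_sent = 0

    def respond(self, history):
        # Whole history travels over the wire each turn.
        self.tokens_sent += sum(len(m.split()) for m in history)


class SessionAPI:
    """Illustrative WebSocket-style session: the server caches prior state,
    so each turn transmits only the new message plus a response id."""
    def __init__(self):
        self.tokens_sent = 0
        self.cache = {}  # response id -> cached conversation history

    def respond(self, message, previous_response_id=None):
        history = self.cache.get(previous_response_id, [])
        self.tokens_sent += len(message.split())  # only the delta travels
        response_id = str(uuid.uuid4())
        self.cache[response_id] = history + [message]
        return response_id


turns = ["call the search tool", "summarize the results", "draft the reply"]

stateless = StatelessAPI()
history = []
for msg in turns:
    history.append(msg)
    stateless.respond(history)

session = SessionAPI()
rid = None
for msg in turns:
    rid = session.respond(msg, previous_response_id=rid)

print(stateless.tokens_sent, session.tokens_sent)  # → 21 10
```

The stateless client transmits 21 word-tokens across three turns versus 10 for the session client, and the gap widens as conversations grow, which is the kind of redundant transfer and reprocessing the cached-state design avoids.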