AI summarized from verified sources
OpenAI Launches WebSocket Mode in Responses API for Faster Agents
Cuts agent latency by up to 40%, making real-time business tools easier to build.
Key Points
- Caches state to minimize API overhead.
- Up to 4K tokens/sec with GPT-5.3-Codex-Spark.
- Keeps the familiar API with previous_response_id.
- 39% faster in alpha tests.
What changed
OpenAI introduced persistent WebSocket connections in the Responses API. The connection caches conversation state, which cuts redundant processing between tool calls. Agent workflows speed up by as much as 40%, letting developers build faster apps. Proven in alpha testing with Vercel and Cursor.
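To illustrate why cached state reduces redundant work, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not OpenAI's implementation: it opens no WebSocket and calls no real API. `ToyResponsesServer`, `run_stateless`, and `run_chained` are hypothetical names invented for this example; only the `previous_response_id` parameter name comes from the summary above. The model counts how many items a client must send, which is a rough proxy for overhead and not a latency measurement.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyResponsesServer:
    """Hypothetical stand-in for a server that caches state per response id."""
    _states: dict = field(default_factory=dict)  # response id -> full item list
    _ids: count = field(default_factory=count)   # simple id generator
    items_received: int = 0                      # items the client had to send

    def create(self, input_items, previous_response_id=None):
        self.items_received += len(input_items)
        # With a known previous id, continue from cached state; else start empty.
        history = list(self._states.get(previous_response_id, []))
        history.extend(input_items)
        history.append(f"assistant-output-{len(history)}")
        response_id = f"resp_{next(self._ids)}"
        self._states[response_id] = history
        return response_id, history


def run_stateless(tool_results):
    """Resend the whole transcript on every turn (no cached state)."""
    server = ToyResponsesServer()
    _, transcript = server.create(["user: fix the failing test"])
    for result in tool_results:
        _, transcript = server.create(transcript + [result])
    return server.items_received, transcript


def run_chained(tool_results):
    """Send only the new tool result, pointing at the previous response."""
    server = ToyResponsesServer()
    response_id, history = server.create(["user: fix the failing test"])
    for result in tool_results:
        response_id, history = server.create(
            [result], previous_response_id=response_id
        )
    return server.items_received, history


if __name__ == "__main__":
    results = ["tool: ran tests", "tool: read file", "tool: applied patch"]
    print("stateless items sent:", run_stateless(results)[0])
    print("chained items sent:  ", run_chained(results)[0])
```

In this toy run both strategies end with the same conversation, but the chained client sends far fewer items because earlier turns are already held by the server.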
Why it matters
Latency reductions of up to 40% make real-time business tools easier to build on top of agents.
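As a rough back-of-the-envelope check, the two headline figures can be turned into concrete numbers. This is only a sketch: the 20-second baseline and the 8,000-token output are hypothetical workload values chosen for illustration, and the 40% and 4K tokens/sec figures are the upper bounds quoted in this summary, not guaranteed results. The function names are invented for this example.

```python
def latency_after_reduction(baseline_seconds: float, reduction: float) -> float:
    """Latency left after cutting `reduction` (a fraction, e.g. 0.40) from a baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    return baseline_seconds * (1.0 - reduction)


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady `tokens_per_second`."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical agent run that takes 20 s today, at the quoted best case of 40%.
    print(latency_after_reduction(20.0, 0.40))  # about 12 seconds
    # Hypothetical 8,000-token output at the quoted peak of 4K tokens/sec.
    print(generation_seconds(8_000, 4_000))     # about 2 seconds
```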
What to watch
Key checks: caches state to minimize API overhead; up to 4K tokens/sec with GPT-5.3-Codex-Spark; keeps the familiar API with previous_response_id.