XAI Router Now Supports OpenAI WebSocket Mode: Official Behavior Alignment
Posted February 24, 2026 by XAI Tech Team · 4 min read
This is an engineering note for XAI Router's WebSocket support. As of 2026-02-24, XAI Router supports OpenAI WebSocket workflows for:
- Responses WebSocket mode (wss://.../v1/responses)
- Realtime WebSocket sessions (wss://.../v1/realtime)
- Coexistence with existing HTTP APIs without changing normal HTTP behavior
OpenAI WebSocket Mode: Key Semantics
According to OpenAI's official guide, core semantics for Responses WebSocket mode are:
- Keep a persistent connection to /v1/responses
- Start each turn with response.create
- Continue context with previous_response_id plus incremental input
- Sequential execution per connection: only one in-flight response at a time (no multiplexing)
- Connection lifetime limit of 60 minutes, then reconnect
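Under these semantics, a multi-turn exchange is just a sequence of response.create payloads, each later turn chained to the previous one by previous_response_id. A minimal sketch of the payload construction (make_turn_event is a hypothetical helper for illustration, not part of any SDK, and the response id shown is made up):

```python
import json

def make_turn_event(model, text, previous_response_id=None):
    """Build one response.create payload; later turns chain via previous_response_id."""
    event = {
        "type": "response.create",
        "model": model,
        "input": [{
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": text}],
        }],
    }
    if previous_response_id is not None:
        # Continue server-side context instead of resending the full history.
        event["previous_response_id"] = previous_response_id
    return event

# Turn 1: no prior context.
first = make_turn_event("gpt-5.4", "Explain websocket mode.")
# Turn 2: chain to the id returned for turn 1 (illustrative value).
second = make_turn_event("gpt-5.4", "And in one sentence?",
                         previous_response_id="resp_abc123")
print(json.dumps(second, indent=2))
```

Because each turn only carries incremental input, the connection must stay open between turns; after the 60-minute lifetime limit, the client reconnects and continues from the last response id.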
How XAI Router Aligns
1) Path compatibility
XAI Router supports both path variants for easier client migration:
- /v1/responses and /responses
- /v1/realtime and /realtime
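One common way to implement this kind of dual-path support is a normalization step at the WS upgrade handler that maps both variants onto one canonical endpoint. The sketch below is illustrative, not XAI Router's actual code:

```python
# Map both accepted variants onto one canonical endpoint before routing.
CANONICAL_WS_PATHS = {
    "/v1/responses": "/v1/responses",
    "/responses": "/v1/responses",
    "/v1/realtime": "/v1/realtime",
    "/realtime": "/v1/realtime",
}

def normalize_ws_path(path):
    """Return the canonical endpoint, or None for unsupported paths."""
    return CANONICAL_WS_PATHS.get(path.rstrip("/") or "/")

print(normalize_ws_path("/responses"))  # -> /v1/responses
```

Normalizing early means every later stage (ACL, rate limits, usage accounting) sees exactly one path per endpoint.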
2) Same sequential model as OpenAI
For /v1/responses in WebSocket mode:
- Multiple response.create events are allowed over one connection
- But they must be sequential
- Concurrent in-flight response.create events on the same connection are rejected
This matches OpenAI's documented single-connection sequential behavior.
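On the client side, the simplest way to respect this rule is to gate sends on turn completion. A hypothetical guard (an illustration, not an SDK feature) could look like:

```python
class TurnGate:
    """Rejects a new response.create while one is still in flight,
    mirroring the one-in-flight-response-per-connection rule."""

    def __init__(self):
        self._in_flight = False

    def begin_turn(self):
        if self._in_flight:
            raise RuntimeError("a response is already in flight on this connection")
        self._in_flight = True

    def finish_turn(self):
        # Call when response.completed / response.failed / response.incomplete arrives.
        self._in_flight = False

gate = TurnGate()
gate.begin_turn()          # first turn: allowed
try:
    gate.begin_turn()      # concurrent turn: rejected, like the router would
except RuntimeError as exc:
    print("rejected:", exc)
gate.finish_turn()
gate.begin_turn()          # next sequential turn: allowed again
```

Workloads that genuinely need parallel responses should open multiple connections rather than multiplex on one.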
3) Conversation-state transparency
Fields like previous_response_id, incremental input, and store=false are preserved as conversation semantics. XAI Router focuses on model mapping, ACL checks, rate limits, routing, and usage accounting around them.
Unified WebSocket Architecture
This support is implemented through a unified framework (not endpoint-specific patches):
- ws_framework: session lifecycle, relay, timeout control, and error handling
- openai-responses-ws adapter: turn lifecycle for response.create, response-id binding, usage finalization
- openai-realtime-ws adapter: realtime event relay and session usage tracking
The legacy /v1/realtime handling has also been migrated into the same framework to reduce branching and maintenance cost.
The unified WS design preserves OpenAI behavior while converging Responses and Realtime into one session/relay framework.
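Structurally, a framework like this usually reduces to one shared session loop plus per-endpoint adapters. The sketch below illustrates that shape only; it is not XAI Router's implementation, and the adapter classes are hypothetical stand-ins for the components listed above:

```python
class ResponsesAdapter:
    """Per-turn lifecycle: response.create in, terminal event relayed out."""
    name = "openai-responses-ws"

    def on_event(self, event):
        # Real adapter: bind response ids, finalize usage on terminal events.
        return f"relay:{event['type']}"

class RealtimeAdapter:
    """Continuous event relay with session-level usage tracking."""
    name = "openai-realtime-ws"

    def on_event(self, event):
        return f"relay:{event['type']}"

# The shared framework owns the connection and picks an adapter by endpoint,
# so timeout control and error handling live in exactly one place.
ADAPTERS = {
    "/v1/responses": ResponsesAdapter(),
    "/v1/realtime": RealtimeAdapter(),
}

def handle(path, event):
    return ADAPTERS[path].on_event(event)

print(handle("/v1/responses", {"type": "response.create"}))  # -> relay:response.create
```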
Minimal Responses WebSocket Example
The following example opens a connection via XAI Router and creates one gpt-5.4 response:
from websocket import create_connection
import json
import os

ws = create_connection(
    "wss://api.xairouter.com/v1/responses",
    header=[
        f"Authorization: Bearer {os.environ['XAI_API_KEY']}",
    ],
)

ws.send(json.dumps({
    "type": "response.create",
    "model": "gpt-5.4",
    "store": False,
    "input": [
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "Summarize websocket mode in one sentence."}]
        }
    ],
    "tools": []
}))

while True:
    event = json.loads(ws.recv())
    print(event.get("type"))
    if event.get("type") in ("response.completed", "response.failed", "response.incomplete"):
        break

ws.close()

Codex CLI Config (Reference Baseline)
If you use Codex CLI with XAI Router, this is a working reference baseline config:
model_provider = "xai"
model = "gpt-5.4"
model_reasoning_effort = "xhigh"
plan_mode_reasoning_effort = "xhigh"
model_reasoning_summary = "none"
model_verbosity = "medium"
approval_policy = "never"
sandbox_mode = "danger-full-access"
[model_providers.xai]
name = "OpenAI"
base_url = "https://api.xairouter.com"
wire_api = "responses"
requires_openai_auth = false
env_key = "XAI_API_KEY"

Notes:
- This can be used as a reference baseline for ~/.codex/config.toml.
- Older examples used explicit supports_websockets and responses_websockets_v2 flags; if your Codex build still exposes those switches, add them back according to that build's docs.
- env_key = "XAI_API_KEY" only tells Codex which environment variable to read. On Linux set the variable in ~/.bashrc, on macOS prefer ~/.zshrc, and on Windows set a user environment variable before reopening the shell. On some older macOS setups, legacy terminals, or IDE sessions that still inherit a bash login environment, also mirror the variable into ~/.bash_profile, and into ~/.bashrc if needed.
- Restart your Codex session after updating the config.
Performance and Stability Notes
Without changing external behavior, the implementation includes practical optimizations:
- Lightweight event-type prefilter before full JSON unmarshal on hot paths
- Shared relay framework for Responses and Realtime to reduce duplicated logic
- Cleaner connection-error handling with reduced log noise for expected disconnect patterns
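The first item can be sketched as a cheap byte probe that runs before a full JSON parse on the hot path. The exact mechanics below are illustrative (the markers assume compact wire JSON with no spaces after colons, which is how these APIs serialize frames):

```python
import json

# Raw byte markers for the terminal event types; checked before any parse.
TERMINAL_MARKERS = (
    b'"type":"response.completed"',
    b'"type":"response.failed"',
    b'"type":"response.incomplete"',
)

def is_terminal_fast(raw: bytes) -> bool:
    """Cheap substring scan; only frames that might be terminal pay for a full parse."""
    if not any(marker in raw for marker in TERMINAL_MARKERS):
        return False
    # Confirm with a real parse to rule out a marker appearing inside a string value.
    return json.loads(raw).get("type") in (
        "response.completed", "response.failed", "response.incomplete",
    )

frame = json.dumps({"type": "response.output_text.delta", "delta": "hi"},
                   separators=(",", ":")).encode()
print(is_terminal_fast(frame))  # -> False, without ever running json.loads
```

Since the overwhelming majority of streamed frames are deltas, skipping the full unmarshal for them is where the win comes from.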
Result: better maintainability and stable WS behavior while preserving existing HTTP behavior.
Conclusion
If your workload relies on long-lived, low-latency, multi-turn interaction, OpenAI WebSocket mode can be significantly better than rebuilding context on each HTTP request.
XAI Router's goal is straightforward: keep OpenAI semantics intact while adding production-grade control for routing, limits, policy, and accounting.
References
- OpenAI WebSocket Mode: https://developers.openai.com/api/docs/guides/websocket-mode
- OpenAI Realtime WebSocket guide: https://platform.openai.com/docs/guides/realtime-websocket
- OpenAI Responses API reference: https://platform.openai.com/docs/api-reference/responses/create