Debugging the OpenAI Codex backend in OpenClaw

Operator Team · 9 min read

Using OpenAI's Codex as the brain behind an OpenClaw agent is one of the most common setups, because it lets you run the agent on a ChatGPT or Codex subscription you already pay for instead of metered API tokens. Codex itself is open source and ships as a CLI and app server; OpenClaw wraps that app server in a harness plugin documented in the Codex harness reference. That harness speaks a turn-based protocol (turn/started, tool calls, turn/completed) rather than a plain chat completion, and most of the current bugs live in that layer rather than in your prompt.

The failures are specific enough that recognizing the signature saves a lot of time. A stalled turn, a Cloudflare HTML page masquerading as a DNS error, and a token that validates in curl but fails inside the gateway are three different problems with three different fixes. The sections below walk through each live failure mode as of the 2026.5.x releases, with the exact error strings and GitHub threads where maintainers are tracking them.

Failure | What it actually is | Where to look
Codex stopped before confirming the turn was complete | A harness regression; the completion signal never arrived | Turn never completes
403, or a DNS or rate limit error with a valid token | A wrong backend URL hitting a Cloudflare challenge | Cloudflare 403
401 right after a successful login | The login path and the request path disagree on the token | 401 after login
Unknown model for a model the config accepts | Runtime support lags the config schema | Unknown model
Red error lines mid-task while the result is right | Native tool statuses rendered as failures | Tool statuses as failures

How the Codex backend differs

When you select openai-codex/gpt-5.4 (or another Codex model alias), OpenClaw does not send a single POST to /v1/chat/completions. It launches the Codex app server subprocess, starts a native turn, forwards your tools, and waits for Codex to emit turn/completed before releasing the session lane. The harness enforces several idle timeouts documented under plugins.entries.codex.config.appServer in the harness reference.

Idle timeout | What it governs
turnCompletionIdleTimeoutMs (default 60 seconds) | How long OpenClaw waits after Codex accepts a turn without seeing terminal progress
postToolRawAssistantCompletionIdleTimeoutMs | The quiet window after a tool handoff while the model synthesizes its answer

That architecture is why Codex-specific bugs look different from Anthropic or Gemini provider bugs. A generic API connection error from the metered OpenAI API and a Codex harness stall both show up as "the agent did not reply," but the log lines underneath are unrelated. Before you change models or rewrite system prompts, read the gateway log for Codex app server lines (codex app-server, turn/completed, rawResponseItem/completed) rather than assuming the model failed.

Turn never completes

This is the headline problem at the moment. The Codex app server starts a turn, the agent does work, and then the harness never receives the completion signal, so OpenClaw reports "Codex stopped before confirming the turn was complete" (issue #88312). It is a regression: it was fixed once in an earlier release and came back in 2026.5.27. A closely related open issue is Codex-backed turns on Telegram repeatedly timing out waiting for turn/completed (#87744).

The same failure family shows up under other log strings depending on which watchdog fired first. You may see codex app-server turn idle timed out waiting for completion, codex app-server attempt timed out, or LLM request timed out even though partial output already reached your channel (#78756). In webchat on macOS, some builds stall immediately after turn/started with no turn/completed at all, and the gateway aborts the embedded run around 620 seconds with reason=active_work_without_progress (#82989).
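Because the same stall surfaces under several strings, it can help to search a saved log excerpt for all of them at once. The following is a minimal sketch, assuming you have captured gateway output as plain text; the function name and the sample log line layout are hypothetical, and only the signature strings themselves come from the issues quoted above.

```python
# Log substrings quoted in this article for the stalled-turn family.
STALL_SIGNATURES = (
    "Codex stopped before confirming the turn was complete",
    "codex app-server turn idle timed out waiting for completion",
    "codex app-server attempt timed out",
    "LLM request timed out",
    "reason=active_work_without_progress",
)


def find_stall_signatures(log_text: str) -> list[str]:
    """Return every known stall signature present in log_text (case-insensitive)."""
    lowered = log_text.lower()
    return [sig for sig in STALL_SIGNATURES if sig.lower() in lowered]


if __name__ == "__main__":
    # Hypothetical excerpt; real gateway lines may be formatted differently.
    sample = (
        "codex app-server turn/started\n"
        "codex app-server turn idle timed out waiting for completion\n"
    )
    for signature in find_stall_signatures(sample):
        print("stall signature:", signature)
```

Any hit means you are most likely in the harness stall family described here, whichever watchdog happened to fire first.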

Because these are harness regressions rather than misconfiguration, the practical move is version management. If turns started stalling right after an update, pin back to the last release where Codex turns completed cleanly for you, or move forward once a fix lands, and watch the issue threads rather than rewriting your config. Several related stalls (missing turn/completed after tool output, watchdog firing on a status sentence like "I'm writing the report...") were patched on main in 2026.5.16 beta and later (#77984, #79667).

If you are on a build that exposes the timeout knobs, raising appServer.turnCompletionIdleTimeoutMs or appServer.postToolRawAssistantCompletionIdleTimeoutMs in your Codex plugin config can help on legitimately slow turns (large research passes, long writes after a tool call). There is also an open request for a configurable streaming watchdog (#68596) aimed at exactly this class of stall. Until a release lands with a durable fix for #88312, treat a missing turn/completed as a harness bug, collect your OpenClaw version and a log excerpt with openclaw logs --follow, and attach them to the relevant GitHub issue rather than chasing prompt changes.
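As a rough illustration of what raising those knobs looks like, the sketch below builds a config fragment and prints it as JSON. It assumes the dotted path plugins.entries.codex.config.appServer maps directly onto nested objects in your config file, and the millisecond values are illustrative placeholders rather than recommendations; check the harness reference for your build before copying anything.

```python
import json


def codex_timeout_overrides(turn_idle_ms: int, post_tool_idle_ms: int) -> dict:
    """Build a config fragment that raises the two Codex idle timeouts.

    Assumption: the dotted path plugins.entries.codex.config.appServer
    corresponds to nested objects with exactly these key names.
    """
    if turn_idle_ms <= 0 or post_tool_idle_ms <= 0:
        raise ValueError("timeouts must be positive millisecond values")
    return {
        "plugins": {
            "entries": {
                "codex": {
                    "config": {
                        "appServer": {
                            "turnCompletionIdleTimeoutMs": turn_idle_ms,
                            "postToolRawAssistantCompletionIdleTimeoutMs": post_tool_idle_ms,
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Illustrative values only: 3 minutes and 2 minutes.
    print(json.dumps(codex_timeout_overrides(180_000, 120_000), indent=2))
```

Merge the printed fragment into your existing config by hand rather than overwriting the file, since other plugin entries live under the same tree.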

Cloudflare 403

A separate class of Codex failure looks like auth or network trouble but is actually a wrong backend URL. OpenAI removed the /backend-api/responses alias on chatgpt.com, and OpenClaw builds that still pointed baseUrl at https://chatgpt.com/backend-api ended up requesting /backend-api/responses, which Cloudflare serves as an HTML challenge page with HTTP 403. OpenClaw's classifier then surfaces that HTML as an auth scope failure or, in other cases, as DNS lookup for the provider endpoint failed or API rate limit reached (#66633). The token was fine; the path was wrong.

The canonical fix landed in PR #69336, which points the built-in provider at https://chatgpt.com/backend-api/codex so the SDK builds .../backend-api/codex/responses, which still returns normal SSE streaming. A consolidated follow-up in PR #67635 also normalizes cases where model.api was already openai-codex-responses but a persisted agent-level models.json still contained the legacy https://chatgpt.com/backend-api/v1 base URL (#67131). In that shape, requests went to .../v1/codex/responses and hit the same Cloudflare wall.

When you see 403 or DNS errors on Codex after an upgrade, check three things before you log in again. First, update to a build that includes #69336 or later. Second, open ~/.openclaw/agents/main/agent/models.json (path varies by agent id) and look for a stale baseUrl under openai-codex. Third, confirm the provider transport mapping uses "api": "openai-codex-responses", not "anthropic-messages" or an unset api that falls through to /chat/completions (#62087, #66969). Cloudflare began blocking /backend-api/codex/chat/completions for headless clients in 2026.4.14; the responses transport is the supported path.
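The second and third checks can be scripted. This is a minimal sketch, assuming models.json is ordinary JSON in which the Codex provider appears as an object keyed openai-codex with baseUrl and api fields; the exact nesting is not documented here, so the walker searches the whole tree rather than a fixed path. The function and constant names are hypothetical.

```python
import json
from pathlib import Path

# Legacy base URLs named in the issues above.
LEGACY_BASE_URLS = {
    "https://chatgpt.com/backend-api",
    "https://chatgpt.com/backend-api/v1",
}
EXPECTED_API = "openai-codex-responses"
# Default agent id is "main"; adjust for your own agent.
DEFAULT_PATH = Path.home() / ".openclaw" / "agents" / "main" / "agent" / "models.json"


def find_problems(node, path="$"):
    """Walk parsed JSON and report stale base URLs and wrong Codex transports."""
    problems = []
    if isinstance(node, dict):
        base = node.get("baseUrl")
        if isinstance(base, str) and base.rstrip("/") in LEGACY_BASE_URLS:
            problems.append(f"{path}.baseUrl is a legacy URL: {base}")
        for key, value in node.items():
            # Assumption: the provider entry is an object keyed "openai-codex".
            if key == "openai-codex" and isinstance(value, dict):
                api = value.get("api")
                if api != EXPECTED_API:
                    problems.append(f"{path}.{key}.api is {api!r}, expected {EXPECTED_API!r}")
            problems.extend(find_problems(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_problems(item, f"{path}[{index}]"))
    return problems


if __name__ == "__main__":
    if DEFAULT_PATH.exists():
        found = find_problems(json.loads(DEFAULT_PATH.read_text(encoding="utf-8")))
        print("\n".join(found) if found else "no stale Codex entries found")
    else:
        print(f"{DEFAULT_PATH} not found; adjust DEFAULT_PATH for your agent id")
```

The script only reads the file. If it flags an entry, fix it by updating OpenClaw first, and edit models.json by hand only if the stale value survives the upgrade.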

You can sanity-check the endpoint outside OpenClaw with curl against https://chatgpt.com/backend-api/codex/responses using a fresh OAuth bearer token. That token is the key to your ChatGPT account, so pass it through a variable rather than pasting it inline, where it lands in your shell history and the process list, and keep it out of any log or ticket you share. If curl streams a normal response but the gateway still returns 403, the mismatch is almost always persisted model metadata or an old gateway binary, not your subscription.

401 after login

A frustrating auth bug: you complete a fresh Codex OAuth login, OpenClaw reports the login succeeded, and then the very first agent request fails with a 401 token validation error (#82231). The login flow and the request path disagree about the token. When this hits, re-authenticate the Codex profile cleanly rather than retrying the same request, confirm the gateway process is the one holding the freshly minted token (a login in your shell does not always reach a background daemon), and update if you are behind, since the auth handshake here has been patched more than once.

If you maintain multiple auth profiles, make sure the agent is actually selecting the Codex one and not silently falling back to another provider. The Codex CLI itself documents device auth and login refresh flows in the Codex CLI docs; running codex login in the same environment the gateway uses is a good isolation test. If CLI login works but OpenClaw still 401s, compare the token store path the gateway reads against the one your interactive shell wrote.

Do not confuse this with the Cloudflare 403 case above. A true 401 from token validation and a 403 HTML page misread as "re-authenticate" need different fixes. Check the raw HTTP status in gateway debug logs before you loop through OAuth again.
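The distinction is mechanical once you have the raw status and body in hand. Here is a minimal sketch of that check, assuming you can read the HTTP status, Content-Type header, and response body from your debug logs or a manual request; the function name and category labels are hypothetical.

```python
def classify_codex_response(status: int, content_type: str, body: str) -> str:
    """Separate a Cloudflare HTML 403 from a genuine 401 token rejection."""
    looks_like_html = (
        "text/html" in content_type.lower()
        or body.lstrip().lower().startswith(("<!doctype html", "<html"))
    )
    if status == 403 and looks_like_html:
        # Challenge page: the request path is wrong, the token is probably fine.
        return "transport"
    if status == 401:
        # Token validation failed: re-authenticate the Codex profile.
        return "auth"
    if 200 <= status < 300:
        return "ok"
    return "other"


if __name__ == "__main__":
    print(classify_codex_response(403, "text/html; charset=UTF-8", "<!DOCTYPE html><html>..."))
    print(classify_codex_response(401, "application/json", '{"error": "invalid token"}'))
```

Only the "auth" result justifies another OAuth round trip; "transport" sends you back to the base URL checks in the Cloudflare section.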

Unknown model

A recurring shape of Codex bug is a model you can configure but cannot actually run. The clean example was openai-codex/gpt-5.4 being configurable yet failing at runtime as an unknown or unsupported model (#37623), because the runtime support for a new model lags the config schema that accepts its name. If you set a brand new model and every request dies with an unknown model error, check whether that model is actually wired into the Codex runtime in your version before assuming a typo. The fix is almost always updating to the release that added real support, and in the meantime running a model the runtime supports.

OpenClaw's model catalog and the Codex binary's supported model list do not always ship in the same release. After OpenAI ships a new Codex model, there is often a gap of one or two OpenClaw versions before the harness passes it through cleanly. Watch the OpenClaw release notes and the Codex repo's model alias table rather than guessing from the config UI alone.

Tool statuses as failures

A subtler current issue is that Codex native tool statuses surface expected checks as channel-visible OpenClaw failures (#88332). The agent is working, but normal intermediate tool states are being rendered to your chat as if they were errors, which makes a healthy run look broken. If your Codex agent appears to be throwing errors mid-task but still produces the right result, this is likely what you are seeing, and it is a display issue in the harness rather than a real failure.

Compare the final artifact (file written, message sent, cron output) against the intermediate status lines. If the outcome is correct, you can usually ignore the red status noise until the rendering fix lands, and you should still report it with a screenshot so maintainers can distinguish cosmetic surfacing from a true tool failure.

Windows and the beta harness

If you run the beta channel on Windows, the Codex harness has hit import breakage where a removed export was still imported by @openclaw/codex (#86087). This is the sort of break that only appears on a specific platform and channel combination. If you are on Windows beta and Codex will not start at all, try the stable channel before you dig into your own setup, since beta is where these land first.

Platform-specific import errors show up at gateway startup, not mid-turn. If Codex never launches, check the gateway stderr for module resolution failures before you touch OAuth or model config.

Debugging order

When Codex breaks, walk this sequence before you change unrelated settings. Confirm your OpenClaw version and whether you recently upgraded. Read openclaw logs --follow during a failing turn and note whether the failure is transport (403, DNS misreport), auth (401), harness idle timeout (turn/completed), or model catalog (unknown model). If transport, inspect models.json base URLs and api fields, then update to a build with #69336.

If auth, log in again through the gateway's Codex profile path and verify the daemon holds the token. If harness timeout, pin your version or adjust turnCompletionIdleTimeoutMs on a build that supports it. If unknown model, fall back to a known-good alias until your version catches up.

That order matters because the symptoms overlap. A Cloudflare block and a stale OAuth token both encourage you to "log in again," but only one of them fixes the 403 HTML path.
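The sequence above can be sketched as an ordered rule table, where the first matching rule wins so that a transport problem is never mistaken for an auth problem. This is a rough illustration only: the patterns are built from the error strings quoted in this article, real log lines may differ, and a genuine rate limit would also match the transport rule.

```python
import re

# Rules are checked in order; the first match decides the category.
TRIAGE_RULES = [
    (
        "transport",
        re.compile(r"\b403\b|DNS lookup for the provider endpoint failed|API rate limit reached", re.I),
        "Inspect models.json base URLs and api fields, then update to a build with #69336.",
    ),
    (
        "auth",
        re.compile(r"\b401\b", re.I),
        "Log in again through the gateway's Codex profile and verify the daemon holds the token.",
    ),
    (
        "harness",
        re.compile(r"timed out|stopped before confirming the turn was complete", re.I),
        "Pin your version or adjust turnCompletionIdleTimeoutMs on a build that supports it.",
    ),
    (
        "model",
        re.compile(r"unknown model|unsupported model", re.I),
        "Fall back to a known-good model alias until your version catches up.",
    ),
]


def triage(log_text: str) -> tuple[str, str]:
    """Return (category, next step) for the first rule that matches log_text."""
    for category, pattern, next_step in TRIAGE_RULES:
        if pattern.search(log_text):
            return category, next_step
    return "unclassified", "Confirm your OpenClaw version and collect a longer log excerpt."


if __name__ == "__main__":
    category, step = triage("provider returned 403 with an HTML body")
    print(f"{category}: {step}")
```

Putting transport ahead of auth in the table encodes the point of this section: a log excerpt that mentions both a 403 and a login prompt is routed to the base URL check first.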

The managed option

Every issue above is in the harness and OAuth layer that exists specifically so you can bring your own ChatGPT or Codex subscription. That is a real benefit when you already pay for Codex and want to run the agent on it, and if that is you, version pinning, clean re-authentication, and checking persisted models.json paths are the tools that keep it stable.

If you reached for the Codex backend mainly so the agent would have a capable model at all, there is a simpler arrangement. Operator.io runs OpenClaw with a managed frontier model included in the plan, so there is no Codex OAuth to complete, no app server harness to stall, and no subscription login to keep alive. The agent has a working model the moment you sign in. OpenClaw itself stays model-agnostic, so OpenAI, Anthropic, and Google models are all options if you self-host and want to choose; the managed path simply removes the wiring. You can try Operator free, or run your own Codex backend and use the fixes above when the harness acts up.

Frequently asked questions

Why does OpenClaw say "Codex stopped before confirming the turn was complete"?

The Codex app server starts a turn, the agent does its work, and the harness never receives the completion signal, so OpenClaw reports the stall (issue #88312). It is a regression that was fixed once and returned in 2026.5.27, so the move is version management rather than rewriting config. Pin back to the last release where Codex turns completed cleanly for you, or move forward once a fix lands, and watch the issue thread.

Why does Codex 401 right after a successful login?

The login flow and the request path disagree about the token, so a fresh Codex OAuth login reports success and the first agent request fails with a 401 (issue #82231). Re-authenticate the Codex profile cleanly rather than retrying the same request, confirm the gateway process is holding the freshly minted token rather than just your shell, and update if you are behind, since this handshake has been patched more than once.

Why is a model my config accepts failing as "Unknown model"?

Runtime support for a new model can lag the config schema that accepts its name, so you can set a model like openai-codex/gpt-5.4 and have every request die as an unknown model (issue #37623). Check whether the model is actually wired into the Codex runtime in your version before assuming a typo. The fix is almost always updating to the release that added real support, and running a supported model in the meantime.

Why do Codex requests fail with Cloudflare 403 or "DNS lookup failed" when my token is valid?

OpenClaw is often hitting a legacy ChatGPT backend path that Cloudflare blocks for non-browser traffic, and the HTML 403 gets misclassified as an auth or DNS failure (issue #66633, fixed in PR #69336). Update to a build that uses https://chatgpt.com/backend-api/codex as the base URL, confirm api is openai-codex-responses, and inspect ~/.openclaw/agents/main/agent/models.json for a stale /backend-api/v1 entry (issue #67131).