# codex-responses-api-proxy

A strict HTTP proxy that forwards only `POST` requests to `/v1/responses` to the OpenAI API (https://api.openai.com), injecting the `Authorization: Bearer $OPENAI_API_KEY` header. Everything else is rejected with `403 Forbidden`.
IMPORTANT: `codex-responses-api-proxy` is designed to be run by a privileged user with access to `OPENAI_API_KEY` so that an unprivileged user cannot inspect or tamper with the process. Note, however, that if `--http-shutdown` is specified, an unprivileged user can make a GET request to `/shutdown` to shut down the server, which matters precisely because an unprivileged user cannot send SIGTERM to kill the process.
A privileged user (i.e., root or a user with sudo) who has access to `OPENAI_API_KEY` would run the following to start the server, as `codex-responses-api-proxy` reads the auth token from stdin:
```shell
printenv OPENAI_API_KEY | env -u OPENAI_API_KEY codex-responses-api-proxy --http-shutdown --server-info /tmp/server-info.json
```
A non-privileged user would then run Codex as follows, specifying the `model_provider` dynamically:
```shell
PROXY_PORT=$(jq .port /tmp/server-info.json)
PROXY_BASE_URL="http://127.0.0.1:${PROXY_PORT}"
codex exec -c "model_providers.openai-proxy={ name = 'OpenAI Proxy', base_url = '${PROXY_BASE_URL}/v1', wire_api='responses' }" \
  -c model_provider="openai-proxy" \
  'Your prompt here'
```
When the unprivileged user is finished, they can shut down the server using `curl` (since `kill -SIGTERM` is not an option):
```shell
curl --fail --silent --show-error "${PROXY_BASE_URL}/shutdown"
```
## Design

- The API key is read from stdin. All callers should pipe the key in (for example, `printenv OPENAI_API_KEY | codex-responses-api-proxy`).
- The proxy builds the header value `Bearer <key>` and attempts to `mlock(2)` the memory holding that header so it is not swapped to disk.
- An ephemeral port is chosen if `--port` is not specified.
- The only request accepted is `POST /v1/responses` (no query string). The request body is forwarded to https://api.openai.com/v1/responses with `Authorization: Bearer <key>` set. All original request headers (except any incoming `Authorization`) are forwarded upstream, with `Host` overridden to `api.openai.com`. For other requests, it responds with `403`.
- If `--server-info <FILE>` is specified, the proxy writes a single line of JSON with `{ "port": <u16>, "pid": <u32> }` once it is listening.
- `--http-shutdown` enables `GET /shutdown` to terminate the process with exit code 0. This allows one user (e.g., root) to start the proxy and another unprivileged user on the host to shut it down.

## Usage

```
codex-responses-api-proxy [--port <PORT>] [--server-info <FILE>] [--http-shutdown] [--upstream-url <URL>]
```
- `--port <PORT>`: Port to bind on 127.0.0.1. If omitted, an ephemeral port is chosen.
- `--server-info <FILE>`: If set, the proxy writes a single line of JSON with `{ "port": <PORT>, "pid": <PID> }` once listening.
- `--http-shutdown`: If set, enables `GET /shutdown` to exit the process with code 0.
- `--upstream-url <URL>`: Absolute URL to forward requests to. Defaults to `https://api.openai.com/v1/responses`.

The proxy always sends `Authorization: Bearer <key>` to match the Codex CLI expectations. For Azure, for example (ensure your deployment accepts `Authorization: Bearer <key>`):
```shell
printenv AZURE_OPENAI_API_KEY | env -u AZURE_OPENAI_API_KEY codex-responses-api-proxy \
  --http-shutdown \
  --server-info /tmp/server-info.json \
  --upstream-url "https://YOUR_PROJECT_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT/responses?api-version=2025-04-01-preview"
```
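The routing and header rules described earlier (only `POST /v1/responses` with no query string; incoming `Authorization` dropped; `Host` overridden) can be sketched roughly as follows. This is a simplified illustration with assumed function names, not the proxy's actual code, and it models headers as plain `(name, value)` pairs:

```rust
// Returns true if the request should be forwarded upstream:
// only POST /v1/responses, with no query string.
fn is_forwardable(method: &str, path_and_query: &str) -> bool {
    method == "POST" && path_and_query == "/v1/responses"
}

// Build the upstream header set: drop any client-supplied Authorization
// (the real one is injected from the key read on stdin) and replace Host
// with the upstream's host. Everything else passes through unchanged.
fn upstream_headers(incoming: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut out: Vec<(String, String)> = incoming
        .iter()
        .filter(|(name, _)| {
            let n = name.to_ascii_lowercase();
            n != "authorization" && n != "host"
        })
        .map(|(n, v)| (n.to_string(), v.to_string()))
        .collect();
    out.push(("host".to_string(), "api.openai.com".to_string()));
    out
}

fn main() {
    assert!(is_forwardable("POST", "/v1/responses"));
    assert!(!is_forwardable("POST", "/v1/responses?stream=true"));
    assert!(!is_forwardable("GET", "/shutdown"));

    let hdrs = upstream_headers(&[
        ("content-type", "application/json"),
        ("authorization", "Bearer from-client"),
    ]);
    assert!(hdrs.contains(&("host".to_string(), "api.openai.com".to_string())));
    assert!(!hdrs.iter().any(|(n, _)| n == "authorization"));
}
```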
- Only `POST /v1/responses` is permitted. No query strings are allowed.
- All request headers are forwarded upstream (except `Authorization` and `Host`). Response status and content-type are mirrored from upstream.

Care is taken to restrict access to, and copying of, the value of `OPENAI_API_KEY` retained in memory:
- The binary uses `codex_process_hardening` so `codex-responses-api-proxy` is run with standard process-hardening techniques.
- We allocate a 1024-byte buffer on the stack and copy `"Bearer "` into the start of the buffer.
- We read the key from `stdin`, copying the contents into the buffer after `"Bearer "`.
- If the key matches `/^[a-zA-Z0-9_-]+$/` (and does not exceed the buffer), we create a `String` from that buffer (so the data is now on the heap).
- We call `.leak()` on the `String` so we can treat its contents as a `&'static str`, as it will live for the rest of the process.
- We `mlock(2)` the memory backing the `&'static str`.
- When using the `&'static str` to build an HTTP request, we use `HeaderValue::from_static()` to avoid copying the `&str`.
- We call `.set_sensitive(true)` on the `HeaderValue`, which in theory indicates to other parts of the HTTP stack that the header should be treated with "special care" to avoid leakage.
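The buffer/validate/leak steps above can be sketched roughly as follows. This is a simplified sketch, not the proxy's actual code: the `mlock(2)`, `HeaderValue::from_static()`, and `.set_sensitive(true)` calls appear only as comments because they require the `libc` and `http` crates.

```rust
const PREFIX: &str = "Bearer ";
const BUF_LEN: usize = 1024;

/// Validate the raw key and build the leaked, 'static "Bearer <key>" value.
fn build_auth_header(key: &str) -> Option<&'static str> {
    // Keys must match /^[a-zA-Z0-9_-]+$/ and fit the fixed-size buffer.
    if key.is_empty()
        || key.len() > BUF_LEN - PREFIX.len()
        || !key
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
    {
        return None;
    }

    // Stack buffer holding "Bearer <key>".
    let mut buf = [0u8; BUF_LEN];
    buf[..PREFIX.len()].copy_from_slice(PREFIX.as_bytes());
    buf[PREFIX.len()..PREFIX.len() + key.len()].copy_from_slice(key.as_bytes());

    // Copy the used portion onto the heap, then leak it so it lives for
    // the rest of the process.
    let header = String::from_utf8(buf[..PREFIX.len() + key.len()].to_vec()).ok()?;
    let header: &'static str = header.leak();

    // In the real proxy (sketched here as comments):
    //   libc::mlock(header.as_ptr().cast(), header.len());   // pin memory
    //   let mut hv = http::HeaderValue::from_static(header); // no copy
    //   hv.set_sensitive(true);                              // mark sensitive
    Some(header)
}

fn main() {
    assert_eq!(build_auth_header("sk-abc_123").unwrap(), "Bearer sk-abc_123");
    assert!(build_auth_header("bad key!").is_none());
    assert!(build_auth_header("").is_none());
}
```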