Codex CLI Playbook
Codex CLI in Real Life: How Engineers Actually Use It All Day
Codex CLI isn’t just “ChatGPT in the terminal”. It’s a coding agent that can read your repo,
run commands, ship features, and plug into CI and MCP tools. This issue breaks down how
teams actually use it to get work done, not just to demo toy projects.
- Opinionated setup that works for real repos (Git, sandbox, approvals).
- Patterns for refactors, debugging, and code review you can copy-paste.
- Advanced tricks: CODEX_HOME, fallback docs, profiles, MCP sub-agents, CI JSON output.
1. What Codex CLI actually is (and why it’s different from “just ChatGPT”)
Codex CLI runs locally in your terminal. It can read files in your repo, run commands,
make edits, and keep a conversation going about your codebase. You can use it in:
- Interactive mode: codex (TUI) – feels like pair programming.
- Non-interactive mode: codex exec – perfect for scripts and CI.
# Install (npm example)
npm i -g @openai/codex
# Start interactive session in your repo
cd ~/projects/your-app
codex
# Example prompt once you're in:
# “Refactor the billing service to use the new payment client
# and update tests accordingly.”
The win: instead of copy-pasting code into a browser, Codex has direct access to your
tree, your tests, and your tools.
2. Your first 10 minutes: a “no-surprises” setup
The easiest way to fall in love with Codex is to give it a safe, well-lit playground:
a Git repo, clear instructions, and approval rules so it never YOLOs your prod config.
- Work in a Git repo: Codex expects this. Run git init first.
- Let Codex create an AGENTS.md with /init.
- Set approvals and sandbox so you control what it can do.
cd ~/projects/your-app
git init
# Start Codex TUI
codex
# Inside Codex, run:
/init # generate AGENTS.md with project context
/status # see current model, sandbox, approvals
/approvals # pick when Codex must ask before running commands
As soon as AGENTS.md exists, Codex reads it before doing work. That’s where
we’ll teach it how your team actually likes to ship code.
3. Three “real” workflows you can steal today
a) Ship a small feature end-to-end
For small features, you can let Codex drive most of the process while you approve each step:
codex
# Example prompt:
# “Add a ‘Download invoice as PDF’ button to the billing page.
# Wire it to our existing /invoices/:id/pdf endpoint and add tests.”
Let Codex propose file edits, test commands, and doc updates. Use /review in the TUI
to have a second Codex agent audit the changes before you commit.
b) Large refactor without getting lost
For refactors, Codex is best as a navigator, not a bulldozer. Have it map the work, then tackle
the plan step by step:
codex
# Step 1 – plan only:
# “Scan this repo and list concrete steps to extract the email-sending
# logic into a reusable module with tests. Don’t edit anything yet.”
# Step 2 – work through the plan:
# “Apply step 1 of your plan. Stop after running tests and show me the diff.”
c) Debug a failing test and keep a trail
Point Codex at failing tests and let it iteratively run them, inspect logs, and propose fixes:
codex
# “Run the Python test suite, find failing tests, and fix them.
# Narrate what changed and why before editing any files.”
4. Teaching Codex how your team actually works (AGENTS.md + fallback docs)
Codex reads a chain of instruction files before doing work: a global file in your Codex home
directory, then project-level AGENTS.md files as it walks the repo tree. You can take
advantage of that to “bake in” your team’s habits.
# ~/.codex/AGENTS.md (global defaults for all repos)
## Working agreements
- Always run tests before proposing a PR-ready diff.
- Prefer pnpm over npm when installing dependencies.
- Ask before adding new production dependencies.
In your repo, keep AGENTS.md focused on “how to ship here” instead of generic AI tips:
what to run, what to never touch, where domain docs live, etc.
# ./AGENTS.md
## Repo expectations
- Use `make test-api` before opening a backend PR.
- Any customer-visible change must update `docs/changelog.md`.
- For secrets, use our `secrets/` tooling, never .env files in git.
Hidden gem: if your repo already uses other filenames (TEAM_GUIDE.md, CLAUDE.md, etc.),
you don’t have to rename them. Add them as fallbacks so Codex treats them as instructions
in any directory that has no AGENTS.md of its own:
# ~/.codex/config.toml
project_doc_fallback_filenames = ["TEAM_GUIDE.md", "CONTRIBUTING.md", "CLAUDE.md"]
project_doc_max_bytes = 65536
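To make the lookup order concrete, here is a minimal Python sketch of the instruction-file chain described above. It is an approximation for illustration, not Codex's actual implementation: the function names are hypothetical, it assumes at most one file is used per directory (AGENTS.md first, then the fallbacks in listed order), and it leaves out the global file in the Codex home directory.

```python
from pathlib import Path

# Values mirror the config.toml example; the lookup rules are an assumption.
FALLBACKS = ["TEAM_GUIDE.md", "CONTRIBUTING.md", "CLAUDE.md"]
MAX_BYTES = 65536


def discover_instruction_files(repo_root, cwd, fallbacks=FALLBACKS):
    """Return at most one instruction file per directory, repo root first."""
    repo_root, cwd = Path(repo_root).resolve(), Path(cwd).resolve()
    # Build the list of directories from the repo root down to cwd.
    dirs = [repo_root]
    for part in cwd.relative_to(repo_root).parts:
        dirs.append(dirs[-1] / part)

    found = []
    for directory in dirs:
        for name in ["AGENTS.md", *fallbacks]:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
                break  # first match wins for this directory
    return found


def load_instructions(files, max_bytes=MAX_BYTES):
    """Concatenate files in order, truncating at the byte budget."""
    chunks, used = [], 0
    for path in files:
        data = path.read_bytes()[: max_bytes - used]
        chunks.append(data.decode("utf-8", errors="ignore"))
        used += len(data)
        if used >= max_bytes:
            break
    return "\n\n".join(chunks)
```

Under these assumptions, a TEAM_GUIDE.md sitting next to an AGENTS.md is skipped, which is worth checking before you rely on a fallback file.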
5. Guardrails that make Codex safe to trust on real codebases
Two knobs control how “free” Codex is: approval mode and sandbox level. Set them once in
config.toml so you’re never surprised during a session.
# ~/.codex/config.toml
model = "gpt-5-codex"
# When does Codex pause to ask?
approval_policy = "untrusted" # options: on-request, on-failure, untrusted, never
# What can Codex touch on disk / network?
sandbox_mode = "workspace-write" # read-only | workspace-write | danger-full-access
[shell_environment_policy]
inherit = "core" # basic env only
exclude = ["AWS_*", "AZURE_*", "GCP_*", "*_SECRET", "*_TOKEN"]
This pattern keeps cloud credentials and secrets out of Codex-spawned commands by default,
while still letting it use your local PATH and HOME.
Git is another safety net: Codex expects to run inside a repository so you can review diffs,
revert, or branch before merging. For scratch directories, you can bypass the check:
# Normal (recommended) usage
git init
codex exec --full-auto \
"Create a minimal FastAPI service with tests for /health and /metrics."
# Temporary sandbox where you don't care about history
codex exec --skip-git-repo-check \
"Prototype an idea in this folder; it's throwaway."
6. Giving Codex “hands” with MCP tools
Model Context Protocol (MCP) servers let Codex call out to tools: docs search, browser automation,
Figma, issue trackers, internal APIs, and more. You wire them up in config.toml:
# ~/.codex/config.toml
[mcp_servers.docs]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
[mcp_servers.browser]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-browser"] # confirm this package name on npm, or substitute the browser MCP server you actually use
Once configured, Codex can decide when to call these tools (for example, search your docs
instead of guessing an internal API). In practice, this is what turns Codex from “great coder”
into “engineer that actually reads the manual”.
7. Codex as a CI co-worker (codex exec, JSON, and schemas)
Everyone demos Codex in the TUI. The quiet superpower is codex exec: a non-interactive
mode designed for automation and CI. You give it a task; it runs to completion without asking
questions (unless you configure otherwise).
# Basic exec – runs in read-only mode by default
codex exec "Review the last commit and list any potential performance issues."
# Allow edits (still sandboxed to the workspace)
codex exec --full-auto \
"Run tests and fix any obvious failing cases. Keep changes minimal."
When you need machine-readable output (for GitHub Actions, GitLab, Jenkins), use JSON modes:
# Stream full event log as JSONL (good for tracing & dashboards)
codex exec --json \
"Review open PRs and summarize risk level for each."
# Enforce structure with JSON Schema + write final JSON to a file
codex exec \
"Score open PRs by risk (1-10) and flag migrations." \
--output-schema .github/codex/pr-risk.schema.json \
-o .github/codex/pr-risk.json
In CI, your job only needs to parse pr-risk.json and decide whether to fail the pipeline
or ping reviewers. Codex handles reading diffs, understanding impact, and writing structured output.
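As a rough illustration of that last step, the sketch below reads pr-risk.json and turns it into a pass/fail decision. The report shape is an assumption, since the newsletter does not show the schema: a top-level object with a `pull_requests` list whose entries carry hypothetical `number`, `risk`, and `has_migration` fields. The threshold is likewise made up.

```python
import json
import sys
from pathlib import Path

RISK_THRESHOLD = 7  # hypothetical cut-off; tune for your team


def evaluate(report, threshold=RISK_THRESHOLD):
    """Return reasons to block the pipeline; an empty list means pass."""
    reasons = []
    for pr in report.get("pull_requests", []):
        if pr["risk"] >= threshold:
            reasons.append(f"PR #{pr['number']}: risk {pr['risk']} >= {threshold}")
        if pr.get("has_migration", False):
            reasons.append(f"PR #{pr['number']}: contains a migration")
    return reasons


def main(path=".github/codex/pr-risk.json"):
    """Read the report written by `codex exec -o` and return an exit code."""
    report = json.loads(Path(path).read_text(encoding="utf-8"))
    reasons = evaluate(report)
    for reason in reasons:
        print(reason, file=sys.stderr)
    return 1 if reasons else 0
```

A CI step could end with `sys.exit(main())` so that a non-zero code fails the job. Keeping `evaluate` separate from the file handling makes the policy easy to unit test.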
8. Advanced Codex tricks most people never discover
The docs cover the basics. Below are patterns from real teams that almost never make it into
glossy screenshots — but they’re where the leverage is.
a) “One Codex for me, one for the bots” with CODEX_HOME
Codex stores config and history in a “home” directory (~/.codex by default). Set
CODEX_HOME to point Codex at a completely different brain: separate config, history,
MCP servers, AGENTS, everything – perfect for project bots or CI.
# Personal, interactive sessions (default)
export CODEX_HOME="$HOME/.codex"
codex
# Project-local automation user
export CODEX_HOME="$PWD/.codex-bot"
codex exec \
"Run the full release checklist and update CHANGELOG.md with today's date."
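Teams often check .codex-bot/ into a separate “ops” repo to keep bot behavior visible
in PRs: AGENTS files, MCP configs, even sample prompts. Keep auth.json and session history
out of version control, since the Codex home directory can also hold credentials.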
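If a script launches the bot, you can set CODEX_HOME for that one process instead of exporting it in your shell. This is a small sketch with hypothetical helper names; the only thing it relies on from Codex is the CODEX_HOME variable and the `codex exec` command shown above.

```python
import os
from pathlib import Path
from typing import Optional


def codex_env(codex_home, base: Optional[dict] = None) -> dict:
    """Copy an environment and point CODEX_HOME at a dedicated directory."""
    env = dict(os.environ if base is None else base)
    env["CODEX_HOME"] = str(Path(codex_home))
    return env


def bot_command(prompt: str) -> list:
    """Build the argv for a non-interactive run (nothing is executed here)."""
    return ["codex", "exec", prompt]


# Example wiring (requires Codex to be installed, so left as a comment):
#   import subprocess
#   subprocess.run(bot_command("Run the full release checklist."),
#                  env=codex_env(".codex-bot"), check=True)
```

Because the override lives in the child's environment only, your own interactive sessions keep using ~/.codex.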
b) Turn existing docs into invisible agents
Established teams already have guidance files: TEAM_GUIDE.md, ONCALL.md,
ARCHITECTURE.md. Instead of rewriting them into AGENTS.md, add them as
project_doc_fallback_filenames so Codex treats them as instructions in directories
that have no AGENTS.md. Order matters: Codex uses the first matching name it finds in each directory.
# ~/.codex/config.toml
project_doc_fallback_filenames = [
"TEAM_GUIDE.md",
"ONCALL.md",
"ARCHITECTURE.md"
]
Now those markdown files silently steer Codex whenever you work in that part of the tree —
no extra prompting needed.
c) Profiles + sub-agents for review, security, and debugging
Instead of one “mega Codex”, define multiple profiles tuned for specific jobs – a strict reviewer,
a cautious security checker, a fast debugger – and call them via CLI or an MCP sub-agent server.
# ~/.codex/config.toml
[profiles.review]
model = "gpt-5"
model_reasoning_effort = "high"
approval_policy = "never"
sandbox_mode = "read-only"
[profiles.security]
model = "o3"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
# Run a review-only agent
codex exec --profile review \
"Review the last commit for API misuse and edge cases."
# Run a security-focused pass
codex exec --profile security \
"Scan for insecure temp file usage and shell injection issues."
Tools like codex-subagents-mcp and other MCP servers wrap these profiles so a
higher-level agent can call delegate(agent="review", task="...") and spin them up
on demand, each in an isolated temp workdir.
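The delegation idea can be sketched in a few lines of Python. This is not how codex-subagents-mcp is implemented; it is a hypothetical `delegate` helper that maps an agent name to one of the profiles above and prepares a fresh temp workdir. It only builds the command and does not run it. `--skip-git-repo-check` is included because a temp directory is not a Git repo.

```python
import tempfile
from dataclasses import dataclass

# Hypothetical mapping from sub-agent name to a profile in config.toml.
AGENT_PROFILES = {"review": "review", "security": "security"}


@dataclass(frozen=True)
class Delegation:
    argv: list    # command line to execute
    workdir: str  # isolated directory to run it in


def delegate(agent: str, task: str) -> Delegation:
    """Prepare a `codex exec --profile ...` invocation for a named sub-agent."""
    if agent not in AGENT_PROFILES:
        raise ValueError(f"unknown agent: {agent!r}")
    workdir = tempfile.mkdtemp(prefix=f"codex-{agent}-")
    argv = [
        "codex", "exec",
        "--profile", AGENT_PROFILES[agent],
        "--skip-git-repo-check",
        task,
    ]
    return Delegation(argv=argv, workdir=workdir)
```

A caller would pass `argv` and `cwd=workdir` to `subprocess.run`, after copying whatever files the sub-agent needs into the workdir.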
d) One interface, many model providers
Codex isn’t locked to a single backend. With model_providers, you can talk to OpenAI,
Ollama, Mistral, Azure, or internal gateways from the same CLI, while keeping all your AGENTS and
workflows intact.
# ~/.codex/config.toml
model = "mistral"
model_provider = "ollama"
[model_providers.ollama]
name = "Ollama local"
base_url = "http://localhost:11434/v1"
Swap models without changing how your team talks to Codex. The prompts, AGENTS files, and
MCP setup stay the same.
e) Hardened shells by default
For teams in regulated environments, you can go beyond the defaults and make the shell
environment “zero trust” by default:
[shell_environment_policy]
inherit = "none"
set = { PATH = "/usr/bin", APP_ENV = "ci" }
exclude = ["*_API_KEY", "*_SECRET", "*_TOKEN"]
include_only = ["PATH", "APP_ENV"]
From there, opt specific MCP servers or profiles into extra env vars if they truly need them.
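To see how those keys combine, here is a Python sketch of the filtering order as I understand it from the Codex configuration docs: inherit, then built-in excludes, then exclude, then set, then include_only. Treat the order, the "core" variable list, and the default exclude patterns as assumptions to verify against your Codex version. `build_env` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Assumed contents of the "core" set and the built-in exclude patterns.
CORE_VARS = {"HOME", "LOGNAME", "PATH", "SHELL", "USER", "USERNAME",
             "TMPDIR", "TEMP", "TMP"}
DEFAULT_EXCLUDES = ["*KEY*", "*SECRET*", "*TOKEN*"]


def _matches(name, patterns):
    """Case-insensitive glob match of a variable name against patterns."""
    return any(fnmatchcase(name.upper(), p.upper()) for p in patterns)


def build_env(parent, inherit="all", exclude=(), set_vars=None,
              include_only=(), ignore_default_excludes=False):
    """Apply a shell_environment_policy-style filter to a parent env."""
    if inherit == "all":
        env = dict(parent)
    elif inherit == "core":
        env = {k: v for k, v in parent.items() if k in CORE_VARS}
    elif inherit == "none":
        env = {}
    else:
        raise ValueError(f"unknown inherit mode: {inherit!r}")

    if not ignore_default_excludes:
        env = {k: v for k, v in env.items() if not _matches(k, DEFAULT_EXCLUDES)}
    if exclude:
        env = {k: v for k, v in env.items() if not _matches(k, exclude)}
    env.update(set_vars or {})  # explicit values override inherited ones
    if include_only:
        env = {k: v for k, v in env.items() if _matches(k, include_only)}
    return env
```

With the hardened policy above, the result contains only PATH and APP_ENV no matter what the parent shell exported. With inherit = "none" the exclude list has nothing left to filter, so it acts as a safety net in case someone later loosens inherit.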
9. Codex CLI checklist you can run this week
- ✅ Install Codex CLI and run it in at least one real repo.
- ✅ Create ~/.codex/AGENTS.md with your personal defaults.
- ✅ Add one project-level AGENTS.md, and list any existing TEAM_GUIDE.md in project_doc_fallback_filenames for directories that have no AGENTS.md.
- ✅ Set approval_policy, sandbox_mode, and a sensible shell_environment_policy.
- ✅ Add a single MCP server (docs or browser) and verify Codex uses it.
- ✅ Wire one codex exec --output-schema job into CI and fail the build based on its JSON output.
- ✅ Define at least one extra profile (e.g. review) and try running it on your last commit.
If you do those seven things, Codex stops being a cool demo and starts feeling like a real
teammate that lives in your repos, your docs, and your pipelines.
Hit reply and tell me: what’s the first workflow you’re going to automate with Codex
– refactors, reviews, or release checklists?