
OpenAI Agent Platform: The Complete Technical Deep-Dive


From DevDay 2025: How AgentKit, the Responses API, and MCP are reshaping production AI agent development

⚡ TL;DR for Busy Engineers

  • AgentKit launched at DevDay 2025 — complete toolkit for building production-grade AI agents
  • Agent Builder provides a visual drag-and-drop canvas (think n8n meets LLM orchestration)
  • Responses API unifies Chat Completions + Assistants API with built-in web search, file search, and computer use
  • MCP support enables connecting any Model Context Protocol server to your agents
  • Real results: Ramp built a buyer agent in hours; Klarna handles ⅔ of support tickets with agents
  • Scale: 800M weekly ChatGPT users, 4M developers, 6B tokens/minute processed

At DevDay 2025 on October 6th, OpenAI made their strategic pivot crystal clear: they're no longer just a model provider — they're building a complete AI platform. The centerpiece? AgentKit, a comprehensive toolkit that tackles the hardest problem in production AI: turning powerful LLM capabilities into reliable, deployable agent systems.

For those of us who've spent months wrestling with prompt iteration, custom orchestration logic, and cobbled-together evaluation pipelines, this is the infrastructure we've been waiting for.

800M Weekly Active Users
4M Active Developers
6B Tokens/Minute

The Problem AgentKit Solves

Building production agents has historically meant juggling a fragmented mess of tools — complex orchestration without versioning, custom connectors, manual evaluation pipelines, endless prompt tuning, and weeks of frontend work before you can even show something to users.

Sam Altman put it simply during the keynote: "This is all the stuff that we wished we had when we were trying to build our first agents."

AgentKit consolidates this chaos into four integrated components:

🔧 Agent Builder

A visual drag-and-drop canvas for designing agent workflows. Built on the Responses API, it supports preview runs, inline evaluation configuration, and full versioning. Start from templates (customer service bots, data enrichment, Q&A agents) or build from scratch. Think of it as Canva for building agents.

🔌 Connector Registry

Centralized management for how data and tools connect across OpenAI products. Admins can securely connect agents to internal tools and third-party systems through a control panel while maintaining security governance. Supports MCP servers out of the box.

💬 ChatKit

Embeddable chat interfaces that feel native to your product. Handles streaming responses, thread management, model thinking visualization, and in-chat experiences. Canva reported saving over two weeks of development time using ChatKit for their developer support agent.

📊 Enhanced Evaluations

Trace grading for end-to-end assessment of agentic workflows, automated prompt optimization based on human annotations, datasets for systematic testing, and even support for evaluating third-party models within the OpenAI Evals platform.

Technical Architecture: The Responses API

At the foundation of AgentKit sits the Responses API — a new primitive that combines the simplicity of Chat Completions with the tool-use capabilities of the Assistants API. This is the critical infrastructure piece that makes everything else possible.

What's Different?

The Responses API provides a unified design with several key improvements over Chat Completions:

  • Built-in tools: web search, file search, computer use, and code interpreter
  • Architecture: unified item-based design with simpler polymorphism
  • Streaming: intuitive streaming events with SDK helpers
  • Data storage: optional storage on OpenAI for tracing and evaluation
  • Multi-turn: a single API call can handle multiple tool and model turns
```python
# Simple Responses API call with web search
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    tools=[{"type": "web_search"}],
    input="What are the latest developments in quantum computing?",
)

# Access text output directly with the SDK helper
print(response.output_text)
```

⚠️ Migration Note

OpenAI has indicated that new models will likely focus on supporting the Responses API, with many new features exclusively available there. If you're building new agent systems, start with Responses API rather than Chat Completions.
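One practical consequence of the unified design: multi-turn state can be threaded server-side via `previous_response_id` instead of resending the full transcript each turn. Here is a sketch of the request shapes, written as plain dicts so no API call is made (the response id is a placeholder):

```python
# Request payloads for a two-turn Responses API conversation.
# Turn 1: an ordinary request.
turn_1 = {
    "model": "gpt-5",
    "input": "What are the latest developments in quantum computing?",
}

# Turn 2: reference the prior response instead of resending history.
turn_2 = {
    "model": "gpt-5",
    "previous_response_id": "resp_abc123",  # id returned by turn 1 (placeholder)
    "input": "Summarize that in two sentences.",
}

# Each dict would be passed as kwargs to client.responses.create(**turn_n).
print(sorted(turn_2))
```

With Chat Completions you would instead append every prior message to each request; here the server holds the state, which is what makes optional storage, tracing, and evaluation possible downstream.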

The Agents SDK: Open Source Orchestration

While Agent Builder provides visual orchestration, the Agents SDK (open source on GitHub) gives you programmatic control. It's the production-ready evolution of Swarm, OpenAI's experimental orchestration framework.

The SDK is built around four core primitives:

Agents

LLMs configured with instructions, tools, guardrails, and handoff capabilities. Any Python function can become a tool with automatic schema generation via Pydantic.

Handoffs

Specialized tool calls for transferring control between agents. The SDK manages transitions automatically based on model decisions.

Guardrails

Configurable safety checks for input/output validation. Can mask or flag PII, detect jailbreaks, and apply custom safeguards. Deployable standalone or via the guardrails library.
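The masking behavior can be sketched in plain Python, independent of the SDK (the regex, placeholder, and function name here are illustrative, not the guardrails library's API):

```python
import re

# Illustrative PII-masking guardrail: replace email addresses with a
# placeholder before the text reaches the model or the user.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_pii(text: str) -> str:
    """Return the input with email addresses masked."""
    return EMAIL_RE.sub("[EMAIL]", text)

print(mask_pii("Reach me at jane.doe@example.com for details."))
# → Reach me at [EMAIL] for details.
```

A real guardrail would run checks like this on both agent input and output, and either mask the match (as here) or flag the run for review.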

Sessions

Automatic conversation history management across agent runs — no more manually handling .to_input_list() between turns.
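To see what Sessions automates, here is the manual history-threading boilerplate it replaces, sketched in plain Python (`call_model` is a stub standing in for an actual model call):

```python
# Manual conversation-history management — the boilerplate Sessions removes.
def call_model(history):
    # Stub: echo the latest user message instead of calling a real model.
    return f"reply to: {history[-1]['content']}"

history = []
for user_msg in ["What is MCP?", "How do I connect a server?"]:
    history.append({"role": "user", "content": user_msg})
    reply = call_model(history)  # every turn must resend the full history
    history.append({"role": "assistant", "content": reply})

print(len(history))  # → 4 (two user turns, two assistant turns)
```

With Sessions, the SDK persists and replays this history for you across agent runs, so multi-turn code reads like single-turn code.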

```python
from agents import Agent, Runner

# Define an agent with instructions and tools
agent = Agent(
    name="Research Assistant",
    instructions="""You are a helpful research assistant.
    Use web search for current information. Cite your sources.""",
    tools=[web_search_tool, file_search_tool],
)

# Run synchronously
result = Runner.run_sync(
    agent,
    "What are the key findings from the latest IPCC report?",
)
print(result.final_output)
```

Agent Loop Architecture

The SDK manages an iterative agent loop that continues until a final output is produced:

  1. Agent receives input and attempts to respond
  2. If the model returns tool calls or handoff requests, those are executed
  3. Results are appended to message history
  4. Loop continues until "final output" signal (no more tool calls/handoffs)

This Python-first design philosophy means you control flow using native constructs — loops, conditionals, function calls — rather than complex DSLs.
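The four steps above can be sketched as an ordinary Python loop (the model and tool functions are stubs illustrating the control flow, not the SDK's internals):

```python
# Simplified agent loop: run until the model stops requesting tools.
def fake_model(messages):
    # Stub: request a tool once, then produce a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": ("web_search", "IPCC report")}
    return {"final_output": "Summary of findings."}

def run_tool(name, arg):
    return f"results for {arg}"  # stub tool execution

def agent_loop(user_input):
    messages = [{"role": "user", "content": user_input}]
    while True:
        step = fake_model(messages)            # 1. agent attempts to respond
        if "tool_call" in step:                # 2. execute requested tool calls
            name, arg = step["tool_call"]
            messages.append({"role": "tool", "content": run_tool(name, arg)})  # 3. append results
            continue                           # 4. loop until final output
        return step["final_output"]

print(agent_loop("What are the key findings from the latest IPCC report?"))
```

Because the loop is just Python, you can wrap it in retries, branch on intermediate results, or hand off to another agent with ordinary control flow.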

MCP Integration: The Connector Standard

Model Context Protocol (MCP) support is perhaps the most strategically important feature of AgentKit. MCP is an open specification for connecting LLM clients to external tools and resources, and OpenAI's adoption signals industry convergence on a standard.

How It Works

The Agents SDK supports multiple MCP transports:

  • Hosted MCP: zero infrastructure, with full tool execution in OpenAI's cloud (runs on OpenAI infrastructure)
  • Streamable HTTP: self-managed servers with low latency (runs on your infrastructure)
  • SSE transport: Server-Sent Events for streaming (runs on your infrastructure)
  • STDIO: local servers launched via command (runs on your local machine)
```python
from agents import Agent
from agents.mcp import HostedMCPTool
import os

# Connect to a hosted MCP server for Google Calendar
agent = Agent(
    name="Calendar Assistant",
    tools=[
        HostedMCPTool(
            tool_config={
                "type": "mcp",
                "server_label": "google_calendar",
                "connector_id": "connector_googlecalendar",
                "authorization": os.environ["GOOGLE_CALENDAR_AUTH"],
                "require_approval": "never",
            }
        )
    ],
)
```

Why This Matters: With MCP, you can connect your agents to any of the hundreds of existing MCP servers — filesystem access, database queries, CRM integrations, Figma, Slack, and more — without writing custom integration code. The ecosystem is growing rapidly.
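For the STDIO transport, a server is just a local command the SDK launches and talks to over stdin/stdout. A minimal config sketch (the filesystem server package name follows MCP community convention; treat the command, args, and path as placeholders):

```python
# Launch parameters for a local MCP server over STDIO.
stdio_params = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/agent-data"],
}

# With the Agents SDK, a dict like this is typically passed to an MCP
# server class (e.g. agents.mcp.MCPServerStdio(params=stdio_params)),
# which then exposes the server's tools to the agent.
print(stdio_params["command"])
```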

Real-World Production Results

The most compelling evidence for AgentKit comes from production deployments:

🛒

Klarna — Customer Support

Built a support agent that now handles two-thirds of all customer tickets. Scaled from prototype to production handling millions of interactions.

💼

Clay — Sales Automation

Sales agent deployment led to 10x growth. Automated prospect research, enrichment, and outreach sequencing.

💳

Ramp — Buyer Agent

Went from blank canvas to functional buyer agent in just a few hours. Previously would have taken months of custom orchestration.

🛒

Albertsons — Retail Analytics

Agent analyzes sales patterns across 2,000+ stores and 37M weekly shoppers. When ice cream sales dropped 32%, the agent automatically analyzed seasonality, historical trends, and external factors to recommend display and advertising adjustments.

"Agent Builder transformed what once took months of complex orchestration, custom code, and manual optimizations into just a couple of hours. The visual canvas keeps product, legal, and engineering on the same page, slashing iteration cycles by 70%." — Ramp Engineering Team

Reinforcement Fine-Tuning for Agents

OpenAI also announced expanded capabilities for Reinforcement Fine-Tuning (RFT) specifically designed for agent workloads:

  • Custom tool calls — Train models to call the right tools at the right time for better reasoning chains
  • Available models — Generally available on o4-mini, private beta for GPT-5
  • Use case — When you need agents to make better tool selection decisions for your specific domain

This is particularly valuable when your agents need to choose between many possible tools and the selection logic is domain-specific enough that general prompting doesn't capture it well.

The Bigger Picture: Platform vs. Provider

DevDay 2025 revealed OpenAI's three-layer platform strategy:

  • Foundation models (GPT-5 Pro, GPT-5-Codex, Sora 2, efficiency variants): specialized intelligence for different workloads
  • Developer infrastructure (AgentKit, Apps SDK, ChatKit, Connector Registry): orchestration without custom infrastructure
  • Distribution (ChatGPT's 800M users, the App Store, Instant Checkout): a captive audience for apps, agents, and commerce

This is the iOS playbook: control intelligence (Apple controlled chips/OS), provide developer tools (Xcode), own distribution (App Store). The result is a flywheel where each layer reinforces the others.

🎯 Strategic Implication

OpenAI is betting that portability matters less than distribution. Apps may technically work on Claude or Gemini, but 800M ChatGPT users create gravitational pull. Developers will build for the biggest audience first — and may never multi-home.

Practical Getting Started Guide

Ready to build? Here's the recommended path:

1. Choose Your Interface

  • Agent Builder (No-Code): Best for rapid prototyping, non-technical stakeholders, and straightforward workflows. Access at platform.openai.com/agent-builder
  • Agents SDK (Code-First): Best for complex orchestration, custom logic, and production systems. Available in Python and TypeScript/Node.js.

2. Install the SDK

```shell
# Python
pip install openai-agents

# Node.js
npm install @openai/agents
```

3. Start Simple

```python
from agents import Agent, Runner

# Minimal agent with built-in web search
agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant with web access.",
    tools=[{"type": "web_search"}],
)

# Run it
result = Runner.run_sync(agent, "Your query here")
```

4. Add Complexity Incrementally

  • Add custom tools (any Python function works)
  • Implement guardrails for safety
  • Set up handoffs for multi-agent workflows
  • Connect MCP servers for external integrations
  • Use tracing for debugging and evaluation
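The "any Python function works" step rests on schema generation from type hints. A stdlib-only sketch of the idea (the real SDK derives richer schemas via Pydantic; the function and type map here are illustrative):

```python
import inspect
from typing import get_type_hints

# Map Python annotations to JSON-schema type names (illustrative subset).
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn):
    """Derive a JSON-schema-like tool description from a function signature."""
    hints = get_type_hints(fn)
    params = {
        name: {"type": TYPE_MAP.get(hints.get(name), "string")}
        for name in inspect.signature(fn).parameters
    }
    return {"name": fn.__name__, "description": fn.__doc__, "parameters": params}

def get_weather(city: str, days: int) -> str:
    """Fetch a weather forecast."""
    return f"{days}-day forecast for {city}"

schema = tool_schema(get_weather)
print(schema["parameters"])
# → {'city': {'type': 'string'}, 'days': {'type': 'integer'}}
```

This is why decorating a plain function is enough for the model to see a callable tool: the signature and docstring already contain everything the schema needs.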

Key Takeaways for DevOps Engineers

If you're transitioning from DevOps to AI/ML engineering, here's how AgentKit fits into your skillset:

  • Infrastructure as Code → Agent as Code: The SDK's Python-first approach means you can version, test, and deploy agents using familiar CI/CD patterns
  • Monitoring → Tracing: Built-in tracing feeds into the OpenAI evaluation platform, plus integrations with Logfire, AgentOps, Braintrust for external observability
  • API Gateway → Connector Registry: Centralized management of agent-to-tool connections with admin controls
  • Container Orchestration → Agent Orchestration: Handoffs between agents mirror service-to-service communication patterns
• • •

What's Next

OpenAI is moving fast. The ChatGPT agent (launched July 2025) already combines Operator's browser interaction with deep research capabilities. The Apps SDK is creating an app store within ChatGPT. And Instant Checkout (September 2025) enables commerce directly in conversations.

The question isn't whether AI agents will become infrastructure — it's whether you'll be building on it or trying to catch up.

"This is the best time in history to be a builder." — Sam Altman, DevDay 2025

ResearchAudio.io — Daily AI research briefings for engineers transitioning from DevOps to AI/ML

The Tech newsletter for Engineers who want to stay ahead

Tech moves fast, but you're still playing catch-up?

That's exactly why 100K+ engineers working at Google, Meta, and Apple read The Code twice a week.

Here's what you get:

  • Curated tech news that shapes your career - Filtered from thousands of sources so you know what's coming 6 months early.

  • Practical resources you can use immediately - Real tutorials and tools that solve actual engineering problems.

  • Research papers and insights decoded - We break down complex tech so you understand what matters.

All delivered twice a week in just 2 short emails.
