In partnership with

Talk to your AI tools the way you'd talk to a colleague.

You don't send a colleague a three-word brief. You explain the context, the constraints, what you've already tried. But typing all that into ChatGPT takes forever — so you don't.

Wispr Flow lets you speak your prompts instead. Talk through your thinking naturally and get clean, paste-ready text. No filler words. No cleanup. Just detailed prompts that actually get you useful answers on the first try.

Millions of users worldwide. Works system-wide on Mac, Windows, and iPhone.

ResearchAudio.io

Anthropic Shipped Wall Street as Markdown Files

10 agents, 11 data connectors, no build step. The repo is the reference architecture.

Anthropic released ten finance agent templates this week. The notable part for an AI engineer is not the finance content; it is that the entire repo is plain markdown and JSON, with the same agent definition runnable as a chat plugin or as an autonomous API call.

The repo crossed 8,600 stars and 1,200 forks within roughly twenty-four hours of publication, and the structure under plugins/agent-plugins/ is already being cited as a reference architecture for vertical agents well beyond finance.

What shipped

Ten named workflow agents grouped by function. Coverage and advisory: Pitch Agent (comparables, precedents, transaction modeling, branded deck output) and Meeting Prep Agent (briefing pack before every client meeting). Research and modeling: Market Researcher, Earnings Reviewer, Model Builder. Fund admin and finance ops: Valuation Reviewer, GL Reconciler, Month-End Closer, Statement Auditor. Operations and onboarding: the Know-Your-Customer Screener.

Seven vertical plugins bundle the skills and slash commands by domain. The financial-analysis core ships comparables analysis, cash-flow valuation, leveraged-acquisition models, three-statement modeling, and Excel formula audit. Investment banking adds confidential information memorandum drafting, teasers, lists of strategic and financial acquirers, and merger-model accretion analysis. Equity research adds earnings notes, initiations, model updates, and morning meeting commentary. Private equity adds sourcing, investment committee memos, returns analysis, and value-creation plans. Wealth management adds financial planning, rebalancing, and tax-loss harvesting. Slash commands like /comps, /dcf, /lbo, /earnings, /ic-memo, /screen, and /tlh fire on intent.

Eleven Model Context Protocol connectors are centralized in the financial-analysis core plugin: FactSet, Moody's, Morningstar, S&P Global, PitchBook, the London Stock Exchange Group, Daloopa, Aiera, MT Newswires, Chronograph, and Egnyte. Moody's launched alongside the release as a Model Context Protocol app that brings credit ratings and reference data on more than 600 million public and private companies into the agent context for credit analysis and compliance work. Microsoft 365 add-ins for Excel, PowerPoint, and Word are generally available, with Outlook in beta. Production customers listed at launch include JPMorganChase, Goldman Sachs, Citi, and Visa. Walleye Capital reports that 100% of its 400-person hedge fund uses Claude Code internally.

The accuracy floor matters. Claude Opus 4.7 leads Vals AI's Finance Agent benchmark at 64.37%, which leads the industry and still implies a 35.63% failure rate. Anthropic frames the trajectory as a "staircase of autonomy" that mirrors what software engineering went through, with finance roughly six months to a year behind coding on the curve. The repo states explicitly that these agents draft analyst work product for review, and do not make investment recommendations, execute transactions, post to a ledger, or approve onboarding. The architecture is built around that constraint.

The 3-part anatomy

Each agent template packages three components.

Skills are markdown files describing workflows. They fire automatically when relevant context appears in the session. A skill file is plain frontmatter plus prose: a name, a description, and step-by-step methodology written for Claude to follow. Examples include comps-analysis, dcf-model, earnings-analysis, and ic-memo. Skills live once in plugins/vertical-plugins//skills/ and get bundled into the agents that need them via scripts/sync-agent-skills.py. Each skill is authored once per vertical and copied into each agent at sync time, with check.py validating that the bundled copies have not drifted from the vertical source.
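As a sketch, a skill file in this shape might look like the following. The frontmatter fields match the description above, but the steps and wording here are illustrative, not copied from the repo:

```markdown
---
name: comps-analysis
description: Build a comparable-companies analysis for a target ticker.
---

## Methodology

1. Select 6-10 peers by sector, size, and business model.
2. Pull LTM revenue, EBITDA, and net income for each peer from the connected data providers.
3. Compute EV/Revenue, EV/EBITDA, and P/E multiples; flag outliers.
4. Apply the peer median multiple to the target and state the implied valuation range.
5. Cite every figure back to its source document or data feed.
```

Because the whole artifact is prose, reviewing a skill change is a plain diff, not a code review.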

Connectors are Model Context Protocol servers that wire Claude to external data: research platforms, market data terminals, document stores. All eleven providers expose standard protocol endpoints centralized in the core .mcp.json manifest, so adding a new data source is a config change, not a code change. Each agent inherits the full connector pool from the financial-analysis core, then narrows tool permissions on a per-agent basis through the Managed Agent wrapper.
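A manifest in that style could look roughly like this. The field names follow common Model Context Protocol config conventions and the URLs are placeholders, so treat it as a sketch rather than the repo's actual .mcp.json:

```json
{
  "mcpServers": {
    "factset": {
      "type": "http",
      "url": "https://example-factset-mcp.internal/mcp"
    },
    "egnyte": {
      "type": "http",
      "url": "https://example-egnyte-mcp.internal/mcp"
    }
  }
}
```

Adding a twelfth provider is one more entry in this file; per-agent narrowing then happens in the Managed Agent wrapper, not here.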

Subagents are specialized Claude calls invoked by the main agent for sub-tasks like comparables selection, methodology checks, or document parsing. Anthropic notes that this delegation pattern (callable_agents) is still a research preview, but it ships in production templates today as depth-1 leaf workers. The orchestrator hands off via a handoff_request event; the leaf-worker runs to completion and returns a result the orchestrator consumes as new context.

A concrete example is the Know-Your-Customer Screener. It bundles a kyc-rules skill that spells out how Claude should apply a firm's onboarding rules to a parsed record. The skill instructs the model to assign a risk rating, check documents for completeness, cite the rule outcomes, and produce a structured JSON result that downstream corporate systems can ingest. The agent does not file the onboarding decision; it stages a recommendation for a human reviewer.
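The staged output can be pictured with a few lines of Python. The field names here (risk_rating, missing_documents, rule_citations) are assumptions for illustration; the repo defines its own schema:

```python
# Sketch: sanity-check the structured JSON a KYC run might stage for
# review. Field names are illustrative, not the repo's actual schema.
import json

REQUIRED_FIELDS = {"risk_rating", "missing_documents", "rule_citations"}
ALLOWED_RATINGS = {"low", "medium", "high"}

def validate_kyc_result(raw: str) -> dict:
    """Parse a staged KYC recommendation and check its shape."""
    result = json.loads(raw)
    missing = REQUIRED_FIELDS - result.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if result["risk_rating"] not in ALLOWED_RATINGS:
        raise ValueError(f"unknown rating: {result['risk_rating']}")
    return result

staged = ('{"risk_rating": "medium",'
          ' "missing_documents": ["proof_of_address"],'
          ' "rule_citations": ["rule-4.2"]}')
record = validate_kyc_result(staged)
```

The point of the structured shape is exactly this: a downstream system, or a reviewer's tooling, can reject a malformed recommendation before a human ever reads it.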

Anatomy of one agent template

Skills: markdown workflow files, auto-fire on context.
Connectors: protocol servers, governed access.
Subagents: callable_agents, depth-1 leaf workers.
Agent (agents/.md): one definition file plus its bundled skills.
Cowork plugin: interactive, runs in the desktop app alongside the analyst.
Managed Agent: autonomous, posts to /v1/agents with audit logs.
Same definition, two surfaces.

Source: anthropics/financial-services repo structure

Inside the repo

The directory layout makes the abstraction concrete.

plugins/
  agent-plugins/               # Named agents, one self-contained plugin each
  vertical-plugins/            # Skill and command bundles by domain
  partner-built/               # Partner-authored plugins (data providers)
managed-agent-cookbooks/       # Claude Managed Agent cookbooks, one per agent
claude-for-msft-365-install/   # Admin tooling for the Microsoft 365 add-in
scripts/                       # deploy, validate, orchestrate, sync helpers

Three observations. First, everything is text files. There is no build artifact, no compiled binary, no opaque dependency graph. Second, agents and verticals live side by side in plugins/, with managed-agent wrappers in a separate sibling directory so the autonomous deployment surface is a clean overlay rather than a fork. Third, the scripts/ folder ships the operational tooling: deploy-managed-agent.sh handles Managed Agent deployment, sync-agent-skills.py keeps bundled skill copies aligned with vertical sources, and orchestrate.py provides a reference event loop for multi-agent handoffs.
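One plausible implementation of the drift check that a script like check.py performs is a content-hash comparison between the vertical source and the bundled copy. The directory layout and function names below are assumptions, not the repo's actual code:

```python
# Sketch: detect drift between a vertical's skill files and the copies
# bundled into agents by comparing content hashes. Paths and logic are
# illustrative; the repo's check.py defines the real behavior.
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def find_drift(source_dir: Path, bundled_dir: Path) -> list[str]:
    """Return names of skills whose bundled copy differs from the source."""
    drifted = []
    for src in source_dir.glob("*.md"):
        copy = bundled_dir / src.name
        if not copy.exists() or file_digest(copy) != file_digest(src):
            drifted.append(src.name)
    return drifted
```

A check like this runs in CI in seconds, which is what makes "author once, copy at sync time" safe in practice.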

Two deployment modes from one definition

The same agents/.md file plus its bundled skills runs in two modes. As a Cowork plugin, the agent runs alongside the analyst inside their existing desktop software. The analyst can also assign tasks remotely through Cowork Dispatch, which accepts text or voice input and routes work to Claude even when the user is away from the desktop. As a Claude Managed Agent, the same template runs autonomously on the Claude Platform via /v1/agents, with agent.yaml, leaf-worker subagents, steering events, per-tool permissions, managed credential vaults, and audit logs in the Claude Console.

The deploy script (scripts/deploy-managed-agent.sh) resolves file references, uploads skills, creates the leaf-worker subagents, and posts the orchestrator to the API. Same system prompt, same skills, two execution surfaces. The cookbooks include long-running sessions designed to span a multi-hour transaction close or a nightly batch schedule, with full audit logs surfacing every tool call, prompt, and subagent handoff.

The split matters for observability. Cowork sessions are interactive: the analyst sees each step and intervenes when something looks wrong. Managed Agent sessions are autonomous: the trail of tool calls and handoff events in the Console is the sole artifact a reviewer has after the fact. The same agent definition that a user can correct in real time becomes, on the API surface, an audit-driven system with no human in the loop until the result is staged for review.

The orchestration loop

scripts/orchestrate.py is a reference event loop that routes handoff_request events between agents through your own orchestration layer. The pattern is simple: the main agent emits a handoff event with a target subagent name and a payload; the orchestrator looks up the leaf-worker, forwards the payload, awaits the result, and feeds the response back as a tool result the main agent continues from.
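The loop described above can be sketched in a few lines. The event shapes and the leaf-worker callables are stand-ins, not the protocol scripts/orchestrate.py actually implements:

```python
# Sketch of the handoff loop: route handoff_request events to leaf
# workers and collect results to feed back to the main agent. Event
# shapes are illustrative, not the repo's wire format.

def run_orchestrator(events, leaf_workers):
    """Process an event stream from the main agent.

    `events` yields dicts; `leaf_workers` maps subagent names to
    callables that accept a payload and return a result.
    """
    results = []
    for event in events:
        if event.get("type") != "handoff_request":
            continue  # ignore non-handoff events in this sketch
        worker = leaf_workers[event["target"]]    # look up the leaf worker
        result = worker(event["payload"])         # forward payload, await result
        results.append({"type": "tool_result",    # fed back as new context
                        "source": event["target"],
                        "content": result})
    return results

# Toy usage: a comps-selection leaf worker.
workers = {"comps-selector": lambda p: f"peers for {p['ticker']}: AAA, BBB"}
stream = [{"type": "handoff_request", "target": "comps-selector",
           "payload": {"ticker": "XYZ"}}]
out = run_orchestrator(stream, workers)
```

The depth-1 constraint keeps this loop flat: a leaf worker never emits its own handoff, so there is no recursion to manage.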

Steering events let an operator inject context mid-run without restarting the session. The operator can append a directive, a clarifying instruction, a new constraint, or an updated parameter, and the main agent picks it up on the next reasoning step. This is the production hook for human-in-the-loop control: the workflow keeps running, but the human can correct course at any handoff boundary without losing session state.

What this means for AI engineers

Insight 1. The repo is a portable specification, not a framework. There is no build step. Markdown plus JSON, with short config files. Anyone working on a vertical agent in a different domain (legal, medical, operations, support) can fork the structure under plugins/agent-plugins/ and replace the contents. The same pattern that ships ten finance agents on day one will ship ten legal agents the moment a firm clones the repo and rewrites the markdown.

Insight 2. The skill versus command split formalizes implicit and explicit invocation. Skills fire from context. Commands fire from intent (/comps, /dcf, /earnings, /ic-memo). Both reduce to the same skill files, just different entry points. A useful pattern for any agent surface that mixes ambient assistance with explicit user actions, and a clean separation that lets you expose the same capability to power users (commands) and casual users (ambient context).
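The two entry points reducing to the same skill can be sketched as a tiny router. The command names come from the article; the routing logic and the context trigger phrases are illustrative:

```python
# Sketch: explicit slash commands and ambient context triggers resolving
# to the same underlying skill file. Trigger phrases are made up for
# illustration; only the command names appear in the article.

COMMAND_TO_SKILL = {"/comps": "comps-analysis", "/dcf": "dcf-model",
                    "/earnings": "earnings-analysis", "/ic-memo": "ic-memo"}
CONTEXT_TRIGGERS = {"comparable companies": "comps-analysis",
                    "discounted cash flow": "dcf-model"}

def resolve_skill(user_input: str):
    """Return the skill to load: explicit command first, then ambient context."""
    token = user_input.split()[0] if user_input else ""
    if token in COMMAND_TO_SKILL:                    # explicit intent: power users
        return COMMAND_TO_SKILL[token]
    lowered = user_input.lower()
    for phrase, skill in CONTEXT_TRIGGERS.items():   # ambient: casual users
        if phrase in lowered:
            return skill
    return None
```

Both paths end at the same markdown file, so the capability is written once and only the trigger differs.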

Insight 3. The 64.37% benchmark is why the architecture is human-in-the-loop. Designing a vertical agent that bypasses review at this accuracy level is a product decision, not a technical one. Anthropic also published a separate claude-for-msft-365-install/ plugin so firms can route model traffic through their own Vertex AI, Bedrock, or internal LLM gateway, keeping audit and policy control in their tenant. This is the deployment story enterprises actually need: model access via approved cloud, identity via Microsoft Graph, observability via the firm's existing logging stack.

Insight 4. Subagent delegation via callable_agents is still a research preview, but it has shipped as a public dependency in production templates. AI engineers who need this orchestration today have a working reference for the request shape, the steering event format, and the leaf-worker pattern in scripts/orchestrate.py. Building on a research preview is a real risk, but it is also the path to shipping multi-agent orchestration before the API stabilizes.

Insight 5. Excel is the new agent surface. Three of the ten templates (Model Builder, GL Reconciler, Month-End Closer) write directly into spreadsheets. The Microsoft 365 add-in carries context across Excel, PowerPoint, Word, and Outlook, so a session that begins with a model build can end as a finished deck without re-explaining the inputs. For AI engineers, this answers a long-standing question about where complex agents actually run: not in a chat window, but inside the productivity surface the user already lives in.

Closing

Forking this repo is the fastest path to a sane vertical-agent baseline in any domain. The structure, not the finance content, is the contribution. Whether or not your agent ships into a bank, the choices Anthropic made here (file-based, two-mode, skill-versus-command, callable subagents, productivity-suite integration) are the choices any production vertical agent will end up making.


Sources: Anthropic announcement, anthropics/financial-services repo, Claude Managed Agents docs, Model Context Protocol.
