The Data Format Quietly Saving AI Companies Millions

While everyone debates prompt engineering, smart engineers discovered something far more lucrative


The $96,000 Question Nobody Asked

Last month, an AI startup processing a million customer support conversations hit a wall. Their OpenAI bill: $20,000. Not sustainable. Not even close.

The founder did what most do—tried to optimize prompts, reduce context, compress outputs. Saved maybe 10%. Still hemorrhaging cash.

Then their senior engineer asked a different question: "What if the problem isn't what we're sending, but how we're formatting it?"

Three weeks later, same traffic, same prompts. Bill: $12,000.

Annual impact: $96,000 saved. Zero functionality lost.

The Format You Never Learned in Computer Science

JSON has been the king of data interchange for 20 years. We learn it early, use it everywhere, trust it completely.

But JSON was designed for machines talking to machines. It was never optimized for the way large language models actually tokenize text.

Consider this simple user record:

```
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
```

Looks clean. Professional. Standard.

GPT-4 tokenizer: 28 tokens.

But look closer. How many tokens are actual information? The ids, names, roles—maybe 12 tokens. The rest? Curly braces. Quotation marks. Colons. Commas. Repeated field names.

You're paying for punctuation.
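To see roughly how much of a JSON payload is structure rather than data, you can count the structural characters directly. This is an illustrative sketch only: character counts are a crude proxy for tokens (use a real tokenizer for billing-accurate numbers), and repeated field names add further overhead that this simple counter treats as payload.

```typescript
// Rough illustration: how much of a JSON string is structure vs. payload.
// Characters are only a proxy for tokens; repeated keys ("id", "name", "role")
// are extra overhead that this counter still classifies as payload.
function structuralOverhead(json: string): { total: number; structural: number; ratio: number } {
  const structuralChars = new Set(["{", "}", "[", "]", '"', ":", ","]);
  let structural = 0;
  for (const ch of json) {
    if (structuralChars.has(ch)) structural++;
  }
  return { total: json.length, structural, ratio: structural / json.length };
}

const record = JSON.stringify({
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
});
const { ratio } = structuralOverhead(record);
console.log(`${(ratio * 100).toFixed(1)}% of characters are pure structure`);
```

Even on this tiny record, a large share of the characters carry no information at all.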

Enter TOON: The Format Born from Frustration

TOON (Token-Oriented Object Notation) emerged from the AI engineering community in late 2024. Not from a big tech company. Not from academia. From engineers watching their AWS bills balloon.

The insight was elegant: LLMs don't parse JSON the way browsers do. They tokenize it. So why not design a format optimized for tokenization instead of parsing?

Same data, TOON format:

```
users[2]{id,name,role}:
1,Alice,admin
2,Bob,user
```

18 tokens.

Same information. 35% fewer tokens. And here's what surprised everyone: LLMs understood it better.
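The core idea is simple enough to sketch in a few lines: for a uniform array of flat objects, declare the length and field names once, then emit one CSV-like row per record. The real encoder lives in `@toon-format/toon` and handles far more (escaping, nesting, delimiters); this hypothetical `encodeUniformArray` only illustrates the shape of the output.

```typescript
// Minimal sketch of the TOON idea for a uniform array of flat objects.
// Not the official encoder -- no escaping or nesting support, illustration only.
type Row = Record<string, string | number | boolean>;

function encodeUniformArray(key: string, rows: Row[]): string {
  if (rows.length === 0) return `${key}[0]{}:`;
  const fields = Object.keys(rows[0]);               // field names, declared once
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  const body = rows.map((row) => fields.map((f) => String(row[f])).join(","));
  return [header, ...body].join("\n");
}

const toon = encodeUniformArray("users", [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
]);
console.log(toon);
```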

The Benchmarks That Changed Minds

Talk is cheap. Engineers need proof. So the TOON team built comprehensive benchmarks: 209 questions across 4 different LLMs, testing whether models could actually understand the format.

Results:

TOON: 73.9% accuracy | 2,744 tokens
JSON: 69.7% accuracy | 4,545 tokens

Not only did TOON use 39.6% fewer tokens—it actually improved comprehension accuracy by 4.2 percentage points.

The reason? TOON's explicit structure. The [2] declares array length upfront. The {id,name,role} lists fields once. LLMs could validate data structure before processing content.
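That validation property is easy to demonstrate. A consumer (human, program, or LLM) can check the declared `[N]` count and `{fields}` list against the rows before reading any content. The parsing below is an illustrative sketch, not the official spec grammar, and it assumes no commas inside values.

```typescript
// Sketch: the explicit header makes structural validation trivial.
// [N] must match the row count; every row must match the {fields} arity.
function validateToonBlock(block: string): boolean {
  const [header, ...rows] = block.trim().split("\n");
  const match = header.match(/^(\w+)\[(\d+)\]\{([^}]*)\}:$/);
  if (!match) return false;
  const declaredRows = Number(match[2]);
  const fieldCount = match[3].split(",").length;
  if (rows.length !== declaredRows) return false;                 // row count mismatch
  return rows.every((r) => r.split(",").length === fieldCount);   // arity mismatch
}

validateToonBlock("users[2]{id,name,role}:\n1,Alice,admin\n2,Bob,user");  // valid
validateToonBlock("users[3]{id,name,role}:\n1,Alice,admin\n2,Bob,user");  // declared 3, got 2
```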

Where the Magic Really Happens

Small examples are cute. Production data is where TOON becomes transformative.

Take a time-series dataset: 60 days of analytics with views, clicks, conversions, revenue, bounce rates.

JSON: 22,250 tokens (baseline)
YAML: 17,863 tokens (19.7% savings)
TOON: 9,120 tokens (59.0% savings)

13,130 tokens saved on a single dataset.

Now imagine you're running analytics queries 1,000 times per day. That's 13 million tokens saved daily. At typical API pricing, you're looking at $4,000 per month in savings on this one workflow alone.

The Types of Data Where TOON Dominates

TOON isn't a universal solution. It has a sweet spot: uniform arrays of objects.

Think about what production AI systems actually work with:

  • Database query results – Same schema across all rows
  • API responses – Product catalogs, user lists, order histories
  • Agent logs – Consistent fields per event
  • Time-series data – Metrics, analytics, monitoring data
  • Test results – Status, duration, error messages

These structures dominate real-world AI applications. And they're exactly where TOON excels.

Benchmarks on uniform employee records (100 rows):

JSON: 126,860 tokens
TOON: 49,831 tokens
Savings: 60.7% (77,029 tokens)

When TOON Fails (And Why That Matters)

Here's what separates hype from reality: knowing the limitations.

TOON performs worse than compact JSON on deeply nested configuration files. The benchmarks don't hide this:

Nested config test:
JSON compact: 564 tokens
TOON: 631 tokens
TOON loses by 11.9%

Why? TOON's tabular format requires uniform structure. When you have irregular nesting, it falls back to YAML-like indentation—and loses its efficiency advantage.

This honesty is valuable. Use TOON for what it's designed for: uniform data at scale. Use JSON for everything else.
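That decision can be automated. A sketch of a pre-flight check, matching the advice above: route data through TOON only when it is a uniform array of flat objects, and keep JSON for everything else.

```typescript
// Pre-flight check: is this data in TOON's sweet spot?
// True only for a non-empty array of flat objects sharing one schema.
function isUniformFlatArray(value: unknown): boolean {
  if (!Array.isArray(value) || value.length === 0) return false;
  const first = value[0];
  if (typeof first !== "object" || first === null || Array.isArray(first)) return false;
  const schema = JSON.stringify(Object.keys(first).sort());
  return value.every(
    (item) =>
      typeof item === "object" && item !== null && !Array.isArray(item) &&
      JSON.stringify(Object.keys(item).sort()) === schema &&
      // every value must be a primitive (null allowed) -- no nesting
      Object.values(item).every((v) => typeof v !== "object" || v === null)
  );
}

isUniformFlatArray([{ id: 1 }, { id: 2 }]);        // TOON territory
isUniformFlatArray([{ id: 1, meta: { a: 1 } }]);   // nested: keep JSON
```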

How to Actually Implement This

Theory is interesting. Implementation is money.

TOON has production-ready libraries in TypeScript and JavaScript. Python, Go, and Rust implementations are in active development. Community versions exist for Ruby, PHP, Elixir, and more.

Quick start (no installation):

```
npx @toon-format/cli data.json --stats
```

This shows you token comparison instantly. No commitment. Just data.

In your application:

```
import { encode } from '@toon-format/toon'

const data = await fetchDatabaseResults()
const toonFormat = encode(data)

// Send to LLM
const response = await claude.messages.create({
  messages: [{
    role: 'user',
    content: `Analyze:\n\`\`\`toon\n${toonFormat}\n\`\`\``
  }]
})
```

Three lines of code. That's the barrier between current costs and 40-60% savings.

The Hidden Benefit Nobody Talks About

Cost savings get attention. But there's a second-order effect that might matter more: context window efficiency.

Modern LLMs have large context windows—128K, 200K tokens. But filling that window costs money. The question isn't "how much can we fit?" but "how much useful information can we fit?"

Example: You're building a RAG system for customer support. You need to retrieve relevant past conversations as context.

With JSON: You can fit 50 conversations in your 10K token budget.
With TOON: You can fit 125 conversations in the same budget.

2.5× more context means better recommendations, more accurate responses, higher customer satisfaction. The ROI isn't just cost savings—it's product quality.
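The budgeting math behind those numbers is straightforward. The per-item averages here (roughly 200 tokens per conversation in JSON versus 80 in TOON) are hypothetical figures consistent with the example above, not measured values:

```typescript
// Back-of-envelope context budgeting. Token-per-item averages are
// hypothetical, chosen to match the 50-vs-125 example above.
function itemsThatFit(budgetTokens: number, tokensPerItem: number): number {
  return Math.floor(budgetTokens / tokensPerItem);
}

itemsThatFit(10_000, 200); // conversations that fit as JSON
itemsThatFit(10_000, 80);  // conversations that fit as TOON
```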

Multi-Agent Systems: Where This Gets Ridiculous

If you're building multi-agent systems, TOON isn't optional—it's essential.

Consider a three-agent pipeline: one agent analyzes data, another classifies results, a third generates a report. Each handoff requires structured data.

JSON pipeline:
450 tokens per handoff × 3 = 1,350 tokens per workflow

TOON pipeline:
180 tokens per handoff × 3 = 540 tokens per workflow

810 tokens saved per workflow. Run 10,000 workflows per month? That's 8.1 million tokens. $81 saved monthly on agent communication alone.

But the real win is speed. Fewer tokens means faster processing. Your agents complete tasks quicker. You can handle more throughput with the same infrastructure.

The AI Observability Angle

Here's a use case most people miss: monitoring and logging.

AI systems generate massive amounts of logs—every agent action, every token consumed, every decision made. If you want to build intelligence around your AI (anomaly detection, performance optimization, cost attribution), you need to feed these logs back into LLMs for analysis.

Log data is perfectly uniform: timestamp, agent_id, action, tokens, latency, status. Exactly what TOON was designed for.

100 log events in JSON: 2,500 tokens
100 log events in TOON: 1,000 tokens

For real-time monitoring where you need maximum log coverage in your context window, TOON gives you 2.5× more observability data. Better anomaly detection because you can see more history.

Who Should Care About This

Not everyone needs TOON. Let's be honest about who benefits:

You should explore TOON if:

  • Your monthly LLM costs exceed $500
  • You're processing database results or API responses at scale
  • You're building multi-agent systems with high internal communication
  • You need to maximize context window utilization
  • You're working with time-series data, logs, or analytics

You can skip TOON if:

  • You're in early MVP stage with minimal volume
  • Your data is highly irregular and deeply nested
  • Your token costs are negligible (under $100/month)
  • You prioritize ecosystem compatibility over optimization

The Pragmatic Implementation Plan

You don't need to rewrite your entire stack. Start strategic:

Week 1: Audit your highest-volume API calls. Identify uniform data structures. Run the CLI tool on sample data. Calculate potential savings.

Week 2: Build a conversion wrapper. Test on 10% of traffic. Measure: token reduction, accuracy, latency. Compare costs side-by-side.

Week 3: If numbers hold, expand to 50%. Keep JSON as fallback. Monitor error rates.

Week 4: Full production if metrics improve. Document best practices. Share with team.

Total engineering time: 20-30 hours. Potential annual savings: $50K-$100K+ depending on scale.
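The Week 2 "conversion wrapper" can be sketched in a few lines: attempt TOON for uniform data, and fall back to JSON on anything irregular so nothing breaks while you measure. `encodeToon` here stands in for the real `encode()` from `@toon-format/toon`; the uniformity check and return shape are illustrative.

```typescript
// Sketch of a conversion wrapper with a JSON fallback.
// encodeToon is a stand-in for encode() from @toon-format/toon.
type Encoder = (data: unknown) => string;

function makeLlmPayload(
  data: unknown,
  encodeToon: Encoder
): { format: "toon" | "json"; body: string } {
  const uniform =
    Array.isArray(data) &&
    data.length > 0 &&
    data.every(
      (item) =>
        typeof item === "object" && item !== null && !Array.isArray(item) &&
        JSON.stringify(Object.keys(item).sort()) ===
          JSON.stringify(Object.keys(data[0]).sort())
    );
  try {
    if (uniform) return { format: "toon", body: encodeToon(data) };
  } catch {
    // Encoding failure: fall through to JSON so the request still succeeds.
  }
  return { format: "json", body: JSON.stringify(data) };
}
```

Logging which branch each request takes gives you the side-by-side token and accuracy comparison the plan calls for.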

What Makes This Different from Other Optimizations

Most token optimization is zero-sum. Shorter prompts mean less context. Compressed outputs mean less detail.

TOON is different. You lose nothing. Same data, same information, same functionality. Just encoded more efficiently.

It's not clever prompt engineering. It's not reducing quality. It's just... better tokenization.

The best optimizations are the ones users never notice.

The Broader Pattern This Reveals

TOON isn't just about saving money. It reveals something bigger: we're still thinking about AI systems with pre-AI assumptions.

JSON was designed for REST APIs in 2001. YAML for configuration files in 2002. These formats optimized for human readability and machine parsability.

LLMs don't parse—they tokenize. That's a fundamentally different constraint. TOON is what happens when you design from first principles for how modern AI actually works.

What other conventions are we following out of habit rather than necessity?

The Resources You Actually Need

Don't take my word for it. Go look at the data:

  • Official repo: github.com/toon-format/toon
  • Full spec: Read the v2.0 specification
  • Benchmarks: 209 questions, 11 datasets, 4 LLMs
  • Try it now: npx @toon-format/cli --help

Run your own numbers. Test on your data. Make the decision based on evidence, not hype.

Why This Matters Now

AI costs are only going one direction: up. Context windows are expanding. Usage is scaling. The companies that survive the next wave won't be the ones with the cleverest prompts—they'll be the ones with the best unit economics.

TOON is one piece of that puzzle. Not a silver bullet. Not a replacement for good engineering. Just a format designed for the world we actually live in.

The startup that saved $96,000? They're still using JSON for their public APIs. Still using PostgreSQL for storage. They just stopped paying for punctuation when talking to LLMs.

Sometimes the best innovations are the boring ones.


Have you tested TOON in production? Found unexpected use cases? Hit reply—I want to hear about it. Real-world data beats theory every time.

— Deep

ResearchAudio.io

AI Research Breakdowns for Builders
Practical engineering insights from the trenches
