Paul Kinlan published a blog post a couple of days ago [1] with some interesting...

verdverm · 2026-03-31T05:46:32 1774935992

My own output token ratio is 2% (a 50% savings on the expensive tokens; I include thinking in this, which is often more). My system prompt has similar tone and output-formatting content.

kinlan · 2026-04-01T18:49:13 1775069353

That's actually useful to know, and it aligns with what I see (I wrote the cost post).

weird-eye-issue · 2026-03-31T02:22:06 1774923726

Yes, but with prompt caching decreasing the cost of the input by 90%, and with output tokens not being cached and costing more, then what do you think that results in?

wongarsu · 2026-03-31T02:21:08 1774923668

However, output tokens are 5-10 times more expensive, so it ends up a lot more even on price.

weird-eye-issue · 2026-03-31T03:20:20 1774927220

Even more than that in practice once you factor in prompt caching
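To put rough numbers on this, here is a minimal sketch of the arithmetic, using only figures quoted in this thread: a 2% output token ratio, output tokens priced at 5-10x input, and a 90% discount on cached input. The function name `output_cost_share` and the `cached_fraction` parameter are made up for illustration, and treating all input as cache hits is an optimistic assumption (it also ignores any extra charge for writing to the cache).

```python
def output_cost_share(output_ratio, output_price_multiplier,
                      cached_fraction=0.0, cache_discount=0.9):
    """Fraction of total spend that goes to output tokens.

    Prices are expressed in units of the uncached input token price.
    cached_fraction is a hypothetical knob: the share of input tokens
    billed at the discounted (cached) rate.
    """
    input_ratio = 1.0 - output_ratio
    # Blend of full-price and discounted input tokens.
    effective_input_price = (1.0 - cached_fraction) \
        + cached_fraction * (1.0 - cache_discount)
    input_cost = input_ratio * effective_input_price
    output_cost = output_ratio * output_price_multiplier
    return output_cost / (input_cost + output_cost)


# 2% output tokens, output priced at 5x input, no caching: about 9% of spend.
print(round(output_cost_share(0.02, 5), 3))
# Same ratio, but every input token served from cache: about 51% of spend.
print(round(output_cost_share(0.02, 5, cached_fraction=1.0), 3))
```

Under those assumptions, a 2% share of tokens becomes roughly half of the bill once caching is factored in, and more than that at a 10x output price.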

kinlan · 2026-04-01T18:53:10 1775069590

I think we still skew back to an insanely high input token ratio when you consider agentic loops. For example, when I see the tools I use do a web fetch or a search or other tool use, it's an incredibly high number of new input tokens.
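A rough sketch of why the loop skews the ratio, assuming each step re-sends the whole conversation so far as input and each tool call appends a large result to it. The function `agentic_loop_tokens` and all token counts below are hypothetical, picked only to show the shape of the growth, not measured from any real tool.

```python
def agentic_loop_tokens(steps, system_prompt_tokens,
                        tool_result_tokens, output_tokens_per_step):
    """Return (total_input, total_output) tokens over an agentic loop.

    Assumption: every step sends the full accumulated context as input,
    then the model's output and one tool result are appended to it.
    """
    context = system_prompt_tokens
    total_input = 0
    total_output = 0
    for _ in range(steps):
        total_input += context            # whole context billed as input
        total_output += output_tokens_per_step
        context += output_tokens_per_step + tool_result_tokens
    return total_input, total_output


# Hypothetical run: 10 steps, 2,000-token system prompt,
# 5,000 tokens per tool result, 200 output tokens per step.
total_in, total_out = agentic_loop_tokens(10, 2000, 5000, 200)
print(total_in, total_out)                    # 254000 2000
print(total_out / (total_in + total_out))     # output share, under 1%
```

With these made-up numbers the output share of tokens drops below 1%, and the input total grows roughly with the square of the step count since earlier tool results are re-sent on every later step.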