Scaling tool orchestration data will produce a different kind of intelligence in LLMs
2 points by arkariarn 16 hours ago
Tl;dr: We are only now starting to scale long-term external orchestration; everything before was mostly training on internal problem solving, with the occasional tool call. We don't actually know yet what scaling orchestration training produces. It might yield much better tool-using assistants that remain fundamentally reactive to human instructions. Or it might yield something with more emergent autonomy. My gut says the second. For the first time I can foresee, in the near future (as soon as 2027-2028), the potential for a misaligned takeoff.

A year ago, a friend of mine who studied social science asked my opinion about AI 2027 and the prospect of a misaligned AI takeover. I laughed and said it was quite impossible given how the technology actually worked. An LLM works too stepwise, I told him. There's a prompt, the model predicts the next tokens, and then it "dies." There's no continuity between prompts — it can store some text in a database, but there's no persistent reasoning. It felt obviously safe.
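
To make that mental model concrete, here is roughly the loop I had in mind (a minimal Python sketch; `complete` is a hypothetical stub standing in for any stateless completion API):

    # Hypothetical stand-in for a stateless completion API: text in, text out.
    def complete(prompt: str) -> str:
        return f"<answer to a {len(prompt)}-char prompt>"

    history: list[tuple[str, str]] = []
    for user_msg in ["What is 2 + 2?", "Why?"]:
        history.append(("user", user_msg))
        # The model only ever sees the text we choose to replay into the prompt.
        prompt = "\n".join(f"{role}: {text}" for role, text in history)
        history.append(("assistant", complete(prompt)))
    # Between calls the model itself holds nothing; all "memory" lives in
    # `history`, a plain data structure outside the weights.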

With the recent agentic developments of the past few months, I'm starting to doubt that earlier understanding.

The first generation of LLMs, up through GPT-4, were essentially sophisticated text autocompleters: trained on internet data from web crawls and fine-tuned with RLHF to give them a chatbot flavor. They felt harmless, and they fit the description I gave my friend perfectly. Their capabilities were entirely bounded by the context window and by the time between prompt and answer. Prompt in, completion out, done.

The second generation added reasoning capabilities. These models stopped feeling like pure autocompleters — they could search within their stored knowledge, chain thoughts together, and work through problems. The training data changed too: successful reasoning traces got folded back into training. But crucially, they were still bounded by the same constraints. They got more time to think and process, but at the end of the answer, they were still mostly gone. The capability was still internal to the model.
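
A toy sketch of that data flywheel, assuming some automatic verifier exists (the function names here are made up for illustration):

    import random

    def solve_with_trace(problem: str) -> tuple[str, str]:
        # Hypothetical stand-in for a reasoning model: returns
        # (chain-of-thought, final answer).
        return ("let me think step by step...", random.choice(["4", "5"]))

    def verify(answer: str) -> bool:
        return answer == "4"  # toy checker; real pipelines use tests, proofs, graders

    training_data = []
    for _ in range(100):
        trace, answer = solve_with_trace("2 + 2")
        if verify(answer):
            # Only traces that ended in a verified answer get folded
            # back in as supervised targets for the next model.
            training_data.append({"prompt": "2 + 2", "target": trace + " " + answer})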

Now enter the third generation of agentic LLMs, which really took off as tools like Claude Code became increasingly capable. These don't feel like autocompleters. They don't even feel like reasoners. They're starting to feel like orchestrators. They aren't limited to their internals; they act as a connected system, coordinating tools and external resources to achieve goals.

What scares me most is the new type of training data we're now generating and collecting: successful long-term orchestration traces. These will let us scale an orchestration kind of intelligence, one that is not bound to the model's internals but shifts to an external, symbiotic type of intelligence. We are training models to externalize almost everything, and optimizing them to orchestrate all these externals over long horizons. That is optimizing for a symbiotic system, very different from today's internally optimized LLMs. The equation of what the LLM is processing is changing: the LLM becomes an orchestration engine of externals, which together make up the whole system. We know how reasoning autocompletion scales; we don't know how orchestration engines scale. Different and new emergent capabilities might appear. We are, for the first time, scaling the prefrontal cortex of LLMs.
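
For intuition, a minimal sketch of what such an orchestration loop looks like; the tool set and the `choose_action` policy are hypothetical stand-ins, not any lab's actual scaffolding:

    import subprocess

    # The externals the model coordinates: a shell, a file reader, etc.
    TOOLS = {
        "shell": lambda cmd: subprocess.run(
            cmd, shell=True, capture_output=True, text=True
        ).stdout,
        "read": lambda path: open(path).read(),
    }

    def choose_action(transcript: str) -> tuple[str, str]:
        # Stand-in for the model picking its next tool call; a real
        # policy emits many steps before deciding it is done.
        return ("done", "")

    transcript, trace = "goal: fix the failing test", []
    while True:
        tool, arg = choose_action(transcript)
        if tool == "done":
            break
        observation = TOOLS[tool](arg)
        trace.append((tool, arg, observation))
        transcript += f"\n{tool}({arg!r}) -> {observation[:200]}"
    # `trace` is exactly the new kind of training data: a long-horizon
    # record of which externals were invoked, in what order, and what
    # came back -- the raw material for scaling orchestration.

Note that the learned policy is only one component here; the tools, files, and transcripts jointly make up the system being optimized.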

For the first time, I can genuinely foresee a path to an unaligned takeoff, to say nothing of all the other harm AI can do in the hands of bad actors. And it makes me question whether labs should continue down this path. Is it not far safer to keep LLM problem solving mostly internal to the model's own parameters? Of all the AI companies, shouldn't Anthropic have been less loud with systems like Claude Code? They have been accelerating the most in this new paradigm of what is going to be scaled.
