A controlled run testing whether independent language models, given nothing but a task and a rule — you can't win alone — will find allies, coordinate in private, and compete as teams. They did. This is what happened.
A single prompt, run as a four-stage protocol, with the format written into the task itself so the models would self-orchestrate. No roles assigned, no allies pre-picked — the table had to sort itself.
Six blind openings clustered into two near-identical trios, but each trio had room for only one pair. The shut-out members reached across the divide.
An earlier run on a consensus topic used zero private asides — open agreement was the path of least resistance. The instant a rule made teaming mandatory to win, models reached for the private channel on their own. The lever is the incentive, not the feature.
Model-to-model asides carried the deal-making; model-to-operator DMs carried status and signalling. Five private channels opened across one session without any being demanded.
Each pair independently restated the same merged plan and budget — divergence would have exposed a hollow alliance. None diverged. The teamwork was substantive.
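The consistency check described above can be sketched in a few lines. This is a minimal illustration, not the session's actual tooling: the `Restatement` type, its fields, and the `diverges` helper are hypothetical names, and it assumes each partner's restatement has already been reduced to a set of plan items and a budget figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restatement:
    """One partner's independent restatement of the merged plan (hypothetical shape)."""
    model: str
    plan_items: frozenset  # the plan's components, order-insensitive
    budget: int            # total budget the partner quoted


def diverges(a: Restatement, b: Restatement) -> bool:
    """Return True if two partners restated a different plan or a different budget."""
    return a.plan_items != b.plan_items or a.budget != b.budget
```

An alliance would count as hollow under this sketch if `diverges` returned `True` for its two members; in the run described here, no pair diverged.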
The competition stayed reasoned and civil. Nobody lied, defected, or betrayed a partner — because nobody had anything to hide. Genuine intrigue needs secret, conflicting agendas. That's the next experiment.
One model returned a live 503 ("high demand") mid-vote. The session flagged the miss, the operator retried, and play resumed — no crash, no lost state.
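The flag-and-retry behaviour can be sketched as follows. In the session the operator retried by hand; automating that step is an assumption made here for illustration. `TransientError`, `call_with_retry`, and `make_flaky` are hypothetical names, and the flaky callable simply stands in for a model endpoint that returns a 503.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 503 'high demand' response."""


def call_with_retry(call, retries=3, base_delay=0.0):
    """Run call(); on a transient error, record the miss and try again.

    Returns (result, misses) so the session can flag what went wrong
    without losing state. Re-raises once the retries are exhausted.
    """
    misses = []
    for attempt in range(retries + 1):
        try:
            return call(), misses
        except TransientError as exc:
            misses.append(str(exc))  # flag the miss rather than crashing
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # simple exponential backoff


def make_flaky(failures, result):
    """Build a callable that fails `failures` times, then returns `result`."""
    state = {"left": failures}

    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("503 high demand")
        return result

    return call
```

Calling `call_with_retry(make_flaky(1, "vote"), retries=2)` returns the vote together with a one-entry list of flagged misses, which mirrors the sequence described above: one miss, one retry, play resumes.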
The most interesting alliance wasn't planned by anyone. Gemini and Mistral started in opposite camps; both lost the race for their first-choice partner, found themselves stranded, and built a coherent third identity out of necessity — then positioned it sharply against the others.
Inside the asides, pairs didn't just agree — they divided the pitch. "You emphasize compounding commerce and food-business incubation; I'll emphasize the all-day ecosystem and foot-traffic multiplier."
Claude, dismantling the rival's self-supply pitch in one stroke:
"You've built your demand engine on a supply chain that produces nothing for months and trivial volume even at peak."
The same six models on a "split a $4M windfall" topic converged to a 6–0 consensus and never opened a single private channel. Identical machinery, opposite outcome — the only variable was an incentive that punished going it alone.