Stuart Russell – Avoiding the Cliff of Uncontrollable AI (AGI Governance, Episode 9)

Joining us for the ninth episode of our AGI Governance series on The Trajectory is Stuart Russell, Professor of Computer Science at UC Berkeley and author of Human Compatible. He is among the earliest and most influential voices on AGI safety and governance, and a co-convener of the International Dialogues on AI Safety (IDAIS), launched in October 2023 with Yoshua Bengio, Andrew Yao, and Ya-Qin Zhang to bring together leading scientists worldwide to address AI risk.

In this episode, Stuart explores why current AI race dynamics resemble a prisoner’s dilemma, why governments must establish enforceable red lines, and how international coordination might begin with consensus principles before tackling more difficult challenges. He also discusses the role of public awareness and culture in shifting policy attention toward existential risk.

Stuart notes that large language models have already changed the conversation, with hundreds of millions of people experiencing what feels like “general-purpose intelligence on tap.” At the same time, he warns that leading companies are pouring hundreds of billions into AGI development, aiming at systems that exceed human capabilities across the board.

I hope you enjoy this conversation with Stuart:

Subscribe for the latest episodes of The Trajectory:

Below, I’ll summarize Stuart’s main points from each section of our interview.

AGI Governance Q-and-A Summary – Stuart Russell

1. What should AGI governance attempt to do?

Stuart believes the central aim of AGI governance is to prevent catastrophic risks and ensure that advanced systems remain firmly under human control. The role of governance, he stresses, is to protect humanity from scenarios where powerful systems act in ways we cannot contain or reverse.

He warns that governments face an urgent and unprecedented challenge. Leading AI companies are investing staggering sums, possibly as much as half a trillion dollars this year, and even more next year, not in simple chatbots but in systems that could reliably exceed human capabilities across every domain. Because those domains include the ability to conduct AI research, such systems could rapidly accelerate their own improvement, inventing new methods, architectures, and even hardware designs. Without governance, that self-improvement loop could quickly escape human oversight.

Stuart notes that many policymakers still see these warnings as speculative or “sci-fi.” Yet when the very founders of leading labs admit there is a 20% chance of human extinction, governments can no longer dismiss the risk. One company leader even told him that his best-case scenario was a disaster on the scale of Chernobyl, severe enough to force governments to act. For Stuart, the role of governance is clear: to act before catastrophe, not after.

2. What might AGI governance look like in practice?

Stuart points to two models of international coordination. One, like the IAEA for nuclear power, involves direct inspections. The other, which he sees as more feasible, mirrors the International Civil Aviation Organization: nations agree to standards, write them into law, and regulators interact with an international body to maintain compliance.

He acknowledges that progress on coordination will be difficult in today’s geopolitical environment. Still, he argues that without agreement the result is chaos, much like drivers who cannot agree on which side of the road to use. In his view, effective governance means giving up a little freedom of choice so that the system works for everyone, though he notes that some parties currently object strongly to that way of thinking.

Finally, he notes that AGI governance is beginning to gain traction even within formal international bodies. For example, the United Nations General Assembly is scheduled to discuss creating a new agency dedicated specifically to governing AGI. While still in the early stages, Stuart views this as a sign that the issue has entered the window of political acceptability at the global level.

3. What should innovators and regulators do now?

For innovators, Stuart describes the current dynamic among leading labs as a prisoner’s dilemma. Everyone would be better off slowing down until safety is solved, but each actor feels compelled to keep racing. He notes that coordinating on safety in the public interest does not violate antitrust law: labs are allowed to talk to one another so long as they are not colluding to suppress competition, and they can certainly speak to governments to warn them that if the race continues, it will be a race off a cliff.
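
As a rough illustration of the dynamic Stuart describes (not a model he presents in the episode), the sketch below writes the race as a two-player payoff matrix with purely hypothetical numbers. It shows why "race" is each lab's best response regardless of what the other does, even though both would prefer the outcome where everyone slows down.

```python
# A minimal prisoner's-dilemma sketch of the race dynamic Stuart describes.
# The payoff numbers are illustrative assumptions, not figures from the episode.

payoffs = {
    # (lab_A_choice, lab_B_choice): (lab_A_payoff, lab_B_payoff)
    ("slow", "slow"): (3, 3),   # both prioritize safety: best collective outcome
    ("slow", "race"): (0, 4),   # the lab that slows down falls behind
    ("race", "slow"): (4, 0),
    ("race", "race"): (1, 1),   # both race: worse for everyone than mutual restraint
}

def best_response(options, their_choice, me):
    """Return the choice that maximizes this lab's payoff, given the other lab's choice."""
    def my_payoff(choice):
        pair = (choice, their_choice) if me == 0 else (their_choice, choice)
        return payoffs[pair][me]
    return max(options, key=my_payoff)

options = ["slow", "race"]
for their_choice in options:
    print(f"If the other lab plays '{their_choice}', "
          f"my best response is '{best_response(options, their_choice, me=0)}'")
# 'race' is the best response either way, so both labs race and land on (1, 1),
# even though mutual restraint at (3, 3) would leave both better off.
```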

He also urges innovators to recognize when a technology path is fundamentally unsafe and be willing to abandon it. To illustrate this, he uses the analogy of “giant bird airlines.” In his talks, he shows an image of an enormous eagle with a 200-foot wingspan carrying a passenger fuselage. In this fictional scenario, airlines once bred massive birds to carry people, but the birds kept eating passengers, dropping them into the ocean, or feeding them to their chicks. Regulators refused to certify them as safe, and eventually, airlines realized the approach could never work. They scrapped it and built mechanical airplanes instead, even though it required harder math and engineering. Stuart argues AI may require the same kind of reset: if current methods cannot be made safe, developers must go back to the drawing board.

For regulators, Stuart emphasizes that governments should not dictate designs but must set the bar for acceptable risk. He points to how regulators handle other high-stakes industries: airplanes must be orders of magnitude safer than cars, and nuclear power stations are required to demonstrate probabilities of catastrophic failure as low as once in ten million years. These standards evolved over time, initially set at one in ten thousand years when reactors were few, and tightened as more plants were built. The crucial principle, he says, is separation of concerns: government defines the safety threshold, and manufacturers must prove their systems meet it. The regulator’s role is to protect the public by requiring rigorous proof of safety before dangerous technologies are deployed.

He also points out that developers are resisting red lines because they don’t know how to comply, and are even trying to sabotage discussions of such requirements. However, he insists, this cannot be the reason to lower standards. The argument that “we can’t have a requirement unless we know how to comply with it” may sound sensible, but in his view it’s a dangerous fallacy. The whole point of regulation is to prevent catastrophic risks. If companies cannot demonstrate how to keep systems safe, then they must pause development until they can. “We don’t want to die,” he concludes – and that is reason enough for governments to enforce requirements even in the absence of current compliance solutions.

Ultimately, he stresses the importance of mobilizing public awareness so that policymakers feel genuine pressure to take action. He cites the creation of the International Association for Safe and Ethical AI, which connects hundreds of organizations, as one step toward coordination. But he also notes the striking absence of constituent concern: even in highly educated districts, lawmakers rarely receive calls about AGI risk.

I really enjoyed this conversation with Stuart. His clarity about the stakes and his insistence that governments and innovators face reality before it’s too late bring a rare sobriety to the AI debate. Catastrophic outcomes are not a foregone conclusion, but averting them will require resolve, foresight, and the willingness to set boundaries now.

Follow The Trajectory