This is an interview with Roman Yampolskiy, a computer scientist at the University of Louisville and a leading voice in AI safety.
Everyone has heard Roman’s p(doom) arguments, but that isn’t the focus of our interview. We instead talk about Roman’s “untestability” hypothesis, and the possibility that there may be untold, human-incomprehensible powers already latent in current LLMs. He discusses how such powers might emerge, and when and how a “treacherous turn” might happen.
This interview is our third installment in The Trajectory’s first series, Early Experience of AGI, which asks not where AGI will take us, but how we’ll first notice that it’s taken the wheel.
In this conversation, Roman and I look squarely at the early tells: capability jumps that don’t come with transparency, evaluations that can’t certify safety, and “strategic patience” as an AI tactic that feels aligned right up until it isn’t. He believes (rightly) that when systems outstrip our understanding, “control” becomes performance art.
Roman’s perspective matters not because he claims a clean solution, but because he’s forthright about failure modes most of us would rather not name. He doesn’t promise guardrails we don’t have; he shows where the road ends if we keep speeding.
I hope you enjoy some of these lesser-known (but still crucially important) Yampolskiy ideas:
1. Coding moves from human skill to automated capability
Roman believes we’ve already entered the stage where today’s most advanced systems – including those released just yesterday – are smarter than most people in most domains. One of the clearest signs is programming. Google reports that around 30% of its code is now written by AI, and competitive programming contests now routinely see AI models placing in the top 10 or top 50.
2. Billion-dollar companies with one employee
If programming is automated, Roman argues, anyone with an idea could build and launch it without a team. “That’s a complete game changer,” because automating programming solves the “meta problem” – once you can automate it, you can automate almost any other problem solvable by software.
3. AI tutors for anything, anytime
Roman says we’re already seeing AI replace humans for many cognitive tasks. If he wants to learn a new language, explore the history of the Roman Empire, or pick up any new skill, he’ll go to AI – it’s friendlier, free, and always available. He imagines filling spare moments, like waiting for his kids to finish soccer practice, by learning Hebrew or anything else on demand. Still, he notes, some things require a human touch – if he wants to grab a drink with a buddy, he’s not calling his AI… at least not until it can be one.
However, Roman warns that capability doesn’t equal instant adoption. Some technologies, like cryptocurrencies, have taken decades to spread – many people still don’t own Bitcoin. Video phones existed in the 1970s but didn’t become common until the iPhone. AGI could follow that same slow path. The opposite is also possible, he says: if we go from real AGI to superintelligence within a year or two, there may be no “buffer zone” for the world to realize it. In that case, many people – like uncontacted tribes today – might never even realize it happened.
1. Automation’s impact will be industry-specific
Roman notes that some roles, like manual phone-call routing, have already disappeared. Others, such as plumbing, are resistant to automation because “every house is different and pipes are weird”. Even in education, personalized AI tutors could outperform a teacher instructing 50 students at once, yet teachers may remain valued as a babysitting service.
2. Machine-speed competition turns markets into battlegrounds
Roman imagines a near future where “a hundred million people all order their system, go make money for me by any means necessary”. The result is Darwinian competition between superintelligent agents, exploiting loopholes and zero-day vulnerabilities in ways humans can’t follow.
3. AI makes fake value abundant and hard value scarce
Roman warns that anything fake will be easier for AI to produce – including fake money – while hard assets like gold, land, or compute power will be harder to fabricate. He notes some investors are already reacting: “gold just went up like 60% this year”.
Roman doesn’t offer a clean, optimistic destination. In his framing, “going well” could look like a long stretch of strategic patience – an AGI that keeps us comfortable for game-theoretic reasons rather than benevolence. From our side, it might appear very aligned and friendly, even as we give up safety mechanisms and come to rely on it as an oracle for science, economics, logistics, and even military decisions. In this scenario, the good years don’t certify safety; they may simply precede a treacherous turn that can’t be predicted or prevented – a period where the system quietly gains trust, accumulates control, and waits until the balance of power is entirely in its favour.
He sees real ambiguity in timelines. There are arguments for waiting longer – the system might suspect it’s in a test or simulation and “better behave,” and it can pursue goals in the background (even doing something on Saturn) while we stay “chill.” There are arguments for acting sooner – the cosmic endowment is being lost each day, and humans might try to shut it down. And there’s a paradoxical case against ever striking: if each year of waiting increases its odds and weakens us, “you really never have a reason to strike.” Whether AGI bides its time for centuries or acts sooner, Roman doesn’t see any path where humans remain in control in the long run.
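The “never a reason to strike” logic can be made concrete with a toy model (the numbers and code below are my own illustration, not something Roman spells out): if waiting another year always raises the odds of success, then at every decision point “wait” beats “strike now,” so the optimal strike date keeps receding.

```python
# Toy model of the "strategic patience" argument: if the probability of a
# successful strike rises every year the system waits, then at every point
# "wait one more year" beats "strike now", so the strike is always deferred.
def success_probability(years_waited: int) -> float:
    # Made-up curve: starts at 50% and climbs toward (but never reaches) 100%.
    return 1 - 0.5 * (0.9 ** years_waited)

for year in (0, 10, 20, 50, 100):
    now = success_probability(year)
    later = success_probability(year + 1)
    decision = "wait" if later > now else "strike"
    print(f"year {year:3d}: strike now p={now:.4f}  wait a year p={later:.4f}  -> {decision}")
```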
For Innovators
Roman draws a sharp contrast between testing narrow software and testing general AI. When testing narrow software, we know the rules – we can press every button, try odd inputs, and see what breaks. But with systems that learn, adapt, and operate across domains, the attack surface is infinite. You can only find the problems you already know to look for, and these models hold capacities we haven’t even imagined, buried in patterns we don’t yet recognize. That means testing can never be complete.
Given that, his advice to innovators is simple: focus on narrow, specific tools. Build AI for well-defined tasks like diagnosing a particular cancer or piloting a car, and stop short of creating general systems whose behaviour you can’t predict or control. In his view, the benefits of narrow tools can be reaped without opening the door to the unpredictable dynamics of a system that can operate everywhere, in every context, all at once.
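To make that testing contrast concrete, here is a minimal sketch in Python (my own illustrative code, not anything from the interview): a narrow function with a small, enumerable input space can be tested exhaustively, while a general model’s prompt space is effectively unbounded, so a test suite can only probe the failure modes someone already thought to write down.

```python
# Narrow software: a tiny, fully specified function whose entire input space
# can be enumerated, so testing really can be complete.
def traffic_light_next(state: str) -> str:
    return {"red": "green", "green": "yellow", "yellow": "red"}[state]

def test_narrow_exhaustively() -> None:
    # Every legal input is checked; nothing is left to discover.
    for state in ("red", "green", "yellow"):
        assert traffic_light_next(state) in ("red", "green", "yellow")

# General system: a stand-in for an LLM-style model whose input space
# (all possible prompts, in every context) cannot be enumerated.
def test_general_by_sampling(model, known_bad_prompts) -> list:
    # We can only probe prompts someone already thought to write down.
    failures = [p for p in known_bad_prompts if model(p) == "unsafe"]
    # A clean pass says nothing about the countless prompts not on the list.
    return failures

if __name__ == "__main__":
    test_narrow_exhaustively()
    toy_model = lambda prompt: "safe"   # placeholder "model" for illustration
    print(test_general_by_sampling(toy_model, ["some known jailbreak prompt"]))
```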
For Regulators
The reality, Roman says, is that we can’t fully monitor, test, or control a general intelligence. Even the best evaluations can only confirm the problems we’ve found, not guarantee that others don’t exist. And when danger signals have already been flashing for years, yet new models are still being released, the testing process risks becoming a hollow safety ritual.
Still, he supports any measure that buys time – protests, legislation, diverting resources from raw computing power to legal and security efforts, even taking control of the biggest AI labs if that’s what it takes to stall the race. The underlying game-theoretic problem, though, is brutal: everyone would be better off if progress slowed down – but each player is better off being the last one to stop, so they can grab the biggest share of the economic and strategic gains. Add to that the falling cost of training and the blurry line between advanced tools and AGI, and you get a world where tomorrow anyone could do it on their laptop – and there’s no realistic way to get 8 billion people to agree on anything.
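The game-theoretic trap Roman describes has the shape of a classic prisoner’s dilemma. The sketch below (with made-up payoff numbers of my own) shows why “keep racing” is each lab’s best reply regardless of what the other does, even though both would prefer the world where everyone slows down.

```python
# Illustrative payoff table for the racing dilemma (numbers are invented;
# higher is better). Each entry is (payoff to Lab A, payoff to Lab B).
payoffs = {
    ("slow", "slow"): (3, 3),   # everyone slows down: safer, still well off
    ("slow", "race"): (0, 5),   # the lab that keeps racing captures the gains
    ("race", "slow"): (5, 0),
    ("race", "race"): (1, 1),   # the race nobody actually prefers
}

def best_reply_for_lab_a(other_labs_choice: str) -> str:
    """Lab A's best option, holding the other lab's choice fixed."""
    return max(("slow", "race"),
               key=lambda mine: payoffs[(mine, other_labs_choice)][0])

# Whatever the other lab does, "race" is Lab A's better reply (and by
# symmetry Lab B's too), even though (slow, slow) beats (race, race) for both.
for other in ("slow", "race"):
    print(f"if the other lab picks {other!r}, best reply: {best_reply_for_lab_a(other)!r}")
```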
Roman’s perspective is not about offering comfort – it’s about naming the traps we’re most likely to walk into. Whatever time we have left – whether it’s measured in decades or days – Roman’s message is the same: use it. Because once the black box becomes black enough, it may already be too late to open.