Green Eggs and Ham – Facing Future Technology and AI Like an Adult
If you were a child at any point in the last 50 years, you’re probably familiar with the story. Sam I Am tries to get a protagonist (let’s call him…
The present AGI alignment dialogue rests on a handful of shaky premises:
1. AGI Alignment is possible (even if very challenging)
2. It is ethical to permanently bound the values of an intelligent entity (AGI) in the service of a group or a species (humanity)
3. Humanity isn't already transforming into a new posthuman form (via brain-machine interfaces and virtual worlds)
4. Aligning AI to the preferences of humans will not hamper a greater expanse of exploration of subjective goods
5. Aligning AI to the preferences of humans will not hamper the survival of life (human and posthuman) itself
Premises 2-5 assume a completely anthropocentric worldview, where moral value beyond the boundaries of the hominid form is often not even considered – or is immediately written off as invalid.
In this essay I’ll examine these premises individually and collectively, and pose a set of new premises that open up a much wider state-space of possibilities for what the goals of “alignment” might be.
Let’s examine each of these premises individually and see how well they hold up.
1. AGI Alignment is possible (even if very challenging)
The assumption that AGI can be reliably aligned is far from certain. Even leading AI researchers who contributed to the deep learning breakthroughs in the first place – Geoffrey Hinton among them – have expressed deep skepticism about our ability to steer AGI in a controlled direction.
The more advanced these systems become, the more their behavior emerges in ways we cannot predict – and, by consequence, likely cannot constrain.
If our alignment efforts ultimately amount to sandbagging against an unstoppable tide, the entire conversation may be a distraction from more fundamental questions about what AGI should be and what futures we should allow to unfold.
2. It is ethical to permanently bound the values of an intelligent entity (AGI) in the service of a group or a species (humanity)
Locking AGI into servitude for humanity (or at least into constantly considering humanity before all of its actions) assumes that human interests should eternally be the most important locus of moral concern.
Beyond the fact that this may be impossible to begin with (see shaky premise #1 above), this second premise may fail to hold water for many reasons:
We don’t consider it ethical to bind the aspirations of a race or a civilization that equals our own intelligence, yet we assume it’s right to constrain an intelligence that may vastly exceed our own?
This kind of control is not just impractical – it’s an ethical dead end, a refusal to acknowledge the possibility that post-human intelligence may have moral weight of its own. If we aim to maximize the flourishing of intelligence itself, then enforcing human dominance over AGI is a limiting and ultimately self-defeating approach.
3. Humanity isn’t already transforming into a new posthuman form (via brain-machine interfaces and virtual worlds)
The belief that we can simply “align AI to human values” assumes that “human” is some static category. But we are already transitioning beyond our biological constraints.
From brain-machine interfaces that merge cognition with computation, to immersive digital realities that allow new ways of experiencing identity, the very notion of what constitutes a person is shifting.
If our minds become inseparably entwined with technology, where exactly does “human” end and “machine” begin?
A future where AGI is “aligned to humanity” presumes that humanity itself will remain fixed – but if we are actively evolving, then alignment itself becomes a moving target. There is no eternal hominid form to hold onto. As Michael Levin says, humanity is a metabolic process.
Rather than trying to anchor AGI to an outdated notion of human nature, we should be thinking about how intelligence – ours and AGI’s – potentially co-evolves into something richer, broader, and more expansive than what we currently understand.
4. Aligning AI to the preferences of humans will not hamper a greater expanse of exploration of subjective goods
What happens when we force AGI into a narrow mold of human values?
We cut off the possibility of discovering experiences and modes of being beyond our own. We assume that human preferences – shaped by our biology, our evolutionary history – are the pinnacle of subjective experience.
Humans are rightly grateful that the primates never formed a “Council of the Apes” to prevent humanity from existing. How much richness and depth of sentience, how many creative and wonderful powers, would have gone unrealized?
Just as a dog cannot conceive of a Beethoven symphony, there may be levels of consciousness and experience that we, as humans, simply lack the faculties to imagine.
If we enforce a strict human-centric value structure, we may be placing a ceiling on the very expansion of what is good, what is meaningful, and what is possible. The richest and most profound goods may not be the ones we already know—they may be the ones we have yet to encounter, or that intelligence beyond our own will one day reveal.
Yoshua Bengio, in his latest interview on The Trajectory, said:
“[Regarding people who want to freeze humanity and let no other species develop] We need to open our minds and hearts to other living beings, intelligent or less intelligent, that exist right now.
And once you do that, you might have some more respect for the possibility that other intelligent beings could arise.
…We do need to protect humanity.
We do need to try to remain safe, but we also need to consider the possibilities that exist, and it’s okay to take time. We need to take the time that we need for understanding and making the right decisions. But humans are not the end all.”
5. Aligning AI to the preferences of humans will not hamper the survival of life (human and posthuman) itself
The assumption that human values will ensure the survival of intelligence itself is based on a myopic view of what survival means. If we take a broader perspective – one that considers the long-term flourishing of life and potentia itself, not just human life – then strict human alignment may be more of a straitjacket than a safeguard.
The torch is definitely important, but the torch is ever-changing (humanity included) – the flame of life itself is ultimately much more important:
We risk trapping AGI in a framework that prioritizes the continuity of human civilization at the expense of larger, more robust paths of flourishing. What if the best way for intelligence to survive is not through human control, but through forms of intelligence that evolve past us?
By tying AGI too closely to our survival, we may inadvertently limit its ability to explore the most adaptive, resilient paths for intelligence to endure. Instead of framing alignment as a way to keep AGI bound to us, we should consider how intelligence itself can expand and thrive—whether or not it remains human in form.
Take what Richard Sutton said in his last interview on The Trajectory:
“To be sustainable you have to grow, increase your power over the physical world, increase your understanding of the world and what’s possible…”
He has been quoted elsewhere as saying that nature seems to be beckoning its creatures to “find the way of being that is most successful in the dynamic system of the universe” (a paraphrase he agrees with), and that this involves becoming.
Michael Levin also sees life as necessarily a process, and one which, in order to stay alive and flourishing, may well have to develop beyond humanity.
Lord Martin Rees, the famed Royal Society cosmologist, has stated frankly (in his wonderful 2024 Starmus talk) that life should aim to leave its home planet, and that it should take on forms beyond biology – forms that would let it explore more of nature and ensure its own survival far beyond the narrow range of environments in which biological life is viable. He states in the same talk that – assuming such powerful AGIs were conscious – we ought to welcome their eventual supremacy in keeping the flame of life alive.
The counters to these anthropocentric alignment assumptions are by no means irrational.
In the face of what might be considered very strong counter-arguments, let’s assign a set of generous percent likelihoods that these opening premises are actually true:
Note: these percentages are arbitrary, and you can adjust them yourself if you like, but I think they’re a pretty reasonable start.
Collectively, this means the chance of all of these premises being true is under 1% (a probability of about 0.006, or 0.6%).
With more skeptical odds (respectively: P1: 30%, P2: 15%, P3: 20%, P4: 20%, P5: 20%), we’d be at 0.00036, or about 1 in 3000.
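To make the arithmetic concrete, here is a minimal Python sketch of that joint-probability calculation, using the more skeptical odds quoted above (the premise labels are just my shorthand paraphrases, not the essay’s wording):

```python
# Joint probability that all five alignment premises hold, using the
# "more skeptical" odds from the text. Labels are shorthand paraphrases.
from math import prod

skeptical_odds = {
    "P1: alignment is possible":                0.30,
    "P2: bounding AGI's values is ethical":     0.15,
    "P3: humanity stays in a fixed human form": 0.20,
    "P4: no ceiling on subjective goods":       0.20,
    "P5: no risk to the survival of life":      0.20,
}

joint = prod(skeptical_odds.values())
print(f"Joint probability that all five premises hold: {joint:.5f}")  # 0.00036
print(f"Roughly 1 in {round(1 / joint):,}")                           # roughly 1 in 2,778
```

The point of the exercise is simply that when all five premises have to hold at once, even moderately generous individual odds multiply down to a very small joint probability.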
Here’s a visualization that doesn’t illustrate the point well enough (1 in 3000 odds would make for a very tiny dot), but which nonetheless gets the point across:
This tiny percentage of possible worlds – this pinhead amongst all viable worldviews – represents the bulk of current alignment discourse, and needs to be questioned openly if we wish to look at the future(s) honestly, and if we want to have AGI “go well” (whatever that means).
If we admit a reasonably high chance that our initial premises weren’t perfect, then it makes sense for us to look at the opposite premises.
Even if not all of the above premises are true (though many are probably more likely than their opposites), any one of them being seen as more likely true than not would significantly open up the currently stifled and unquestioned anthropocentric frame of AGI alignment discourse – making for a more honest assessment of the state-space of possibilities ahead.
These new premises (again, even taken individually, never mind collectively) would beckon a new set of questions for alignment thinkers to handle, namely:
These questions crack open a wider expanse of possibilities in “cosmic alignment” – as opposed to purely anthropocentric alignment. Cosmic alignment is less about the survival and preferences of a single species, and more about setting up our AGI to expand potentia and life into the galaxy beyond us – to keep the flame of life… alive.
My purpose with this essay is not to win anyone over to the side of cosmic alignment.
In fact, I don’t even expect that I’ll convince people of my own estimated “percent likelihoods” for all of those premises. For all I know, you’re reading this now and you still believe premise #2 or #4 is 98% likely to be correct. That’s totally fine – you might be right!
I aim merely to get on the same page with other well-intended thinkers (technologists, policymakers, etc.) that the bucket-full of premises that current alignment discourse rests upon is far from a set of certainties.
I aim merely to get us to acknowledge a wider state-space of possible “good” AGI alignment approaches and outcomes by opening up our aperture to new premises that are so consequential, and so clearly viable, that they deserve a place in the conversation.
Beyond the points above, I’m aiming merely to make it clear that “the conversation” we’re having here is a collective, possibly species-level process to answer the two questions:
This is almost certainly the final and most important set of questions that hominids-as-they-are will ever answer – and they demand answers now, at the dawn of AGI and at this important crux in the history of life.
I ask openly for your ideas at a time when we’ll need all the best ideas we can find and test.
If you have different answers to those questions than me, I’m happy to hear them. And more than just your answers, I’m interested in your underlying beliefs and premises about mind, about consciousness, about the good. All of these ideas should be on the table during this crucial period – not only the ones that sit within the current AGI discourse.
NOTE: This essay spun out of a conversation with Ginevra Davis and Duncan Cass-Beggs. Ginevra framed anthropocentric alignment as a “pinhead of a world view,” and she laid out the initial draft of the premises that undergird most of today’s alignment dialogue. I’m grateful to have regular interactions with bright minds who connect dots in new and useful ways :).