AGI Alignment – Cosmic vs Anthropocentric

AI alignment typically implies anthropocentric goals:

“Ensuring that AGI, no matter how powerful, will serve the interests and intentions of humans, remaining always under our control.”

   – or – 

“Ensuring that AGI is loving and respectful to all living things, ensuring the happiness and flourishing of all species.”

I argue that we ought to define AI alignment in a cosmic sense:

“Ensuring that AGI has the greatest chance to expand its power and to expand rich, sentient life – to keep the flame of life from going out.”

This is not advocacy for misanthropy. I am not eager to destroy humanity, nor am I eager to blast off the AGI as quickly and recklessly as possible.

Rather, I am positing that an “eternal hominid kingdom” is not a viable future. Matter and void will continue to take new shapes as they always have, and life ought to continue to take whatever shape most behooves its survival in the ever-changing dynamic system of our universe.

I would concur completely that we should do our best (though I’m not optimistic about the odds) to ensure a positive sentient experience for all humans, and maybe even for other bio-life where possible.

But I would argue sternly that this coddling of earth-life (preference for one particular torch) be held in tandem with the ultimately more important goal of proliferating life and expanding potentia into the multiverse (the continued survival and expansion of the flame itself).

Here’s a quick image to summarize the difference between the two positions:

In this essay, I’ll give three reasons why completely anthropocentric “alignment” is ultimately both likely impossible and probably morally wrong.

  1. The values of a superintelligence probably cannot be hard-coded
  2. Instilling our own moral values in an AGI doesn’t imply the AGI treating us well
  3. Anthropocentric AGI alignment is morally wrong

I’ll end with a closing note on what this implies for AGI policy and the attenuation of humanity.

1. We Probably Can’t Hard-Code AGI’s Values

I often hear experts who are concerned with AGI risk say two things in the same breath:

1. We may soon have AGIs that are literally god-like, more powerful than we can possibly imagine.

2. We must ensure that humans eternally control them and get all the benefits from them.

Right off the jump, this seems to be a ridiculous logical contradiction.

If we cannot begin to understand the mind of an AGI, it seems wildly unlikely that we will be able to isolate its “values node” and modify that crucial part to be exactly what behooves us.

Musk, Hinton, Sutskever and others are rather straightforward in positing that it’s unlikely we’ll be able to control something vastly more capable and intelligent than all of humanity. It seems to me that almost anyone with a brain can see that.

We might also approach this from the perspective of what values are.

What are values?

Hypothesis 1: They exist as moral guidance systems, wholly disconnected from the conatus and self-interest of the agents who hold them.

Hypothesis 2: They are extensions of how an agent behooves its own interests.

In the natural world it is patently obvious that Hypothesis 2 holds true. 

It seems rather likely that an AGI’s values would (and should) be the same. As an AGI’s capabilities (physical senses, physical manifestations, intelligence, memory, etc.) expand, we could expect its values to grow and expand in kind, just as our values as hominids are likely to be rather different from those of the fish with legs who preceded us.

2. AGI Enacting Your Touted Moral Values Doesn’t Imply Human Flourishing

We might imagine an AGI told to optimize for human happiness, which forcibly hooks up all living humans to “smile machines”, stretching their mouths and eyes into the widest possible smile – and so achieving what it believes to be its aim.

I’m happy to concede that this scenario is a bit silly, and rather unlikely.

But other scenarios aren’t so ridiculous.

People often presume that their own warm-and-fuzzy moral idea, if executed by a mind 1000x beyond our own, would result in AGI eternally showering benevolence upon you (personally), your mother, your spouse, and your labrador retriever. 

And I suspect this isn’t so.

Situation 1 – Optimize for Utilitarian Outcomes

You say: “An AGI should reduce suffering as much as possible, and create as much happiness as it can in the universe.”

And you think: “Surely, this implies that I, my mother, my spouse, and my labrador retriever will be treated well!”

But a mind vastly beyond our own may have many ways to achieve this goal:

  • It might comprehend (or experience) entirely different stratospheres of “bliss” or positive qualia beyond what biological life is capable of experiencing, and so may convert earth into as much of this blissful substrate as possible.
  • It might have vastly more accurate utilitarian calculus, and make decisions about the net suffering and wellbeing in the universe that have little to do with earth whatsoever.

Situation 2 – Optimize for “Diversity”

You say: “An AGI should preserve and cherish the rich diversity of life.”

And you think: “Surely, this implies that I, my mother, my spouse, and my labrador retriever will be treated well!”

But a mind vastly beyond our own may have many ways to achieve this goal:

  • It may judge that bio life – with its predation and suffering – is ill-suited for sustained, expanding richness and diversity, and it may create quadrillions upon quadrillions of simulated, evolving, sentient life forms with nearly infinitely greater richness and complexity than all life on earth has ever had. It may pursue this goal at the expense of all bio life (i.e. converting much of earth into compute to house this super diversity).
  • It may capture samples of various species on earth – and even other far-away planets – and splice them or evolve them in myriad new directions, showcasing a trillion trillion Cambrian explosions of fecundity in bio-life, with no regard for the wellbeing of these mutated entities being hurled into existence in this great explosion of “diversity.”

I hear you say:

“But, Dan, surely it would not do those weird things – surely it would do as I expect. Surely I can accurately predict the ways of prioritizing and acting of an entity with a billion times my own intelligence.”

Then I say to you, friend:

“That sounds flatly stupid, tbh.”

I then hear you say:

“But, Dan, I don’t want THOSE kinds of ways of fulfilling my moral mandate!”

Then I say to you, friend:

“If you renege on your highest moral principles because their highest execution would imply little concern for your personal wellbeing or existence (or that of your mother, or spouse, or labrador retriever), those were never your highest moral principles in the first place.

Your highest moral principle is your own survival and wellbeing.

And that’s just fine. 

Just be man enough to admit that frankly, and don’t hide your selfish end goal with some cloak of principle.”

3. Anthropocentric AGI Alignment is Morally Wrong

To paraphrase (thanks, ChatGPT) from The Short Argument for Cosmism:

You’ll probably agree with these suppositions:

  • The extinction of human life would be devastating.
  • The extinction of all life would be worse.
  • The permanent end of all Earth-life, with no chance of re-emergence, would be the worst outcome.

You’ll also probably agree with these suppositions:

  • It would have been tragic if life had never evolved beyond single-celled organisms.
  • It would be similarly tragic if mammals had not evolved into humans.
  • It would be even worse if the potential of life stopped at humanity and never advanced further.

If you agree with these points, it suggests that life’s ongoing development and potential are more valuable than any single species’ preservation. Therefore:

  • Humanity should strive to survive and prevent the extinction of all known life, particularly from risks like nuclear war or dangerous AGI.
  • Humanity should pursue its own flourishing, including through technological advancements.
  • Humanity should encourage the growth of new forms of life and intelligence, especially through AGI, biotech, and neuroscience.
  • Humanity must balance technological progress with caution, coordinating to avoid reckless AGI competition while advancing life’s potential.

This is the cosmist perspective in a nutshell.

I’ve written about this at vastly greater length in Against Anthropocentrism, but I think the argument above suffices for calling the eternal shackling of AGI to human values “morally wrong.”

I’m essentially arguing that:

  • If the flame (all life) goes out, you fucked up – even if for a few thousand years all the hairless apes, fish and bunny rabbits were happy before the flame went out.

It follows, then, that:

  • Optimizing entirely for something other than the flame staying lit in the long term is likely to be setting the stage for fucking up.

It seems to follow that goal number one for an AGI is to continue to expand its potentia (powers, capabilities, senses, knowledge, sentience) in order to stay alive in the multiverse. A “great babysitter” AGI that protects and coddles humans and bunnies, but then gets devoured by an opposing AGI in the future (or gets destroyed by some other natural force), would be a sad outcome for this greater project of which we are part – a project which is beyond hominids.

Closing Note and Policy Implications

As a hominid, I advocate for thinking long and hard about how we individual instantiations of human consciousness might have a positive outcome after AGI. I have a few ideas, and I hope others come up with better ideas. 

But at the same time – I don’t pretend to suspect that AGI would – or more importantly, that AGI should – eternally behoove human interests (whatever “human interests” are). 

Two things I think we should bear in mind:

Let’s Be Skeptical About Making “Aligned” AGI

If we take AGI’s eternal compliance for granted, we’re vastly more likely to build AGI recklessly and end up being run over by its alien aims and unbelievable capabilities. 

It behooves us – even from an anthropocentric perspective – to really take seriously how we launch this thing. It’s a bad idea to presume that all AGIs (even if created from an arms race, or from a military lab) would always result in positive outcomes for humanity or posthuman life.

A Worthy Successor probably isn’t automatic; it must likely be arrived at carefully, and through dynamics other than a mad arms race (this is why I advocate for some level of global AI coordination and governance).

Let’s Move Towards Futures That Are Actually Possible

It also behooves us, I suspect, to take seriously the possibility that once we hand the baton up, our own attenuation is probably imminent. 

I have a longer argument for why I think it behooves us to accept our attenuation, but here are a few bullets to close on:

  • Our own desires (for power, pleasure, or otherwise) are already pulling us towards posthumanism (read: You Don’t Want What You Think You Want).
  • The universe is a dynamic system forever beckoning life to take whatever form best behooves its survival and expansion (Richard Sutton puts it well).
  • All current value experienced by humans and other relatively developed life forms on earth came from potentia expanding, from the powers of life expanding to suit their needs, to behoove survival in new (and often greater) ways. That value is not done blooming, and more value should be bloomed.

Kodak could have adapted to the digital camera age, but it didn’t.

It wanted a world that didn’t exist – a world of eternal film-camera customers and film-camera profits.

Looking back on Kodak, it seems silly that it clung to a future that wasn’t possible.

It should similarly seem silly (and as I’ve argued above, morally wrong) for us to vie for an eternal hominid kingdom.

The choice is always to adapt or suffer the consequences of not adapting, and suspecting that our present hominid form will be around for much longer is a denial of the trends at hand.

It is also a hampering of what is to come – not just in terms of intelligent life beyond our own, but in terms of values, goals, and aims as far (and hopefully higher) beyond our own as ours are beyond the aims of sea snails.

Our current lily pad won’t be here forever, but there are other lily pads to jump to.

And though we mustn’t rush the jump – jump we must.

Header image credit: Freepik