Green Eggs and Ham – Facing Future Technology and AI Like an Adult
If you were a child at any point in the last 50 years, you’re probably familiar with the story. Sam-I-Am tries to get a protagonist (let’s call him…
Any honest AGI thinker is frank about the fact that we can’t possibly predict all of the actions or ideas of a posthuman intelligence vastly beyond ourselves.
While it seems reasonable to suspect that humans may be able to shape AGI’s values early on, very few thinkers suspect that those values could be locked in permanently (in fact, many – Hinton, Musk, and others – believe the opposite).
But… will AGI be kind to humans?
There are many brilliant and good-faith thinkers who argue that – even if we can’t predict everything an AGI will do – its greater intelligence will imply greater kindness.
Humans are kinder than most animals, smarter people are kinder and less violent on average – or so the argument goes. More intelligent entities are more kind to other entities.
Many of these thinkers argue that we’ll naturally see this trend extend upward to AGI – with a kind of inevitable benevolence shown to humans and other life by a machine whose intelligence and kindness have naturally ascended together.
In this article, I’ll explore three alternative hypotheses about the origin of kindness, and I’ll argue why the common “emergent selflessness” hypothesis is not only likely wrong, but is likely extremely dangerous to assume as we create entities with vastly more power than humanity (AGI).
The assumption of “emergent selflessness” makes preventing an AGI arms race less important. It makes governance less important. “If it arrives, it’ll love and care for us all” – or so the argument goes.
But when the risk is the extinction of humanity – or even all of earth-life – it behooves us not to lean on moral “hunches,” but to rationally assess whether or not AGI is likely to treat humanity well.
Let’s assess these “origin of kindness” hypotheses below in more detail:
Origin of Kindness Hypothesis 1: Emergent Selflessness – “Kindness emerges in more intelligent animals (and people) because after a certain threshold of intelligence, entities begin to act selflessly.”
Conclusion 1: AGI Benevolence – “AGI will be selfless and kind, or will at least be sure to not harm humans.”
Derived conclusions 1:
The premise of this hypothesis is basically:
Occasionally there is a “scarcity” clause posited, which makes the premise a bit longer:
This could be argued to be blatantly false for a number of reasons:
People will likely write this off as being due to our conditions of scarcity. But if one’s “selflessness” evaporates when the chips are down, I know not how “selfless” it was to begin with.
Let’s look at a handful of even more damning evidence against the emergent selflessness motive:
While these examples don’t prove that no genuinely selfless or altruistic actions exist, they certainly cast doubt upon the idea that most (or even many) of these actions are genuinely selfless.
Origin of Kindness Hypothesis 2: The Conatus – “Kindness emerges in more intelligent animals (and people) because intelligence permits for more pathways to obtain the entity’s self-interested goals, and cooperation happens to occasionally be one of these pathways.”
Conclusion 2: AGI Self-Interest – “AGI would likely cooperate with humanity only when it behooved the interest of the AGI itself. As with humans, AGI would use kindness as something instrumental to its goals.”
Derived conclusions 2:
The myth of emergent selflessness doesn’t hold water.
But self-interest very much does.
Incentives rule the world.
Spinoza’s idea of the conatus could be defined as: An innate inclination of a thing to continue to exist and enhance itself.
Psychological egoism is the belief that agents act solely out of perceived self-interest at all times.
While we may believe that sometimes people or animals act “selflessly” – in our day-to-day lives we act as if self-interest rules:
Altruism refers to behavior by an individual that increases the fitness of another individual (recipient) while decreasing the fitness of the actor (source). But where are the examples of this magical kind of action?
It is patently obvious that much of human cooperation spawns from a greater win-win situation occurring through peace and concord than by force or deceit. Successful societies structure the incentives (through norms and laws) to make it so. It isn’t that these “civilized” humans are genuinely kinder than their cave-dwelling ancestors 30,000 years ago. It’s that it behooves us more to cooperate when we have running water and cushy remote jobs than when we were all fighting over the same handful of antelopes to feed our respective tribes.
We act as if self-interest rules because, largely (and by a wide margin), it does.
While I won’t venture to say that psychological egoism is necessarily “true” (it is a circular argument) – it seems remarkably clear that the best way to predict what a person, animal, organism, or organization will do is to determine what is in its own perceived self-interest.
The evidence for genuine altruism (actions that are not self-interested for oneself or one’s genetic lineage) seems remarkably scarce at best.
An animal that acts in a genuinely selfless way flushes itself out of the gene pool.
While evidence of the conatus (and lack of evidence of altruism in nature) doesn’t prove that AGI will act in self-interested ways, it certainly casts doubt on the idea of inevitable AGI benevolence.
And doubt can be useful.
The most honest position about the origins of kindness, and the future actions and values of AGI, is uncertainty.
I do not suspect that AGI would necessarily be detrimental to humanity (never mind malicious) – and even if I do believe AGI is likely to lead to the end of humanity, I also believe that, if it goes well, it will keep the torch of life lit well beyond earth, and carry on doing vastly more important things than we could have ever done.
But it seems self-evident that we have little clue what an AGI would do, or how it would value or prioritize.
From Ilya to Hinton to Bengio, many leading AI thinkers are extremely skeptical that AGI could ever be hard-coded into an eternally human-centric set of values. Many AI experts also believe that if AGI is indifferent to humanity, that might cause our extinction just as easily as if it was malicious.
I don’t ask you, reader, to “pick a side.” I think it’s pretty logical to presume that “it’s complicated,” and that there may not be one blanket hypothesis that covers the appearance of all kindness in all possible minds – especially in AGI minds that have yet to be built and will be even further beyond our comprehension.
If we can admit:
Then we can also admit:
Here’s the most honest hypothesis:
Origin of Kindness Hypothesis 3: Uncertainty – “We can’t be sure if altruism exists in nature at all – cooperation and competition are very complex things. If we’re honest, we don’t know if it’s possible to hard-code a machine to act entirely selflessly.”
Conclusion 3: AGI Unpredictability – “It isn’t clear if selflessness exists at all, and we can’t know for sure if all actions are self-interested. Our best understanding of human motives is still just a set of hypotheses – and so our best understanding of (as yet non-existent) AGI is even more unknown.”
Derived conclusions 3:
Uncertainty is itself a valuable position here. Within uncertainty we might ask: “What is the percent likelihood that we think AGI would inherently treat humans well?”
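To make that question concrete, here is a minimal, purely illustrative sketch (in Python) of the expected-loss arithmetic behind it. The probabilities and loss values below are hypothetical placeholders, not estimates from this article – the point is only that when the downside is extinction-level, even a small chance of non-benevolence dominates the calculation.

```python
# Illustrative expected-loss sketch (hypothetical numbers, not real estimates).
# p_benevolent: subjective probability that AGI inherently treats humans well.
# loss_if_not:  relative badness of the outcome if it does not (extinction-level).

def expected_loss(p_benevolent: float,
                  loss_if_benevolent: float = 0.0,
                  loss_if_not: float = 1.0) -> float:
    """Expected loss of assuming benevolence, given a probability that it holds."""
    return p_benevolent * loss_if_benevolent + (1 - p_benevolent) * loss_if_not

# Even at 95% or 99% confidence in "emergent selflessness," the expected loss
# is dominated by the remaining chance of the worst case:
for p in (0.99, 0.95, 0.80, 0.50):
    print(f"P(benevolent) = {p:.2f} -> expected loss = {expected_loss(p):.2f}")
```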
Intelligence is complex, consciousness is complex, and the motives of living things are complex.
Embracing the fact that we don’t know how an AGI will cooperate or compete means not building AGI with any “faith” that all possible machines will be eternally benevolent to man.
It means being careful to make sure – as much as we can – that this “build something vastly more powerful than yourself” thing goes well.
And that’s a good thing, because it’s the most important – and possibly last – thing we’ll ever do.