Scott Aaronson – AGI That Evolves Our Values Without Replacing Them (Worthy Successor, Episode 4)

This week Scott Aaronson joins us on The Trajectory for episode 4 of the Worthy Successor series.

Scott is a theoretical computer scientist and Schlumberger Centennial Chair of Computer Science at the University of Texas at Austin who recently completed a year-long stint as an AGI researcher with OpenAI. I was prompted to invite Scott onto the program after seeing his TEDx talk, The Problem with Human Specialness in the Age of AI, and after Jaan Tallinn recommended Scott as a thinker worth following.

In this episode, Scott shares his perspective on why AGI should evolve, rather than replace, our existing human values. He discusses the possibility of a kind of “moral bedrock” that humans may already have access to, and which AGIs might expand upon.

I hope you enjoy this conversation with Scott Aaronson:

Below, we’ll explore the core takeaways from the interview with Scott, including his list of Worthy Successor criteria and his ideas about how best to leverage governance to improve the likelihood that whatever we create is, in fact, worthy.

Scott Aaronson’s Worthy Successor Criteria

1. Their values and preferences have some continuity with ours.

Posthuman intelligences may extend vastly beyond our human preferences, but those human values and preferences would still be present in some form.

Their moral values would have evolved from ours by some continuous process (rather than displacing those values entirely).

2. It would be conscious.

It would carry the flame of awareness and qualia on a new AGI torch.

Regulation / Innovation Considerations

1. We should try to determine whether AI is conscious.

We could scrub all training data of any mention of consciousness, and then see whether the model is nonetheless able to articulate what awareness is like.
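As a rough illustration of what such a filter might look like, here is a minimal sketch in Python. The term list, and the idea of simple keyword matching at all, are my own assumptions for illustration – a real version of this experiment would need far more exhaustive, paraphrase-aware filtering.

```python
import re

# Hypothetical list of consciousness-related terms to scrub from a training corpus.
# A real experiment would need a far more exhaustive (and likely multilingual,
# paraphrase-aware) filter -- this keyword approach is only a sketch.
CONSCIOUSNESS_TERMS = [
    "conscious", "consciousness", "qualia", "sentience", "sentient",
    "subjective experience", "phenomenal experience", "what it is like",
]

pattern = re.compile(
    "|".join(re.escape(term) for term in CONSCIOUSNESS_TERMS),
    re.IGNORECASE,
)

def scrub_corpus(documents):
    """Keep only documents that never mention a consciousness-related term."""
    return [doc for doc in documents if not pattern.search(doc)]

# Example: only the second document would survive the filter.
docs = [
    "Philosophers debate whether qualia can be physically explained.",
    "The recipe calls for two cups of flour and a pinch of salt.",
]
print(scrub_corpus(docs))
```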

2. Plan for some degree of global AGI governance coordination.

(Scott doesn’t mention any specific form of governance in this series, but he advocates that some form of it should be created, given the enormity of what’s being built – an event potentially more consequential than anything else on Earth.)

3. Potentially: Government approval for frontier model development.

Models trained beyond some determined FLOP threshold might require some kind of registration process with a governing body, where they could be tested against a variety of criteria to determine risk.

Concluding Notes

I appreciated Scott’s firm emphasis on:

  • Avoiding a purely speciesist grounding of morality (anthropocentric for no reason other than that it is anthropocentric),
  • Thinking in a nuanced way about AGI governance and avoiding black-or-white takes (“accelerationist” or “doomer”), and
  • His openness to discussing the necessity of global AGI coordination – including lines of reasoning very similar to those we’ve seen expressed by Hendrycks, Bengio, and others.

On the topic of morality and “values,” I concur with Scott that – especially initially – it would be important for AGI to branch off from our own human values, rather than boot up an entirely new set. Any kind of wholesale replacement risks losing not only things that are “uniquely human,” but also things that might be adaptive and useful for future life in general – a sentiment Bostrom shared in his Worthy Successor interview here on The Trajectory.

Regarding Scott’s notion that intelligence may “cap out” at moral ideas similar to those of humans, I disagree completely: I suspect that a mind a billion times beyond our own – working on problems vastly beyond our own, housed in substrates and manifested in embodied forms vastly beyond our own – would likely have values wholly alien to us. Assuming that AGI would, with any reasonable likelihood, converge on human values or human-friendly values strikes me as a dangerous idea.

All that said, Scott’s point about there being a kind of “moral bedrock” may well have merit, and I suspect time will tell. Given Scott’s insistence on seeing things as they are, I’d guess that his perspectives will evolve as new experimental results come in. As will mine.

What did you think of this episode with Scott?

Drop your comments on the YouTube video and let me know.

Follow the Trajectory