This past week I posted the following on Facebook, in reference to my article Where Should Humanity Steer Sentience?:

If utilitronium (a conscious, super-blissful substrate) could simultaneously explore future kinds of “the good” (beyond utilitarianism), and consistently bolster its own strength and capabilities (rather than merely “tiling the universe” with one uniform kind of bliss-hardware), then it would be hard to argue against creating it and only it.

The original post was longer (full post here), but it more-or-less got to the crux of my cause, and my belief that future intelligences (augmented humans and/or AGI) should be created to explore the good itself, in addition to optimizing for utilitarian good (i.e. positive qualia).

Michael Johnson of the Qualia Research Institute is one of maybe half-a-dozen people I know of who have given ardent thought to the idea of optimizing the universe – that is to say, “doing good” in the highest conceivable way – by making sure that the trajectory of intelligence leads to the proliferation of the good.

He left a lengthy comment on my post about this topic of steering the future well, and it was right enough for me to not only want to respond but to use it as a jumping-off point for future discussions.

Michael’s comment is indented and italicized; my comments are interspersed below, unindented and without italics. This post is broken out into two threads:

Thread 1

Dan, I really respect how you always cut to the important stuff. I can’t think of any questions more important to get definitive answers to. These are also really hard questions, which have consumed many careers, often with little to show for it.

I have a saying that “never do metaphysics when you can do physics” — my intuition is that we really need to synthesize this topic into a process where we can put energy in and reliably get progress out.

My expectation is this would look something like the following:

    • First, make a typology of scenarios about what ‘the good’ is — something flowchart-y which essentializes things into binary yes/no scenarios.
    • Second, figure out what evidence would be for each scenario — what would increase our confidence that *this* branch is more likely and *that* branch is less likely?
    • Third, actually collect that evidence and see where it points.
    • Naturally, the last step is to use our resulting knowledge about the good to optimize the universe.

But this [last step] is a “measure thrice, cut once” sort of thing and I don’t like to focus on it given our fairly meager level of knowledge. More generally, I worry that it’s much easier to destroy value than create value, and my current aesthetic here is that we should treat our cosmic endowment, and the anti-dissipative structures inherited from millions of years of evolution, with great reverence. “A lot of suffering went into creating this thing that sorta works. It may feel like it’s easy to improve upon, and making something perfect is in fact our core goal, but let’s be delicate about remaking ourselves and the structures which support us, for the state-space is dark and full of terrors.”

I agree with the general order of your steps, and I hope that our approaches to mapping our future options are fruitful. We’ll have to understand not only consciousness, but also game theory/power dynamics and relative risks/emergent risks… it’s a lot to take in when carefully considering the long-term. I also agree that it must be easier to destroy value than build it. We’re on a nice little local maximum of brainpower and relative control over our environment – let’s appreciate that and tread cautiously.

I think the trajectory must continue, and will with or without us, but that we should almost certainly pause and understand our options before throwing our ingredients into the cauldron of the future and stirring violently.

I’ve argued that this pause will probably have to be a global one, and that the nations of the world will have to get on the same page about the two big questions (what are we turning into?… and how?) in order to do anything other than an arms race.

Anyway, to offer some themes for the flow-chart about ‘the good’, we could start with:

  • Does ‘the good’ exist? If no, exit at nihilism; if yes, continue
  • Are there many different (divergent/orthogonal) kinds of goodness, or is it essentially unitary/convergent?
    • It’s been said there are only three numbers in physics: zero, one, and infinity. This could suggest there being three scenarios here: the good does not exist, there is one dimension of goodness, or there are infinite dimensions of the good
  • If there are many different dimensions of goodness, in what senses are they orthogonal (can be optimized independently) vs conflicting?
  • If ‘the good’ exists, this implies that things can be ‘more good’ or ‘less good’. But does ‘perfect’ exist?
  • Should we look at atoms or bits in determining whether something is ‘good’?
  • Is consciousness a natural kind?
  • Is consciousness where ‘goodness’ and ‘badness’ live?
  • Is valence a dimension of the good?
  • Is valence the only dimension of the good? (philosophically speaking, not logistically speaking — lotus eaters die out)

The “moral” good may always be what arbitrarily behooves the observer. It may always be whatever benefits the entity judging or valuing things. I’m not sure if that leaves us inherently with nihilism. Pain-pleasure seems to break from this arbitrariness because we can (presumably) all agree that pain is not preferable and pleasure is preferable. That said, there may be many, many other kinds of axes of “goods”, none of them with some eternal claim to being “highest”, and again I’m not sure that leaves us with nihilism.

This seems to be in line with your “infinite kinds of goodness” scenario – but I’m not closed off to the idea that there may eventually be some great and convergent singular good.

Even with that slippery, arbitrary grounding of “infinite goods” (beyond even consciousness, possibly) the aim could be exploring kinds of minds and the kinds of goods they could experience, to find useful, pleasurable, new, higher goods. I suspect the only way to answer the important questions you’ve asked above is with intelligence vastly beyond human level. That said, I agree that we should make what progress we can in our hominid form before we blast off into handing the baton of “good exploration” to something beyond us.

I’m flattered that you used lotus-eaters – though I suppose Homer gets the credit on that.

The keys would be to:

    1. Split this into cleanly differentiated scenarios which cover all logical possibilities;
    2. At every point, ask “what would evidence for this look like? what would evidence against this look like?” — I think this is really an underused frame in philosophical debates and people should be asking their conversation partners, and themselves, this question all the time.

If we made a good, principled flow-chart, we should be able to identify the positions of key thinkers and organizations on this chart. E.g., perhaps we could locate Bostrom as:

    • The good exists
    • There are many dimensions of the good
    • We should look at bits to find the good, not atoms

If we made a full flow-chart, we could also identify gaps — e.g., ‘there’s a spot in the possibility space for the nature of the goodness *here* where no organization is doing research. Maybe someone should look into what we should do in this scenario, in case it turns out to be true.’

My provisional positions:

    • The good exists
    • The good is unitary
    • We should look at atoms to find the good, not bits
    • Consciousness is a natural kind
    • Consciousness is the home of value

I really like this idea of the flow-chart… though the ontology required to create one would be massively challenging. Similarly, I don’t know if collecting the ideas of smart thinkers would point to truth, but maybe it could give us hints on where to start, and arguments for or against different scenarios.

It might also be useful to think about objectives, where we’re headed. I list 8 options here in the Steering Sentience article, but I suspect that more variants could be developed and fleshed out, and we might have a flow-chart of qualia, and of end-goals, and of power dynamics, and of other things. By golly, this would get complicated, but hopefully, it’s a worthwhile aim.

I’ll try to do mine.

  • The good exists
  • There are many dimensions of the good
  • Positive qualia is as important a moral idea as we have today
  • We should pursue the optimization of positive qualia and the exploration of good itself, while ensuring that intelligence becomes more sustainable and able to survive, not merely more blissful and more intellectually capable/understanding (roughly Steering Sentience Option 8 [see link above])

I don’t know enough about “natural kinds” to make a call there. And frankly, I suspect that there are realms beyond consciousness. Here’s a quote from a previously referenced essay:

Consciousness as one arbitrary checkpoint in life’s development. Once there were just proteins. Then cells. Then multicellular organisms. Then organisms with sensory organs. Then muscles. Then consciousness (though when consciousness arose is less certain than the other steps). So now this morally worthy “stuff” (consciousness) has emerged. Isn’t it plausible to suppose that there is another blooming of “stuff” beyond consciousness? At a higher level of complexity or brainpower or in other brain substrates, we should expect that new vistas of morally relevant “stuff” will bloom forth – can we really argue that such a thing won’t happen? Isn’t it plausible that such a thing is in fact likely, given enough time and development? Optimizing for the utilitarian good would then be wasteful, as it would prevent us from optimizing for higher kinds of good.

I might be wrong there – but I don’t think anyone can write off the possibility of there being higher things than consciousness, maybe that we can’t explain or imagine, which are vastly more morally relevant than it. The argument that we can’t imagine such a thing doesn’t seem valid – because there are arguments (like the one I list above) that such a thing is plausible. Crickets couldn’t imagine freight trains or Marxism or prenuptial agreements or sonnets, yet all four of those things exist today.

Is valence the singular dimension of intrinsic value? I don’t know. I think it’s a more solid feature of the world to hang the wreath on than most other contenders. In PQ I note “I am not claiming valence is necessarily the only quale of moral relevance, merely that it is the quale that is most obviously morally relevant.” — today, I think I’m forced into saying that provisionally, it does seem most plausible that valence is the core intrinsic good — it is the only non-philosophically-terrible theory of intrinsic goodness we have, and with enough neuroscience, evolutionary biology, and complexity theory, you can rederive almost any other existing theory of the good in terms of valence. I think this gets even more philosophically palatable when we bring in STV; symmetry is crazy-important in physics, and if the symmetry-valence correspondence I put forward in PQ holds, it makes sense that valence would hold some special position in the pantheon of qualia.

I suspect there will be advancements in understanding the nature of goodness that surpass our current understanding — but that these new forms of understanding will still rhyme with valence. By analogy, it’s likely there will be theoretical breakthroughs in understanding the nature of mathematical symmetry… but the new models of symmetry will still rhyme with our current understanding. In 10,000 years we might be discussing the exact mathematics of hypersymmetry in the E8 variant that describes our phenomenology.

I agree that – at present – it’s the best tool we’ve got – it has so many great qualities.

I have much less confidence that all future moral ways or modes will “rhyme” with it. I think we can’t imagine beyond consciousness, but – per my previous quote – I think that future intelligences will be able to experience and conceive of higher kinds of experiences. Maybe they will be gradients of sentience that we don’t understand, but maybe they will be meta-sentience, or some other kind of experience that almost doesn’t tie to valence or consciousness-as-we-know-it at all. I agree about its current value – but not that it will hold in the long-term. It might, but I suspect it is just as likely to not.

But in the end, symmetry will still be central in physics, valence will still be central in value, and STV will still be true. But I hold my door open for other contenders and I think entertaining new hypotheses is the best way to ensure I’m not just fooling myself. There are likely gaping ‘unknown unknowns’ out there. I *love* your phrase “that our hominid brains can’t understand but are borderline obligated to explore”.

It also feels significant to offer the substantial caveat that pleasure may in fact be a poor optimization target; e.g. today it seems a truism that chasing happiness is a poor way to maximize happiness; optimizing meaning and health is a much better strategy. Perhaps there exists some translation of this observation for societies as well.

At any rate, and at the risk of repeating myself, I think we do need to approach questions about the good as a philosophy of science task — how do we turn these questions into a process that will generate results? What are the possible positions one could hold about the good, and what would evidence look like for each position?

Do you also suspect that this should be a united global effort – akin to what the United Nations does? (see: The Sustainable Development Goals of Strong AI)

Thread 2

I think we’re basically in agreement about the structure of the problem; we just differ on some of the probabilities we assign various scenarios.

It feels like the real question is not whether it’s crazy-important to figure these questions out, but rather how we go about doing it in the real world. I’m skipping around a bit on your post, but some late-night thoughts:

“The “moral” good may always be what arbitrarily behooves the observer. It may always be whatever benefits the entity judging or valuing things. I’m not sure if that leaves us inherently with nihilism.”

Maybe! But, to shift the discussion a little bit from morality to truth, I also think in the case of ‘value research’ we should assume a more objective answer exists, and try to find it. We can always give up if we fail — but, the scenario where the answer exists and we fail to even systematically look is the actual sad one. Speaking from inside-view, I think QRI is ‘doing the thing’ here and gathering evidence that valence is real, it’s not just some preference-based hall-of-mirrors, and I hope we can get more people paying attention to our results.

I think efforts like yours should continue, and I hope we can pick up on more of the “stuff” of morality. I’m not sure there’s a bedrock to stand on – or at least any kind of bedrock that won’t be obliterated by the next order-of-magnitude increase in intelligence – but any swing at grasping consciousness is more than worthwhile in my opinion.

“Do you also suspect that this should be a united global effort – akin to what the United Nations does? (see: The Sustainable Development Goals of Strong AI)”

I’m a little cynical here: coordination-by-committee, which I think a UN-style effort would essentially be, tends toward dysfunction on issues involving incentives, zero-sum situations such as those involving power, and novel scenarios which require substantial vision to navigate. Similarly — I think the idea behind the Asilomar Principles for AI is beautiful, but the principles themselves do not feel ‘crisp’ or ‘realpolitik-aware’ enough to coordinate around. I don’t have a positive answer here (and, I fear, most players feel the same). I do think a lot of bright people are working hard in this area and I hope to be pleasantly surprised by good things happening.

Beyond that, the following threads seem really significant in how the world navigates these topics:

As you note:

“I think the trajectory must continue, and will with or without us, but that we should almost certainly pause and understand our options before throwing our ingredients into the cauldron of the future and stirring violently.”

I TOTALLY agree with everything here, that we’re racing but we don’t know what we should be racing toward. But we need to be realistic about such a pause. In a multipolar world, which we seem to be in, the logic of competition and defection means we cannot, realistically, *expect* a pause. So — it’s a race.

I agree, “pause” isn’t the right word. “Approach with caution” might do better. Or “get everyone on the same page about how dangerous this is.” But I agree there’s no pause, not even if huge swaths of Earth’s population wanted one. The benefits of plowing ahead – and the forces that keep that plow driving forward – seem far too strong. “Influencing the trajectory” is the highest hope we have.

Though I suspect that the game theory non-zero sum stuff may ultimately win out, and that the last words of Alexander will hold as true in the future as they have in the past.

As I noted, we do need a process whereby we can put effort in and get clarity about ‘what is goodness’ out. We have a few options, e.g.:

    • Build AGI and let that figure it out;
    • Build the flowchart and do the science to gather evidence;
    • Go with the current best option, valence.

I’m generally in favor of preserving optionality — my ultimate commitment is to build something perfect. I’m offended the universe is not perfect, and I feel it’s humanity’s (or at least my) duty to help. I don’t know what’s perfect, and I’m focusing on valence because I think that’s the most plausible ultimate answer we have. I’ll be happy to update my target given further knowledge; preserving our ability to update seems really important.

But also — valence is a really philosophically defensible target. The major downside is, frankly speaking, it’s low status to be a lotus-eater (love the term and I did get it from you).

It’s low status for really good game-theoretic reasons: societies which uphold values other than pleasure have tended to be vastly more successful. A theory being low-status doesn’t mean it’s philosophically incorrect, however, and if logically it seems solid, it being low status might actually be something we should control for by upping our estimate that it’s correct (since we humans prefer high-status theories). Complicated topic I guess — but I think reverse-engineering valence is a thread that’s *really worth putting serious resources into pulling.*

Two points about lotus-eating being low status.

First, I think it’s true in a real sense. In Lotus Eaters and World Eaters, I argue that lotus-eaters are looking to escape the state of nature via digital means, while world eaters compete even more powerfully in the new, digital/physical state of nature. Those who optimize for pleasure alone will never own the substrate that houses other human experience, or powerful AI.

Ownership of that substrate – i.e. substrate monopoly (the substrate that the little blissful sugar cubes or wireheaders, even if “done right”, run on) – is, as far as I can tell, ultimate power – at least insomuch as we can imagine power today. So yeah, it’s low status to be at the whim of those who can turn you off at any time. It’s weakness and we should be wary of being helpless drooling pleasure-mongers, even if it’s all happy-go-gradient-of-bliss-y a-la David Pearce.

There is no actual escaping the state of nature – and only the strong are safe. I wish it was otherwise but by golly I don’t see that fact changing anytime soon.

Second, it may be perfectly possible to optimize for both (a) mental capabilities that are conducive to creation, power, action, influence, invention, etc… and also optimize for (b) bliss in many blooming and wonderful forms.

We assume that blissful folks are chilled out and just enjoying life… but that’s only in our present manifestation as hominids (read: Unquestioned Assumptions About Happiness and Motivation) – we can imagine minds that optimize for both – and I expect that’s what world eaters would want, also (so long as that bliss doesn’t prevent them from becoming more capable than their competitors).

Even with a small probability of success, the expected benefit from this is just crazy-high, both in terms of AI safety and well-being neurotech.

Anyway, in terms of optionality — I worry there’s a bit of a race condition with AI today. If we get to a solid understanding of consciousness *before* we get to AGI, I think the trajectory of the universe might be permanently, radically better. If we don’t, well, there might be path-dependent lock-in that screens off certain scenarios of radical value creation. So I’m not particularly in favor of option #1, and so I think consciousness research is plausibly one of the most pressing areas today, particularly if e.g. OpenAI / DeepMind timelines are correct.

I’m 100% with you my man, I’ve written in support of this stance a few times. Consciousness is massively important.

Finally — you might enjoy the following pieces (though I might have linked them to you before):

https://opentheory.net/2019/09/whats-out-there/
https://opentheory.net/2019/02/simulation-argument/

Reading them now, thanks, Michael.

 

Header image credit: soldbyres.com