Is AI Safety Research a Waste of Time?

Many months ago, Peter Voss came through San Francisco and we sat down for a coffee at my favorite place in Hayes Valley, Arlequin. We chatted a bit of business first. Peter runs, and his work on natural language is a lot of fun to discuss. Similarly, we chatted a bit about Emerj, our SEO strategies, our trials and challenges with hiring for key roles, etc.

Quickly, however, we got down to a topic that Peter has thought about for much longer than I have: AGI. It was nearly 5 years ago when I first interviewed Peter about strategies to develop artificial general intelligence.

Peter has written a good deal about AGI and ethics, and we ended our last chat in the heat of disagreeing about whether more intelligent machines would inherently be more morally “good”. I promised Peter that I’d put together my own thoughts on the matter and keep the conversation going for our next chat – and this short article is an attempt to do that. I’m sure that after another coffee with Voss, there will be more to write about!

The bolded and indented quotes that I draw from here are copied directly from Peter’s own short article, “AI Safety Research: A Road to Nowhere”.

Peter begins the article:

I’ve just returned from a two-day conference focused on the ethics and safety of advanced AI held in NYC. About two dozen speakers, including several luminaries such as Stephen Wolfram, Daniel Kahneman, Yann LeCun and Stuart Russell explored ways of ensuring that future AIs won’t cause us harm, or for that matter come to harm.

Notwithstanding some very smart speakers, and a several interesting talks and discussions, overall these deliberations seemed little more than mental masturbation.


Here are two core problems:

Most of the talks involved questions of either the moral disposition or status of an AI — yet pretty much everyone agreed that they had no idea of how to define morality, or to select the right one for an AI.

Similarly, while the issue of machine consciousness was considered key, general consensus was that nobody really knows what consciousness is or how one could know if an AI possesses it.

I’m with you on the point that our limited knowledge of consciousness is disappointing and limiting. Given what I know about the world and morality at present, consciousness is the bedrock of moral relevance as far as I’m concerned (or, as I’ve framed it in my ontology: consciousness is what counts), and our limitations in understanding it are a damned shame. I dedicated the first 4 minutes of my first TEDx talk to articulating this idea alone.

Furthermore, mainstream assumptions in this field are rife with questionable core assumptions:

1. That we can (or should try to) explicitly design or craft a utility function to ensure that the system acts morally (whatever that may mean).

2. That advanced AI’s moral ability or knowledge are independent of its intelligence.

I tackled this in a previous article about Voss’s core assumption that more intelligence tends to mean more moral behavior, which I disagree with vehemently (but cordially and non-dogmatically, of course).

3. That reinforcement learning will be a significant aspect of advanced AI

4. That AIs will inevitably develop strong, overriding goals of self-preservation and rampant resource accumulation

I hope to have the time to write an article about this in the somewhat near future, but I’ll say that it doesn’t seem completely far-fetched that this might happen. Some could argue that Omohundro and Hawking’s various work in this area of AI self-preservation might be a little overreaching at times, but I wouldn’t agree that it’s all in vain.

5. That there is an inherent, hard problem to get alignment between what users really want and what the AI thinks they want (The Alignment Problem)

Since we haven’t yet run into this situation in the real world, it seems hard to write it off as “handled”, as if the future had already happened and our pet methods for approaching alignment were proven in full.

6. That a system will achieve general and overall super-human intelligence with essentially no warning, or with sudden, totally unpredictable behavior

Indeed, “AI Foom” is a debate unto itself, and one that I’m not qualified to comment on in depth. Maybe nobody is, but I’m only familiar with Yudkowsky and Hanson’s arguments on the matter.

7. That closed logic or mathematic models can tell us much about how advanced AI will behave

AI safety is a genuine concern, one that we should certainly pay attention to. However, little progress will be made and much unnecessary hand-wringing and money will be wasted pursuing it with no clear understanding of either the nature of consciousness or morality — and with a starting point that embodies several incorrect assumptions.

Speakers repeatedly claimed that ‘no one knows what consciousness is, how we can determine the right moral code, how to solve the ‘alignment problem’, or how to imbue an AI with moral knowledge’. Perhaps the tens of millions currently funding AI safety research could be spent more effectively by involving more people who do not claim such ignorance.

I’m interested in your simplified and uniting view of morality that would finally solve all of these issues.

I’m not personally convinced that AI safety research is a total waste of time – but I could concede that some of it might be overly preemptive, or grasping at straws. Given the nascent stage of the field, there are arguments that funds should be sent elsewhere – but given the reasonably swift development of the field, there are arguments that more should be donated to AI safety than is being done now. I’m not firmly in either camp, but I’m unable to write it off as a waste of time.

It seems that, in the worst-case scenario, we’re laying some foundational ideas for the “possibility space” of problems in managing AI, and the constellation of possible responses and approaches to deal with these challenges. Maybe when AGI is on the horizon, it’ll be drastically different from what humans imagine now, and all this thinking will be for naught, but it seems that – at least on some level – it’s worth a shot. At least some of it.

This article is part of a broader theme of “Reflecting on What I’ve Read” – where I consider and challenge the ideas of my friends in the AI ethics world, and of authors and thinkers I’ve read recently. If you have suggestions for topics I should consider for future AI and ethics-related articles, feel free to use the contact form.

Header image credit: IT Pro