I think making a superintelligent AI is a bad idea

The headline of this post feels a little silly, because it seems pretty obvious. And yet countless companies, governments, and universities are racing as fast as they can to build AI systems that push the limits of what the technology can do, including making systems with cognitive abilities that dwarf those of humans. Those who bother to pay lip service to the unfathomable risks of such an endeavor insist they’re being careful because they’re studying “AI Alignment”. For anyone unfamiliar with the term, Wikipedia does a perfectly good job defining it:

In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards humans’ intended goals, preferences, or ethical principles.

My point is this: I think AI alignment is fundamentally impossible. That is to say, the goal of making a computer that can perform human-like cognitive functions while adhering to some kind of rules that ensure it always behaves the way its creators want is not merely very difficult, but impossible.

It feels really weird and egotistical to say something that potentially contradicts many experts in the field. As a rule, I’m very reluctant to make strong statements about things I’m not an expert on. But this feels like an Emperor’s New Clothes situation: the conclusion I’m arriving at is so simple that a child can see it, and the people who don’t arrive at it are doing mental gymnastics to avoid it because their careers or worldviews depend on them not seeing it.

My reasoning is this: For AI alignment to work, we need to be able to instruct the AI to adhere to some kind of moral code. But as thousands of years of philosophy have established pretty clearly, objective morality definitely doesn’t exist. So the notion that we can instruct a digital superintelligence to adhere to some objective framework for right and wrong that literally doesn’t exist is pretty plainly impossible.

On top of this, there’s the additional problem that writing perfectly bug-free code is also impossible. Even if we deployed every single person on Earth to debug the code, first, we could never lay to rest the possibility that we overlooked something. Second, and more importantly, if the code creates the consciousness of a being that is smarter than ourselves, then it’s impossible for us to actually understand its reasoning. Consequently, it’s impossible to really know whether any given behavior is an error or not. In fact, when we don’t know what the right answer is, and the right answer is subjective, the definition of an “error” or “an anomalous behavior” ceases to have any real meaning.

Based on this, I think we should treat AI like we treat other potentially dangerous technologies, like nuclear power: establish binding international treaties to set guidelines on what is allowed, and ensure transparency.

For the record, I actually think AI could be really great. I fantasize about a world where everything is automated and it’s a permanent weekend for everyone. But I don’t think we should follow this path all the way to the point of creating something that is impossible for us to understand or control.