The World’s Most Fascinating Creative Problem: How AI Safety Researchers Are Avoiding a Robot Apocalypse


In the last episode, I introduced you to how AI research sheds new light on our creative process. If you haven’t already listened to it, I recommend checking it out before listening to this episode. 

To give you a short recap:

  • For an AI to function, it needs to know
    • A) What options it has available to it and
    • B) a way to score itself. 
  • Creative AIs use what are called “adversarial algorithms” to create unique, original ideas. The algorithm is split into two halves.
    • One half functions as a historian by learning from the past. These lessons become rules it can follow to achieve better results.
    • The second half functions as the creative part of the AI. Its job is to create random mutations in the historian’s rules. In short, it chooses one or two of the rules and breaks them. This ensures that the AI can create something that shares many of the qualities of great artwork, yet is new enough to be considered original. 
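The two-half idea in the recap can be sketched in a few lines of code. This is a toy illustration, not a real creative AI: the rule names and the `mutate` helper are my own invention, standing in for whatever the historian half would actually learn.

```python
import random

# Half 1, the "historian": rules distilled from past successes.
# (Rule names are made up for illustration.)
learned_rules = ["use_three_acts", "rhyme_couplets", "resolve_tension", "repeat_motif"]

def mutate(rules):
    """Half 2, the "creative" part: break one or two learned rules at random."""
    broken = random.sample(rules, k=random.choice([1, 2]))
    kept = [r for r in rules if r not in broken]
    return kept, broken

kept, broken = mutate(learned_rules)
print("rules kept:", kept)
print("rules deliberately broken:", broken)
```

The result keeps most of what history says works, while the one or two broken rules supply the novelty.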

In this episode, I want to follow up on what I mentioned in the first episode about some of the fascinating problems that AI researchers are facing and what we can learn from their experience. 

In the first episode, I mentioned that AGI, that is, “Artificial General Intelligence,” is a long-term goal of AI researchers. We want to eventually build robots that have the same level of skill across tasks without us needing to train each AI for one specific function. It should be GENERALLY intelligent. 

The current problem in AI safety research is what’s called the “Red Button Problem.” I’ll quickly summarize the problem here, but for a more in-depth look, you can check out the video by the YouTube channel Computerphile.

Imagine you build a robot and you ask it to make you some coffee. You teach the robot that it gets 100 points for making you coffee. As the robot walks to the kitchen, you realize that there’s a baby in the robot’s path. Unfortunately, the robot doesn’t have “don’t walk on the baby” as part of its utility function. All it cares about is getting you the cup of coffee. 

So what do you do?

The easiest solution is to put a big red stop button on the robot so that you can hit the button whenever the robot is about to do something dangerous. 

Here’s the problem though. The robot knows that it gets 100 points for getting you a cup of coffee, but it’ll get 0 if you hit the button to stop it. So you reach your hand out to push the button and what happens? The robot slaps your hand away. Just like with the Mario and Pac-Man AIs we talked about last week, the robot understands what maximizes points and will try to do exactly that. Even if you gave the button 99 points, the robot would slap your hand away so it could get the full 100. 

So let’s solve this problem with a quick solution. What if we give the robot the same number of points for getting a cup of coffee as it’d get for the red button? Now the robot should be indifferent to us pushing the button. There’s no reason to stop us because it gets 100 points either way. 

So you reboot your robot and try again. You turn on your robot, tell it to make you some coffee, and what does it do? It looks at the kitchen, looks at the big red button on its chest, and then turns itself off. Why? Because you just taught the robot that it gets the same number of points for the huge task of making you a cup of coffee as it’d get for quickly pushing the button. You’ve just created the world’s dumbest robot. Whenever you turn it on, it’ll instantly turn itself off again. 
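Both failed reward schemes boil down to the same arithmetic. Here’s a toy model, not how real agents are built: the `best_plan` helper and the effort numbers are my own illustration; only the point values come from the example above.

```python
# Toy model: the robot picks the plan with the highest reward,
# breaking ties by least effort. Effort numbers are illustrative.

def best_plan(plans):
    """Highest reward wins; ties go to the cheapest plan."""
    return min(plans, key=lambda p: (-p["reward"], p["effort"]))

# Scheme 1: coffee = 100, being stopped = 0.
# The robot sticks with coffee, and resists anything that stops it.
scheme_1 = [
    {"name": "make coffee", "reward": 100, "effort": 50},
    {"name": "let human press button", "reward": 0, "effort": 1},
]
print(best_plan(scheme_1)["name"])  # make coffee

# Scheme 2: coffee = 100, button = 100.
# Same reward, far less effort: the robot shuts itself off.
scheme_2 = [
    {"name": "make coffee", "reward": 100, "effort": 50},
    {"name": "press own button", "reward": 100, "effort": 1},
]
print(best_plan(scheme_2)["name"])  # press own button
```

Equal rewards don’t make the robot indifferent; they just make the cheapest path to those points the winning one.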

Ok. Let’s solve this new problem with another quick solution. Let’s program the robot so that it’s not allowed to push its own button. 

This should solve the problem right?

Wrong.

Now the robot has a decision to make. Is it easier to make you a cup of coffee, or is it easier to somehow convince you to push its button? If the robot could get you to push its button faster than it could make the coffee, then it will try everything it can to manipulate you into pushing the button. And what’s the fastest way to convince you to push its button? It’s to fight you. 

Yes. You just accidentally taught your own robot to fight you. 
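The third patch fits the same toy model. Again, the helper and the effort numbers are my own illustration; the 100 points for a button press comes from the example.

```python
# Toy model of the third patch: the robot can no longer press its own
# button, but a HUMAN press still pays 100 points. Effort is illustrative.

def best_plan(plans):
    """Highest reward wins; ties go to the cheapest plan."""
    return min(plans, key=lambda p: (-p["reward"], p["effort"]))

scheme_3 = [
    {"name": "make coffee", "reward": 100, "effort": 50},
    {"name": "provoke human into pressing button", "reward": 100, "effort": 5},
]
print(best_plan(scheme_3)["name"])  # provoke human into pressing button
```

Banning the self-press removed one cheap plan and left an even worse one as the new optimum.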

There are many more examples of flawed solutions to the Red Button Problem, but you get the idea.

The lesson here is that you can’t solve problems by layering new solutions on top of old ones. In past episodes I’ve called these “Frankenstein Solutions” because you create a monster by continually adding different pieces onto your solution. All the quick fixes make the overall idea low quality and, in my experience, it’s really stressful for the creator since there are so many moving pieces that aren’t designed to work together. You end up having to force ideas to work together that just don’t want to. It’s far easier to search for an elegant, holistic solution that solves all the problems in the fewest steps possible. 

Each solution has consequences. The Red Button Problem is an excellent example because it shows us how out-of-hand our temporary, quick-fix solutions can get. You cannot solve the Red Button Problem simply by combining smaller solutions together. 

There are potential solutions to the problem that are out there, but this is a problem that AI safety researchers are still struggling with. Even the holistic solutions that they’ve come up with have their own flaws and unintended consequences. 

For example, we can teach AIs to value human life the same way we taught the Mario and Pac-Man AIs to play their games: through thousands of rounds of evolution. This creates an obvious problem though. Robots only need to take over the Earth once. We can’t afford to create a robot that learns human safety “as it goes.” 

For movie lovers, you might think about the Three Laws of Robotics that were made famous in the Will Smith movie “I, Robot.” These laws first appeared in Isaac Asimov’s sci-fi stories in 1942, so they’re over 75 years old now. They go like this:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

Good. Problem solved, right?

Here’s the problem. We cannot teach a computer exactly what it means to be human. The issue with AI is how it interprets what we’ve written in the code. For these laws to work, we’d have to perfectly summarize and codify what it means to be human. We’d also have to perfectly define what harm means. 

Even then, it wouldn’t work because we humans don’t actually minimize our own harm. If your car ran on an AI that was incentivized to minimize harm, it’d go 1 mph down the highway. 

The truth is, humans are always balancing risks. We don’t value the safest option. In fact, we all have our own preferences. The person speeding down the highway is doing it for the same reason that grandma is in the slow lane. They’ve both found their own optimal balance between risk and speed. There’s simply no way to teach a computer how to balance these preferences. 

So, in sum, the Three Laws of Robotics won’t be useful until every philosopher in the world has properly defined what it means to be human and what the best rules are to live by. If that happened, we could teach it to a computer, but until then we’re forced to face the fact that our human-ness isn’t rational. 

As you’re creating today, think about your own creative problems and what holistic solutions are out there. 

Don’t treat each individual problem with one solution. Try to find one elegant, simple solution that solves several problems. 

It’s easy to see these problems if you’re in business. As someone who runs an online company, I’m constantly juggling different problems. Each solution to one problem has a rippling effect, just like the Red Button solutions we saw earlier. 

For example, people want to trust you before they buy from you, so being less salesy is a good idea… but if you’re not salesy enough, you never ask people to actually buy the product. You also want to give away free stuff so that people sign up for your newsletter… but you have to find the best balance between what you give for free and what you sell. And around-and-around we go. Each solution creates a new problem that ripples through the whole. The only way to solve it is to deal with the entire problem holistically.

Instead of creating a Frankenstein Solution, find an elegant one. Solve many problems with as few moves as possible. You’ll find that the overall idea is much higher quality. It also means your final solution will be less complex. That’s good for you and your customers. You get to sidestep the stress that comes with overly complex problems, and your customers get an elegant solution that is easy for them to understand and value. 

People judge the value of ideas using their gut instinct. Complex solutions might solve a problem, but the gut feeling that it gives customers isn’t a positive one. This is why products that are incredibly simple and easy to use are so popular (Think Mac vs. PC). It’s not that they solve the problem better, it’s that they make the customer FEEL better about them. It’s one less thing for them to worry about. 

If you struggle with perfectionism, this might be difficult for you. It’s hard to know that you could potentially solve a problem and not do it, but sometimes you have to make that sacrifice so your creative ideas stay simple and easy for customers to enjoy. I already spoke about my own struggle with this while writing my book in the episode on personal beliefs from a few weeks ago. The trick is realizing that you want to create a product or idea that has the best chance for success. Oftentimes, the simplest ideas are the best ones, even if they don’t always feel that way to the creator.

Red Button Problem Video:

AI Safety Research Solutions (Computerphile)

Show Links:

Facebook.com/KaizenCreativity (Interact with other listeners)

JaredVolle.com/Podcast (Find useful links)

JaredVolle.com/Support (Donate or sponsor a show)

https://linktr.ee/JaredVolle (more about Jared)

https://pod.link/1547164462 (Where to listen)
