We will never create true artificial intelligence (if that is really what we want) until we know more about how the human brain works.

 Tech entrepreneur and author Max Bennett explains how AI learns, where it falls short, and how it stacks up against our own intelligence. 

As it turns out, what’s easy for humans is hard for AI, but AI is better at doing some things that are quite hard for us. Mostly, what AI teaches us is just how remarkable the human brain is – it is much better at continued learning than AI is, and it requires less input to come to conclusions. But… Can we trust it?

Phil Stieg: You have probably heard about ChatGPT, a very popular artificial intelligence technology. You can ask it any question and it will give you an answer. But how exactly does that happen, and how smart is it?

My guest today, Max Bennett, is an AI entrepreneur who can help answer those questions. Max tackles this enormous task in his book, A Brief History of Intelligence: Evolution, AI, and the Five Breakthroughs That Made Our Brains. Max, thanks for being with us today.

Max Bennett: Thanks for having me.

Phil Stieg: So I have to admit, when I got the book in the mail, I had a completely different sense of what I was going to be getting into in reading it. It seems that the thesis of your book is a little bit of reverse engineering.

Understanding how the brain works is going to help us understand how you can start building a comprehensive AI program. I think it’s a very unique approach. What prompted you to do this?

Max Bennett: There’s so much work in the neuroscience community on reverse engineering how the human brain works today. But as anyone in the neuroscience community knows, this is an astronomically difficult task. The problem is that the human brain is just ridiculously complicated. There are 100 trillion connections and 86 billion neurons. These connections are way more complicated than a connection in an artificial neural network. They pass tens to hundreds of different neurochemicals between them, et cetera, et cetera.

And so I became interested in trying to take a different approach, to add a new tool to our toolbox for understanding the brain, which is trying to simplify the human brain, but not by decomposing it into its constituent parts.

The alternative approach is just to roll back time and look at the simplest brains, and then track the series of modifications that led to the human brain. So that was sort of the idea.

Phil Stieg: So what kind of simple brains did you look at?

Max Bennett: What is so interesting is if we look at the simplest animals in the animal kingdom, with by far the simplest brains, the fundamental thing that they do is categorize the world by this attribute of goodness or badness.

If you look at a nematode, a very small, worm-like creature with 302 neurons, it has one of the simplest brains in the animal kingdom. It can do remarkably smart things, despite being totally unable to see anything in the world. It just has some very simple chemosensory neurons that detect the presence of certain food smells and the presence of certain predator smells, and it uses these stimuli to engage in an algorithm called taxis navigation: when good things are increasing in concentration, it keeps going forward, and when bad things are increasing in concentration, it makes a random turn. And this basic algorithm of taxis navigation enables very simple creatures to find food and avoid dangerous areas.
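
To make that concrete, here is a minimal Python sketch of the taxis rule just described. The smell field, step size, and food location are invented purely for illustration, not taken from any real nematode model.

```python
import math
import random

def food_concentration(pos, source=(10.0, 10.0)):
    """Toy smell field: concentration falls off with distance from the food."""
    return 1.0 / (1.0 + math.dist(pos, source))

def taxis_navigation(steps=200):
    """Go straight while the good smell is increasing; make a random turn
    whenever it is not (the same rule, inverted, would handle bad smells)."""
    pos = (0.0, 0.0)
    heading = random.uniform(0, 2 * math.pi)
    last_smell = food_concentration(pos)
    for _ in range(steps):
        smell = food_concentration(pos)
        if smell < last_smell:                        # things got worse
            heading = random.uniform(0, 2 * math.pi)  # random turn
        last_smell = smell
        pos = (pos[0] + math.cos(heading), pos[1] + math.sin(heading))
    return pos  # tends to end up near the (10, 10) food source

print(taxis_navigation())
```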

And so what’s amazing about that is, you realize that the first brain’s goal was to categorize the world into good and bad things and engage in this very simple algorithm for approaching good things and avoiding bad things.

Realizing this is the foundation, and then tracking the path forward, is really useful, because in AI there’s a lot of discussion about the reward hypothesis, which asks how much of behavior actually derives from reward, and we can see that at the very foundation of brain evolution is this notion of reward.

Phil Stieg: Tell us what simulating is and its relevance, then, to AI.

Max Bennett: So humans, and really, all mammals, show a lot of signs of being able to consider states of the world that are not the current one. The colloquialism we use for this is imagination.

So, for example, David Redish did my favorite studies on this. You can put a mouse in a maze, and it will learn to navigate this maze to find food in various places. But when it reaches a choice point, a fork in the road, it will pause and toggle its head back and forth, and then it will make a choice.

So this was just a speculation: what they’re doing is imagining possible paths and then selecting the one that, in their imagination, succeeds in getting them where they want to go.

But there was no proof that they were actually doing this until the early 2000s, when David Redish and his lab recorded place cells in the hippocampus of a rat. If you record place cells as a rat is navigating a maze, what you find is that there are specific neurons that activate at each location in the maze. So, in other words, there’s an encoding of locations in two-dimensional space.

Most of the time, the only place cells activating are the ones for the place the rat is actually in. But specifically in these moments where it pauses and toggles its head back and forth, you can see the place cells start playing out the future paths.

And so you can literally go into the brain of a rat and watch it imagining its own future.
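
As a rough picture of what that looks like, here is a toy Python sketch of place-cell decoding. The one-dimensional track, the Gaussian tuning curves, and the population-vector readout are hypothetical simplifications, not the actual recordings from the Redish lab.

```python
import math

# Hypothetical 1-D track: each "place cell" fires most near one preferred spot.
PREFERRED_LOCATIONS = list(range(10))

def firing_rates(encoded_position):
    """Gaussian tuning curve: cells near the encoded position fire hardest."""
    return [math.exp(-((encoded_position - p) ** 2) / 2.0)
            for p in PREFERRED_LOCATIONS]

def decode_position(rates):
    """Population-vector readout: the firing-rate-weighted average of the
    cells' preferred locations tells us where the activity is 'pointing'."""
    return sum(r * p for r, p in zip(rates, PREFERRED_LOCATIONS)) / sum(rates)

# While the rat sits at location 3, normal activity decodes to roughly 3.
# During a pause, a forward sweep over cells preferring 4, 5, 6... decodes to
# spots the animal has not yet visited: the "imagined" future path.
print(decode_position(firing_rates(3.0)))   # about 3
print(decode_position(firing_rates(5.5)))   # about 5.5
```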

This ability to simulate is this remarkable mammalian ability.

And in AI systems, we’ve been trying to replicate planning for a long time. And so we do have systems that do rudimentary versions of planning. So AlphaGo, you might remember the famous system that beat the best human in the world at Go, did do a version of planning. Before every move it would play out 30 different possible games within several seconds, and then select the move where it believed it would win. But the key difference is there’s a fixed number of next moves that you can make.
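
A minimal sketch of that kind of fixed-move rollout planning might look like the following Python, where `simulate_game` is an assumed stand-in for a full game simulator. The real AlphaGo used Monte Carlo tree search guided by policy and value networks, which is far more sophisticated than this.

```python
def plan_move(state, legal_moves, simulate_game, n_rollouts=30):
    """Rollout planning over a FIXED set of candidate moves: for each legal
    move, imagine a handful of complete games and keep the move with the
    best imagined win rate. `simulate_game(state, move)` is a hypothetical
    helper that plays one game to the end and returns True for a win."""
    best_move, best_rate = None, -1.0
    for move in legal_moves:
        wins = sum(simulate_game(state, move) for _ in range(n_rollouts))
        win_rate = wins / n_rollouts
        if win_rate > best_rate:
            best_move, best_rate = move, win_rate
    return best_move
```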

But in the real world, we’re in this continuous action space. So even for just a squirrel running across tree branches, there’s literally an infinite number of places it could place its paws, an infinite number of directions that it could go. And yet somehow, even the tiny little brain of a squirrel or a rat selects which paths to simulate without getting stuck thinking about things endlessly.

We really don’t know how mammalian brains do this really well. So that is a key part of emerging AI research. If you see people talk about world models, this is one of the key things that people are trying to imbue AI systems with.

In order to simulate, you need to have a rich world model, which means that you can accurately predict what will happen in the world if you take a given series of actions.
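
Here is a hedged Python sketch of what planning with a world model could look like in a continuous action space: sample a few candidate action sequences, imagine each one with the model, and act on the best. The `world_model` and `score` functions are assumptions standing in for a learned predictive model and a goal, not any particular system's API.

```python
import random

def plan_with_world_model(state, world_model, score, n_candidates=20, horizon=5):
    """In a continuous action space you cannot enumerate every move, so
    sample a few candidate action sequences, imagine each one with the
    learned model, and keep the best. `world_model(state, action)` and
    `score(state)` are hypothetical stand-ins."""
    best_first_action, best_score = None, float("-inf")
    for _ in range(n_candidates):
        actions = [(random.uniform(-1, 1), random.uniform(-1, 1))
                   for _ in range(horizon)]
        imagined = state
        for action in actions:
            imagined = world_model(imagined, action)   # predict, don't act
        if score(imagined) > best_score:
            best_first_action, best_score = actions[0], score(imagined)
    return best_first_action   # execute one step in reality, then replan
```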

Phil Stieg: Another one of the breakthroughs you discussed in your book is something called “mentalizing” which you describe as “thinking about thinking.” How does that work?

Max Bennett: Mentalizing has a lot of adaptive benefits, specifically in sort of primate social groupings. One example is theory of mind. So the ability to look at someone else and try to infer what they’re thinking about. So what are they simulating when they’re not actually acting?

Mentalizing also gifts us with unique imitation learning skills. And there is really wonderful synergy with AI here that I think is really cool.

One strategy historically to teach cars to learn to drive was, we’re going to have humans drive and have a camera looking at the road in front of the car. And what we’re going to do is we’re going to get this training data of the road compared to what the human actually does with the steering wheel. And so the idea was great, we’ll just teach a system to imitate what humans do, and then it will be great at driving, because they’re just going to directly wholesale copy what the human does.

And what they found is that this worked some of the time, but then there were catastrophic accidents that would occur.

They were trying to understand why these catastrophic accidents were occurring. And what they realized was that the problem was the system never saw humans recover from a mistake, because it only saw expert drivers. So the second the AI system made a tiny mistake, it had no idea how to recover from it.

One of the key strategies for solving this is something called inverse reinforcement learning, which means you don’t directly copy what you see someone do.

What you do is you first try to infer what they’re trying to accomplish, and then you train yourself to do what they’re trying to accomplish. So in the case of driving, you realize what they’re trying to do is stay in the center of the road, so you don’t directly copy their movements. What you do is you try to train yourself now to stay in the center of a road. And so this is, in part, why non-human primates are so much better at imitation learning than most non-primate mammals. Because primates can look at someone using tools and infer the specific movements they’re trying to do and then train themselves to do that in their mind, as opposed to just directly copying their behaviors. That’s another place where mentalizing is really useful adaptively.
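
To illustrate the driving example, here is a small Python sketch contrasting direct copying with goal-based imitation in the spirit of inverse reinforcement learning. The names `policy`, `camera_image`, and `offset_after_action` are hypothetical stand-ins, not any real system's API.

```python
def behavior_cloning_loss(policy, camera_image, expert_steering):
    """Direct copying: penalize any difference from what the expert did.
    Off the expert's distribution (after a small mistake), there is no
    training signal for how to recover."""
    return (policy(camera_image) - expert_steering) ** 2

def inferred_goal_loss(policy, camera_image, offset_after_action):
    """Goal-based imitation: first infer WHAT the expert is trying to do
    (stay centered in the lane), then train the policy against that
    objective directly. `offset_after_action` is a hypothetical function
    giving the car's distance from the lane center after the policy's
    steering action, so recovering from mistakes is rewarded even though
    the expert never demonstrated it."""
    return offset_after_action(policy(camera_image)) ** 2
```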

Phil Stieg: The thing that’s uniquely human is our ability to anticipate the future, which is based upon learning. So the question to you is, is AI capable of learning, or is it just collating a bunch of facts accumulated by large language models?

Max Bennett: In the context of “do AI systems learn?” I think even the simplest AI systems are engaging in some form of learning.

So, for example, even the systems that existed in the 2010s, before we had large language models, could be trained on a set of pictures of cats and dogs and then classify whether a new picture is of a cat or a dog. The fact that they can take a totally new picture that was not in the training data and classify it correctly demonstrates some form of learning.

Because it used to be the case that AI researchers tried to hard code rules. They would say, well, the way I’m going to build a system that can recognize dogs and cats is I’m going to literally code the rules of what defines a dog and a cat.

And that strategy failed in almost all cases, because what one realizes is that it’s actually really difficult to take our naturalistic knowledge about things and turn it into lists of computerized rules.

And what worked way better is you build a neural network with a learning algorithm, and then you just show it data of this is a cat, this is a cat, this is a dog. And it sort of learns to classify it on its own.
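
A minimal Python sketch of that learn-from-examples approach, using a single perceptron and made-up numeric features rather than hand-written rules, might look like this:

```python
def train_classifier(examples, epochs=50, lr=0.1):
    """A single perceptron learns a cat-vs-dog rule from labeled examples
    of invented numeric features (say, [ear_pointiness, snout_length])
    instead of rules we write by hand. Labels: 1 = dog, 0 = cat."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias
```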

Now, that’s not to say at all that these large language models are doing the same thing, to the same degree, that the human mind is doing. The way these large language models, quote unquote, “learn” is that you show them astronomical numbers of chunks of text. And the way you train them is you remove a word, let’s say at the end of one of these chunks, and you train the model to predict the correct next word. And you do that over hundreds of billions of words.

What has been surprising the AI community is the degree to which this very simple learning algorithm, which is just trying to predict the next word in a sequence of prior words, has enabled these language models to do a surprising number of things pretty well.
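
As a toy illustration of that objective, here is a Python sketch that "learns" next-word prediction from simple bigram counts. Real large language models use transformers trained on vastly more text, but the training signal has the same shape: hide the next word and predict it from the words before.

```python
from collections import Counter, defaultdict

def train_next_word_model(corpus):
    """Count which word follows which: a bare-bones next-word predictor."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return the most frequently observed word following `prev_word`."""
    following = counts[prev_word]
    return following.most_common(1)[0][0] if following else None

model = train_next_word_model(["the cat sat on the mat",
                               "the cat ate the fish"])
print(predict_next(model, "the"))   # -> "cat" (seen twice after "the")
```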

So if you go to ChatGPT and you ask it even common-sense questions about the world, it does a good job answering them, on average. If you give ChatGPT a new bar exam, it passes with human-level, or sometimes better-than-average human-level, performance, et cetera, et cetera.

One of the key problems is that these systems do not build hypotheses about the state of the world and test those hypotheses themselves. And so what that means is the system is only as good as the data you give it.

Phil Stieg: Good at reacting.

Max Bennett: Exactly, which is different from humans. So, for example, if someone told you and me, “Hey, friends, the world is flat, I’m sure the world is flat,” that wouldn’t just be taken wholesale as training data for our model of the world. We would compare it to our understanding of the world, and we could devise a hypothesis and experiments to validate or invalidate the idea.

And so this is a very key difference from large language models, where if you give them false information in the training data, they’re now just going to regurgitate the false information.

And so one of the key things that’s missing in the AI strategy of just scaling up these huge models is that they’re not validating their own hypotheses about the world, which means it’s going to be challenging for them to get smarter and more truthful over time without us meticulously managing the data set that they’re trained on.

Phil Stieg: Do you anticipate that ChatGPT and other AI systems are going to have to be absolutely humongous to manage all this material? Or will they actually be small, like our human brain?

Max Bennett: So I think this is a huge, fascinating, open question: what is the right strategy for building super-intelligent systems? In terms of just getting performance, there’s a big schism in the AI community.

One group says that we don’t need to invent more efficient algorithms or better algorithms. We just need more compute and more data. And if you just keep scaling it up, eventually you’re going to get to the point where you get human level performance.

So to compare, GPT-4 has about 1 trillion connections. The human brain has about 100 trillion. GPT-3 had about 150 billion connections. So every time they do this, they scale it up by roughly ten x.

So just based on scale, you could say, we’re two orders of magnitude away from getting to human level scale.
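
The arithmetic behind that claim, using the round figures quoted above, is simply:

```python
import math

gpt4_connections  = 1e12    # roughly 1 trillion, the figure quoted above
brain_connections = 1e14    # roughly 100 trillion synapses

gap = math.log10(brain_connections / gpt4_connections)
print(f"about {gap:.0f} orders of magnitude apart")   # about 2
```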

Now, anyone with a background in biology or neuroscience will challenge this, because human brain connections are far more complicated and information-rich than connections in an artificial neural network. So maybe it’s not exactly right. But despite that, we’re operating on a similar general scale.

The big question is going to be, does that achieve some form of human-level intelligence, or do we have to scale even further just to get to human-level intelligence because these systems don’t work in the same way? And just in terms of energy usage, I mean, the human brain operates on the wattage of a light bulb or two, whereas ChatGPT uses a ridiculous amount of power. I don’t know the exact number, but it’s absurdly higher than that.

Phil Stieg: Kilowatts.

Max Bennett: Yeah. This raises not just questions about energy efficiency; it’s also a problem of data efficiency.

So, for example, to be able to pass the bar exam, GPT-4 has read at least 2.5 million books, which is 1,000 times more than any human could read in their entire lifetime. And so what that demonstrates is, although the performance might look similar on certain tasks, the data required to get to that level of performance was so much higher.

If all of a sudden something in the world changes, and we need to give ChatGPT new information to adjust to this new learning, we’re going to need way more data to train it to do this new thing than we would a human.

Interstitial theme music

Narrator: We humans have long been fascinated by the idea of artificial beings. We want them to be smart, of course, but not so smart that they are terrifying, like HAL in 2001: A Space Odyssey. Mostly we want them to work for us–to be the perfect servants who do our bidding and never need time off.

But while AI has made great strides in the past couple of years, there are still a lot of things even HAL wouldn’t be able to do.

As our guest Max Bennett points out, we are still falling short of an ideal that captured public attention more than 60 years ago.

Jetsons theme music

Jane: Now just make yourself at home, Rosey.

Rosey: Yes ma’am, I’ll get right to work ma’am.

Narrator: Rosey the Robot from the 1960s animated series “The Jetsons” was a fully autonomous mechanical housekeeper with an extraordinary set of skills.

Modern AI programs have been approaching Rosey’s ability to carry on a conversation.

Rosey: The opinions expressed are my own, and do not necessarily reflect those of my employers.

Narrator: But it is her ability to move about in three-dimensional space and manipulate novel objects that still stumps 21st-century robotics.

Elroy: Can you throw a forward pass?

Rosey: I’m not sure, but I’ll try.

Elroy: Whoa, whoa, whoa, wee!

Narrator: Things that are easy for humans, like loading a dishwasher or throwing a football, are actually really difficult for robots, while things that are hard for humans, like complex calculations, are easy for machines. This is known as Moravec’s paradox, named for the robotics researcher at Carnegie Mellon who first described the problem in 1988.

Although we are encouraged by the recent gains made by ChatGPT and other large language models, it turns out that the physical world is far more complicated than language.

Running and jumping and grasping objects come naturally to people as the result of millions of years of evolution. Combine that with a knowledge of the physical world gained from years of play as babies and children, and you have a skill set that is almost impossible to duplicate in an artificial system.

HAL 9000: I’m sorry, Dave, I’m afraid I can’t do that.

Narrator: Perhaps when a robotic AI system has evolved enough to master a mundane physical task, like matching socks and folding laundry, the real AI revolution will finally have arrived.

Jane: Don’t be silly Rosey, you are worth your weight in leftovers.

Rosey: Thank you. And I love you people, too.

Interstitial closing music.

Phil Stieg: In the conclusion of your book, you talk a little bit about veritas, or truth, versus values. Is there a way to train an AI system to have a sense of human values?

Max Bennett: OpenAI and other folks try to engage in a process called alignment, which is effectively, you give it more data to look at, which biases it away from things that we deem to be bad responses or immoral responses, or just inappropriate responses, and towards responses that we think are appropriate.

One of the standard ways of doing this is called reinforcement learning from human feedback, where you try to get one of these systems to say something really bad. So you give it prompts that are intended to lead it to say something that most humans would deem inappropriate or immoral, and then you punish it when it does that.

You show two possible responses, and you try and bias it towards the one that a human reviewer would deem less inappropriate. And if you do that millions of times, you’re just giving it new training data to try and bias it away from answering in certain ways.

Someone at OpenAI would say this is at least a primitive version of values because we have human preferences, and then what we’re doing is we’re taking those human preferences and rating GPT responses on the basis of which seems more ethical or less ethical, et cetera.
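
A sketch of the pairwise training signal behind that kind of preference tuning might look like the following Python, where the reward scores are plain numbers standing in for a hypothetical learned reward model's outputs (a Bradley-Terry style logistic loss is one common choice).

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Given a human reviewer's pick between two responses, push the reward
    model to score the chosen one above the rejected one: the loss is the
    negative log of the probability assigned to the human's preference."""
    return -math.log(1.0 / (1.0 + math.exp(reward_rejected - reward_chosen)))

print(preference_loss(2.0, 0.5))   # small loss: model agrees with the human
print(preference_loss(0.5, 2.0))   # large loss: model disagrees
```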

Now, there are problems with this. Google, some people would argue, has gone way too far in trying to make its system, quote unquote, “really ethical and appropriate.” For example, they trained it to be as diverse as possible in all of its generated image responses, because they were really concerned that it would be biased toward only showing white people. Of course, people are trying to get Google to have egg on its face, so they asked it to generate photorealistic pictures of Nazis in Nazi Germany, and no white Nazis were generated. People’s reaction was, we’ve gone so far that now it’s producing weird pictures of Nazis that are totally inaccurate.

Phil Stieg: I really don’t want to go down this pathway. Oh my God, the trouble they must be getting into.

Max Bennett: Yeah, and there are cases where people ask it to do something very simple, like, “Hey, can you help me with this C++ code?” And for whatever reason it says, I think that’s inappropriate, I’m not going to help you with this code, or whatever. So it refuses to answer in a lot of cases because they’ve gone overboard trying to make it safe.

Now, I’m sure there are people who would defend this by saying, I’d rather go overboard with safety than risk saying something wrong. For example, you can go to ChatGPT and it can give you incorrect medical information, and people are really afraid of that.

So I think there’s relatively reasonable debates on all sides of this, but a lot of people are making fun of Google for going to the extreme.

Phil Stieg: So I think you kind of gave us an example of the limitation of AI versus human intelligence. Can you give us another example where our human brain just does it better than AI?

Max Bennett: One really salient example is with continual learning. This is a really big open area of research that I think neuroscience has a lot to offer to the AI world.

A human brain can receive new information, absorb it, learn it, and that has almost no impact on previously learned information.

So, for example, if I tell you about a new type of car, it doesn’t risk overriding your memories of planes in the past. I mean, there’s maybe some light version of this type of overwriting, but it’s very limited.

In existing neural networks this failure is called the problem of catastrophic forgetting, and it happens all the time. So if you take OpenAI’s GPT-4 or GPT-3 and you train it on a new piece of data, it’s very likely to immediately lose the ability to perform previously learned tasks. Because of that, we don’t let these systems learn as they go.

OpenAI releases the model, and it’s not updated. It doesn’t learn as you type to it; it’s just giving you the next prediction, given your input. That’s why when you talk to ChatGPT a new time, it doesn’t remember who you are, it doesn’t remember your previous conversations, et cetera.

So how we get these systems to learn continuously as new data comes in, without overwriting past memories, is a huge open problem, and it’s something we know even rat brains do effortlessly. That’s a very key dividing line between mammal brains and AI systems.
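
As a cartoon of what catastrophic forgetting looks like, here is a toy Python sketch in which a single shared weight is fine-tuned on one task and then another, with nothing protecting the earlier solution. Real networks have billions of weights, but the failure mode is similar in spirit.

```python
def finetune(weight, task_data, lr=0.5, epochs=100):
    """Toy gradient steps that pull a single shared weight toward whatever
    value fits the current task, with nothing protecting older learning."""
    target = sum(task_data) / len(task_data)
    for _ in range(epochs):
        weight = weight + lr * (target - weight)
    return weight

weight = finetune(0.0, [1.0, 1.1, 0.9])        # task A: weight settles near 1.0
weight = finetune(weight, [-2.0, -1.9, -2.1])  # task B: weight moves to about -2.0
print(weight)   # the task-A solution is gone: a cartoon of catastrophic forgetting
```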

Phil Stieg: So in the book, you talk about developing artificial super-intelligence, where intelligence is unshackled from biological limitations. And it conjures up in my mind the concerns about the potential dangers of AI technology. And you do touch upon this.

Will our evolutionary baggage of pride, hatred, fear and tribalism get incorporated into AI? Again it makes me think of 2001: A Space Odyssey with HAL. Could that happen?

Max Bennett: Yes, it definitely could happen.

Phil Stieg: That’s frightening.

Max Bennett: I think the question is probability distributions. There’s already so many things in the world that could cause cataclysms.

Right or wrong, we keep making a choice as a society that we’re willing to roll the dice on new technological innovations for the hope of more prosperity, at the risk that we’re going to mess everything up.

We invented nuclear warheads, and now we have them, but we also use nuclear technology for energy. We invented the ability to genetically modify and edit viruses. We can use that for good, but it can also be used for very scary things.

I think it’s likely that we are going to keep rolling the dice on that because that’s the way the sort of global system has been set up.

Now the question is, is it likely that the world is going to end because of these AI systems, or is it not? In general, I’m an optimist here.

I think usually it’s the case that the things that we are particularly afraid of are the things we spend a lot of time making sure don’t happen. And it’s the things that we tend to ignore that tend to happen.

I think climate change is a good example here, where in general, although people talk about it, most people on a day-to-day basis until recently haven’t been afraid of it. They just kind of ignored it. And thus we ended up in this situation.

I think AI may go that direction, but at least now there’s so much salient fear about it that there’s a good chance that we’re actually going to sort of get ahead of it.

One thing I’ll add to that is that the more direct concerns I have about AI systems are not humanity-ending; they are things that I think are more likely to have a negative effect on people.

For example, any sort of dramatic change in technology that creates mass layoffs creates a lot of human suffering. It’s easy for economists to sit in the ivory tower and say, well, look, GDP growth is going up, and hence, on net, it’s better for everyone. But if you actually talk to people who spent 30 years doing something, and now their job has gone to a machine, that’s a life that’s really impacted.

And so I think that’s something we really have to think deeply about: if this really does become something so powerful that huge swaths of society are out of work, that’s something we have to reckon with, along with inequality issues.

I also think there are misinformation issues. One good example here, I don’t know if you saw this: someone scammed a bank into wiring, I think, 10 million plus dollars, just by creating generative AI avatars of the CFO and an executive team.

They got on a Zoom call, and it looked like the entire executive team talking to a financial controller at a bank, but it was actually other people using generative AI to change their faces and their voices. And so it wasn’t his fault. He thought, great, I’m talking to all these people, they’re all verbally confirming, I can see them, and I’m wiring the money. It was all faked.

This is a new world we’re entering: how do we verify that we can trust who we’re talking to if we can’t believe what we’re even looking at? These are the things I think we really have to be wrestling with now, before we worry about whether humanity is going to end.

Phil Stieg: So I’m going to take a quote from your book: “The more we understand about our own minds, the better equipped we are to create artificial minds in our own image.” And I would pose the question, given our propensity for war and things like that, is that actually a good thing?

Max Bennett: Great question. So I think my motivation, and what I think our motivation for understanding the human mind should be, is not to wholesale copy it, but to understand it with enough richness that we can choose of our own volition which aspects of it we want to copy.

You know, as we’ve seen, ChatGPT doesn’t really work in a human-like way, but it has lots of the same problems as humans, because it’s super biased based on the data you give it. What we’re missing is some of the great things about human brains.

But you’re absolutely right. We do not want to wholesale copy the human brain. I think one of the worst aspects of humanity derives from our sort of primate lineage of status seeking. And I see absolutely no reason why we want to copy and paste that.

This is in part where the dividing debate in the AI community comes from. People like Yann LeCun will argue that our fear about AI comes from our anthropomorphizing of it; there’s no reason why an AI will have any instinct to dominate or expand or do anything nefarious, because we’re going to create it, so we’ll create it with good values.

And then other people like Geoffrey Hinton would make a counterargument. He wouldn’t put it in these words, but the way I would translate what he would say is that that might be true for 99.9% of the AI systems we create, but this is going to go through its own natural evolutionary process, and that might not go the way we want. Because if 99.9% of AI systems don’t desire to dominate, don’t desire to expand, don’t desire to do anything nefarious, but 0.1% do, is that 0.1% going to all of a sudden grow? Because that’s the one that’s going to want to expand.

He would argue that even though it’s the case that most won’t have this problem, if any have this problem, it might be catastrophic.

Phil Stieg: So, the evolutionary biology of an AI device!

Max, I can’t tell you how much I’ve enjoyed sitting here listening to you. What I’ve learned from you in terms of AI, ChatGPT, and large language models is transformative in my life. It’s becoming incredibly important, and I believe that you’ve made this subject understandable. Thank you so much for being with us.

Max Bennett: Thank you so much for having me. It was an absolute pleasure being here.
