Geoff Hinton on The Robot Brains Season 2 Episode 22
 

Pieter Abbeel 

In last week's episode, we had Geoff Hinton on the show. We covered so much ground, from the early days when very few people were working on neural nets and deep learning, through the ImageNet/AlexNet breakthrough moment, to Geoff's current work and vision for the future of AI. As you might recall, we also gave you an opportunity to contribute questions through Twitter. In today's episode, we'll discuss some of these questions with Geoff. But before we dive into our very last episode of season two, I just want to say it has been such a pleasure and honor to have so many amazing guests on the show this season. We had guests explaining how AI is being used in real businesses today, like Flora Tasse on building AI for customer service, Amit Prakash on helping companies use AI to make better decisions, and Benedict Evans on what really matters about tech today. There were guests using AI to solve major health issues, like George Netscher on using AI to protect the elderly with fall detection, Athelas’ Tanay Tandon on using AI to improve blood testing, Andrew Song on how AI is helping give back hearing to people worldwide, and Param Hedge on using AI to improve training and prevent injuries in sport. We had guests using AI for social good, like Ayanna Howard tackling bias in AI, Revolution Robotics’ Jared Schreiber on teaching children about AI robotics, and David Rolnick on using AI to fight climate change. We also had guests using AI at industry giants, like Microsoft's Eric Horvitz on using AI for the greater good and Shakir Mohamed from DeepMind on weather prediction. There were guests using AI in consumer applications, like Spotify’s Gustav Söderström on AI in delivering personalized experiences, Amit Aggarwal from THE YES on using AI to serve up a better experience in fashion, and Etsy's Mike Fisher on AI in e-commerce. 
We had guests using AI in transportation and futuristic vehicles, like Adam Bry on using AI to power Skydio drones, MIT’s Cathy Wu on the future of our roads, and Alex Kendall on Wayve’s driverless cars. We had guests making AI accessible for all through open source, like Ross Wightman and Hugging Face’s Clement Delangue. And we kicked off and ended our series with academic leaders in the field, like Sergey Levine from UC Berkeley on current research challenges in AI. And of course, last but by no means least, Geoff Hinton. Speaking of which, let's get to your questions for Geoff. Geoff, thank you for making the extra time for audience questions. It’s actually the first time we’ve done this on the podcast, and we had so many questions on Twitter for you. It's clear so many people want to learn from you and have questions for you. Hopefully we can get through a bunch of them. Let me kick it off with a question from somebody you are very familiar with, Ilya Sutskever: what were some of the hardest times, research-wise, on the path to making deep learning work? Was there ever a time when it just wasn't clear how to even make the next step? 

 

Geoff Hinton 

I think it was always the case that there were things worth trying, even if they didn't work and consistently didn't work. I never reached a point where I thought, I just can't see where to go from here. There were always many possibilities, many leads to follow up, most of which ended in dead ends. But I think good researchers always have dozens of things they'd like to try; they just don't have time to try them. So for me, there was never a point where I thought it was completely hopeless. There are particular algorithms that at times I thought were completely hopeless, like Boltzmann machine learning. Sometimes I think it's hopeless, sometimes I don't. But the whole enterprise, which I could now phrase as: can you find objective functions and get the gradients so that you can learn by stochastic gradient descent? In that whole enterprise, there were always directions you could go to push it forwards. 

 

Pieter Abbeel 

And I have a second question from Ilya, a very different question. Are you ever concerned that AI is becoming too successful and too dominant? 

 

Geoff Hinton 

Yeah. The two things that concern me most are its use in weapons, because that will allow countries like the United States, for example, to have little foreign wars with no casualties by using robot soldiers. I don't like that idea. Even worse, its use in targeting particular subpopulations to swing elections. That kind of thing was done by Cambridge Analytica, and I believe it was very influential in both Brexit and the election of Trump. I think it's very unfortunate that techniques like deep learning are going to make that kind of operation more efficient. 

 

Pieter Abbeel 

Question from Pouria Mistani, is deep learning hitting a wall? Will AGI be achieved by scaling up neural activities in deep learning architectures? 

 

Geoff Hinton 

It won't be achieved just by scaling up the number of parameters or the connectivity, but it's not hitting a wall. I recognize where that quote comes from; it's a sort of attention-grabbing quote. It's regularly said that deep learning is hitting a wall, and it regularly keeps making more progress. I wish the people who say it's hitting a wall would just write down a list of the things it's not going to be able to do. Then, five years later, we'll be able to show we've done them. 

 

Pieter Abbeel 

I like that notion that anybody who wants to claim something is hitting a wall to make a list of things it cannot do. And that's great inspiration for all the rest of us to see if we can make it happen or not. 

 

Geoff Hinton 

But it has to be fairly well defined what it can do. There was Hector Levesque, who's a symbolic AI guy, and a very good one, who actually made a criterion, which is the Winograd sentences, where you say things like "the trophy would not fit in the suitcase because it was too small" versus "the trophy would not fit in the suitcase because it was too big." If you want to translate that into French, you have to understand that in the first case "it" refers to the suitcase, and in the second case to the trophy, because they have different genders in French. Early machine translation with neural nets was at chance; it couldn't get the gender right in French. And it's getting better all the time. At least Hector made a very clear definition of what it would mean for a neural net to understand what was going on. We're not there yet, but I think we're considerably better than random. I'd like to see more of that from people who are skeptics. 

 

Pieter Abbeel 

Great challenges, yeah. Next question is from Eric Jang, actually, one of your colleagues at Google. What are three questions that keep you up at night? Not necessarily restricted to machine intelligence.  

 

Geoff Hinton 

When is the attorney general finally going to do something? That keeps me up at night. Because time's running out. That's what I worry about most. How does the world deal with people like Putin who have nuclear weapons? And does the brain use backpropagation or not? In that order. 

 

Pieter Abbeel 

Love the contrast of the third one with the other two. Eric had another question I'm going to ask here: you spent years working on topics that the mainstream machine learning community thought were niche. What advice do you have for contrarians trying to produce the next AlexNet result?

 

Geoff Hinton 

Just trust your own intuitions. I have a standard thing I say, which is: either you've got good intuitions or you haven't. If you haven't got good intuitions, it doesn't matter what you do. If you have good intuitions, you should trust them. But of course, that needs to be padded out with: where do intuitions come from? Good intuitions come from a lot of hard work trying to understand things. Basically, I think we're analogy machines, so lots of experience with similar things is where intuitions come from. So you just need a lot of experience, and then trust your intuitions.

 

Pieter Abbeel 

Your next one comes from Danielle Newnham. What is the connection, as you see it, between mania and genius? 

 

Geoff Hinton 

Oh, that's very interesting. I'm slightly manic depressive, so I tend to oscillate between having very creative periods, when I'm not very self-critical, and having mildly depressed periods, when I'm extremely self-critical. And I think that's more efficient than just being uniform. What happens in manic periods is you just ignore all the problems. You're so sure there's something exciting here. Yeah, sure, there are all those obvious problems, but don't let them stand in your way; let's get on with it. But then when you're depressed, all these obvious problems overwhelm you. And the question is, can you keep going and sort them out and figure out whether the idea really was good or not? I tend to alternate like that, which is why every so often I tell people I've figured out how the brain works, and then I go through a long period of figuring out why that isn't actually true, which is slightly depressing. I think it's just got to be like that. There's a poem by William Blake with a pair of lines that go: "Joy and woe are woven fine, a clothing for the soul divine." It's basically saying that's just the nature of being, joy and woe together. And I think that's the nature of research, too. If you don't get really excited, and you don't get really fed up when it doesn't work, then you're not a real researcher. Well, maybe you're just a different kind of researcher. 

 

Pieter Abbeel 

There's a related question as part of that, what childhood experiences shaped you the most and how? 

 

Geoff Hinton 

I think the most formative experience was coming from a home in which everybody was clear that religion was nonsense, and being sent to a private school, a Christian school, where, when I first went there at seven, everybody believed in God. Everybody except me, that was. That was a very formative experience for me. Possibly because I have a large ego, I realized that everybody was wrong. Having that experience of seeing everybody else being wrong, and gradually over the years seeing them change their minds, seeing these teenage boys say, well, maybe God isn't real, that was very helpful. 

 

Pieter Abbeel 

Next one is from Bishal Binayak. What's your thought process to solve a research problem? Is it mainly focusing on machine learning? Probably implying maybe, you know, the question is, do you also need to think about other fields? 

 

Geoff Hinton 

That's hard to answer, because we don't necessarily have good insights into our own thought processes. I guess I tend to work a lot with analogies, so at least I'm consistent: I think the basic form of human reasoning is analogy, based on having the right features in big vectors. And that's how I do research too. I try to look for similar things, and maybe it's not so much that I try as that similar things pop into my mind. I think everything I'm doing is a result of these analogies with many, many other things via these feature vectors, where I'm basically unaware of many of the analogies. That's not very helpful, but I don't really know. 

 

Pieter Abbeel 

The following question here is also from Bishal. What's the next big thing in AI, and what's your advice for PhD students on which research area to focus on? 

 

Geoff Hinton 

I think a next big thing (I don't think there is "the" next big thing) is going to be a convincing learning algorithm for spiking neural nets: one that can deal with the discrete decision about whether to spike and the continuous decision about exactly when to spike, and that makes use of spike timing to do interesting computations that would be much more difficult in non-spiking neural nets. That would be my bet about one of the big things. But the other thing, and the reason the deep learning revolution is going to keep going, is that if you just make a bigger one, you don't need any new ideas; you already get things working better. It's slightly depressing if your trade is new ideas, but if your trade is building hardware to make a bigger one, then it's great. 

 

Pieter Abbeel 

The next one is from thinkorswim. What is Professor Hinton’s regret in research choices so far, that is something he wished he had delved into, but chose not to, and now perhaps a regret looking back?

 

Geoff Hinton 

Time is short. So I’ll just say a learning algorithm for spiking neural nets. 

 

Pieter Abbeel 

You wish you'd already done it, but now you can still do it in the next year. 

 

Geoff Hinton 

Yeah. Maybe.

 

Pieter Abbeel 

Yordan Hristov has the following question, how important is embodiment for intelligence given the recent DALL·E results from OpenAI? And I'll say I'm personally really curious about that too, working on a lot of embodied intelligence myself. 

 

Geoff Hinton 

Yeah. So I think one needs to distinguish the engineering version of this question from the philosophical version. The philosophical version is: could a being sitting in a room, listening to the radio and watching television, figure out how the world worked, even if it couldn't actually move anything? It just gets these sensory inputs. That's a philosophical question, and I think it could. The engineering question is: is that a good approach, just to listen to the radio and watch television? And I think the answer is definitely no. If you want to do perception, for example, as soon as you put one or two cameras on a robot and let the robot move around in the world, you get a very different view of what the questions are and how to solve them than if your idea of doing perception is just to take a database of images like ImageNet. Because you have the option of changing viewpoint and seeing how things move as you change viewpoint. You have a task to do. You have to be able to ignore things that aren't relevant. You really would like to have a fovea so you can see fine detail without swamping yourself all the time. It completely changes how you build your perception system. So philosophically, you don't need to be embodied, but as soon as you're embodied in a sensible way, it changes how you're going to do things. So for engineering, embodiment is important. However, there's a lot of hassle that comes with embodiment: you have to deal with the body. So I think we can still make lots of progress on databases of just videos, where somebody else made the video and you're just taking the video as data. There's lots of room for working like that without having a mobile robot. 
We don't control the data collection. But a long time ago, probably back in the eighties, Dana Ballard realized that animate perception, when you've got a robot moving around, is just going to have a very different flavor from standard computer vision. I think he was completely right. 

 

Pieter Abbeel 

Next one is from Renjith Ravindran. Why do you do what you do? Do you believe it would make the world a better place? Or are you just having fun exploring the limits of human creativity? 

 

Geoff Hinton 

Much more the second one, I'm afraid. I really want to understand how the brain works. And I believe that to understand it, we need some new ideas. Like, for example, a learning algorithm for spiking neural nets. 

 

Pieter Abbeel 

This is a follow-up question of my own: do you think it's almost necessary to be really driven by the exploratory aspects? Or is it possible to be just as productive in research if you care more about the bottom-line effect on the world? Is it just a different style?

 

Geoff Hinton 

I think if you want to do fundamental research, there has to be curiosity. You’re going to do your best research when it's curiosity driven. You're going to be motivated to ignore all the apparent barriers and pretend they're not there and see where you get. Whereas if it's for the bottom line, I just don’t think you're going to be as creative. So I think the sort of the very best research gets done by graduate students in good groups with plenty of resources. So you need to be young and driven and really be interested in something. 

 

Pieter Abbeel 

The next one is from Peter Chen, actually my co-founder and CEO at Covariant; you know him. He has a research-organization question. You've done pure academic, basic research at the university. You've done industry basic research at Google Brain, and you've also seen industry applied research while at Google, as well as people you know who are involved in startups and so forth. How do you think of these different places as providing, maybe, different opportunities to move research forward, but also to build products from there?

 

Geoff Hinton 

To be honest, I don't think that much about building products. Products are nice. They pay the bills and companies would like to have products. It's not what I really care about. What I really care about is how do you make big learning systems and how does the brain work. And the nice thing about the brain team at Google is they have the resources to explore big systems and lots of smart people to discuss things with. And maybe I should care more about products, but I believe in specialization. And so having everybody care about products is not necessarily the right mix. 

 

Pieter Abbeel 

The next batch of questions is all centered around the brain, so I'm going to give you all the questions in one go, Geoff, and then you can see what perspective you want to give on the whole thing. The first one, from Lucas Beyer, is: how does the brain work? Then Tim Dettmers: what's your take on mixed learning algorithms, backprop in cell body dendrites plus feedback alignment across neurons? Could such algorithms be both biologically plausible and competitive with pure backprop, or is a single general algorithm more likely to exist? Prasad Kothari is wondering about spiking neural networks. Cedric Vandelaer: it seems you have drawn inspiration from the human brain in the past; do you think there are certain techniques that will eventually turn out to be crucial, for example spiking neural networks? Aton Kamanda: Geoff recently declared that he no longer thinks the brain is doing backpropagation, but that it might be doing something akin to Boltzmann machines. Does he see this kind of architecture coming back as a viable AI model or as a theoretical model for how the brain works? And then the last one, by Yigit, is about the NGRAD hypothesis. So there are a lot of related questions here. 

 

Geoff Hinton 

There's one set of issues, which is: if the brain is going to do something like backprop, how does it get gradient information to go backwards through the layers? That's what the NGRAD hypothesis is about. It's the idea of using changes in your neural activity to represent error derivatives, using temporal derivatives to represent error derivatives. I don't really believe in that anymore. So let me go to the question of Boltzmann machines, and do I believe in Boltzmann machines? I wax and wane on Boltzmann machines because it's such a neat idea. Right now I believe in part of it, but not the main thing. Boltzmann machines have these Markov chains, which require symmetric weights, which are implausible. But there's another aspect of Boltzmann machines that I mentioned in the podcast, which is that they use contrastive learning, so Boltzmann machines are more like a GAN than like typical unsupervised contrastive learning. In unsupervised contrastive learning, you take a pair of crops of the same image and make their representations similar, and you take a pair of crops of two different images and make their representations dissimilar. In a Boltzmann machine, you take positive data and say: have low energy for the positive data. You take negative data and say: have high energy for the negative data. But the data is just an image, not a pair of images. And I believe in that now. So I think that if we're going to get unsupervised contrastive learning working, what we need is to have two phases, like in a Boltzmann machine. We can have a phase where you try to find structure in positive data, but not in pairs, in the whole image. You're looking around, essentially, for agreements between locally extracted things and contextually predicted things. And then we need a different phase in which I show you negative images, things that are like real images but aren't real, or are slightly different. 
What you're concerned with is that the structure you find in real images shouldn't be in these negative images. You want to find things that are in the positive data and not in the negative data. And that's how you protect yourself from finding structure inside your neural network that is caused by the wiring of the front end of the network. Anything caused by the wiring will cause the same structure for positive images and negative images, and so you can filter it out. So there's an aspect of Boltzmann machines I really believe in, which is that you have to use positive and negative data to protect yourself from just learning about your own wiring. But the idea of a Markov chain to generate the negative data, I think, is just too cumbersome. I think we need other ways of generating the data, and this is quite like GANs, right? In GANs you've got real data, and you've got data generated by a generative model, and that's the negative data. If you compare what I believe now with GANs: what I believe in is the discriminator, which is trying to decide whether this is real or negative data by finding structure that should only be there if it's real data. That's the main thing. And I want to use the internal representations of the discriminator as a generative model, in order to get the negative examples for training. So what I believe in now is a cross between GANs and Boltzmann machines. But as in GANs, it's not a Markov chain; the generative model is just a causal generative model, a directed generative model, which is much easier. So I think you have a discriminator, and then a directed generative model that's around at the same time for the negative examples. 
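The two-phase idea Geoff describes, lower the energy of positive (real) data and raise the energy of negative data, can be sketched with a toy energy-based model. This is only an illustrative sketch under simplifying assumptions (a linear energy function and Gaussian stand-in data), not the system he describes as running on his machine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy energy model: E(x) = -w . x, where lower energy means "more real".
# The positive phase lowers energy on real data; the negative phase
# raises it on negative data. All names and numbers are illustrative.
dim = 8
w = np.zeros(dim)

def energy(w, x):
    return -w @ x

real = rng.normal(loc=1.0, scale=0.5, size=(200, dim))  # stand-in "positive" data
fake = rng.normal(loc=0.0, scale=0.5, size=(200, dim))  # stand-in "negative" data

lr = 0.01
for _ in range(100):
    xp = real[rng.integers(len(real))]
    xn = fake[rng.integers(len(fake))]
    # dE/dw = -x, so adding lr * x lowers the energy of x.
    w += lr * xp  # positive phase: lower energy of real data
    w -= lr * xn  # negative phase: raise energy of negative data

# Real data should now have lower average energy than negative data.
print(energy(w, real.mean(axis=0)) < energy(w, fake.mean(axis=0)))  # expect True
```

The same contrast, two phases pushing energies in opposite directions, is what distinguishes this family of methods from pairwise crop-based contrastive learning, where the contrast is between representations of image pairs rather than energies of single images.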

 

Pieter Abbeel 

In principle, there's a unification there, right? Because GANs can be rewritten as energy-based models, just a specific form of them. 

 

Geoff Hinton 

But the thing about GANs is that you generate from random stuff at the top, and it's hard to get coverage. There might be all sorts of things you never generate, and you wouldn't know. If you go to the top level of your discriminator and then regenerate from the top level of your discriminator, you'll get coverage. So in a paper I published in 2006 with Simon Osindero and Yee-Whye Teh, in Neural Computation, we have something that doesn't use backprop. It manages without backprop; it uses contrastive wake-sleep. The contrastive aspect is that you do recognition, that's the wake phase, and then you generate. And what you generate from is not random stuff, but a perturbation of what you got when you did recognition. That gives you coverage. So I think there's maybe a unification coming. 

 

Pieter Abbeel 

That seems a very concrete idea, ripe for execution and could get some amazing results.

 

Geoff Hinton 

It’s actually running on my computer right now. 

 

Pieter Abbeel 

Oh you’re running it right now. Got it. And then the other batch of questions related to the brain was, of course, on spiking, the role of spiking. 

 

Geoff Hinton 

Well, I think it's very important. From very early on in neural nets, Minsky and Papert hit on XOR as the thing that a neuron couldn't do, right? It couldn't tell whether its two inputs were different. That's exactly equivalent to the "same" function: a neuron can't tell whether its two inputs are the same. Obviously, if you could do one, you could do the other. It's unfortunate that they went for XOR rather than "same." Because if you go for "same" and say, well, artificial neurons can't tell if their two inputs are the same, you're immediately drawn to the idea that if you use spike times, you can tell whether two spikes arrived at the same time, because then they push a lot of charge into the neuron at the same time, and that will put it above threshold, particularly if the excitatory input is followed by some inhibitory input, so they have to arrive in a narrow window. So spiking neural networks are very good at detecting agreement, whereas our normal neural networks need several layers to do that. And if we could just get a good learning algorithm, I think we would discover that they learn to make use of that ability, just like they learn to make use of it really well for doing auditory localization. 
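The coincidence-detection point above can be sketched with a toy leaky neuron: the first spike's charge decays exponentially, so the neuron crosses threshold only if the second spike arrives soon enough. This is a minimal sketch; the decay constant, charge, and threshold are made-up illustrative numbers, not from any real neural model:

```python
import math

def membrane_peak(t1, t2, tau=2.0, charge=0.6):
    """Peak membrane potential when the later spike arrives.

    Each spike injects `charge`; by the time the second spike lands,
    the first spike's contribution has decayed by exp(-dt / tau).
    """
    dt = abs(t1 - t2)
    return charge + charge * math.exp(-dt / tau)

def fires(t1, t2, threshold=1.0):
    """A single 'neuron' computing 'same spike time?': it fires only
    when the two spikes arrive within a narrow window."""
    return membrane_peak(t1, t2) >= threshold

print(fires(10.0, 10.5))  # near-coincident spikes -> True (fires)
print(fires(10.0, 30.0))  # far apart -> False (first spike decayed away)
```

A single rate-coded unit cannot compute this "same" function, which is Geoff's point: spike timing gives one spiking neuron a capability that takes several layers of conventional units.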

 

Pieter Abbeel 

When I think about transformer architectures, they're also kind of designed to find agreements or correlations. Just a much bigger piece of machinery, I guess, than a spiking architecture would be. But it seems like there could be some connections there. 

 

Geoff Hinton 

I mean, there have been neuroscientists saying for years and years that it would be crazy not to use spike times, and there are people who have talked about using spike timing for particular functions. It would be very satisfying to find a learning algorithm for these things and show that when you start learning, particularly on sequential data like auditory data, they really do make use of the spike times in a sensible way. And then you could use spiking cameras. Spiking cameras are very clever things that give you lots of information, but nobody knows how to use it. Same in the auditory domain: people like Dick Lyon have been saying for years that we should be using spiking neural nets to represent auditory input, but nobody knows how to take that representation and learn on it and do things with it. 

 

Pieter Abbeel 

There's a follow up question of my own, but if I think about spiking and let's say I try to play devil's advocate here and try to maybe argue against a strong belief in spiking, I might say something along the lines of, well, maybe the reason we have spiking in human brains is because maybe evolutionarily it was easier to somehow evolve or due to random luck of the draw, we evolved spikes. But we didn't evolve wheels and wheels are maybe more effective at transportation. 

 

Geoff Hinton 

Oh no, we did evolve them. You do have wheels. 

 

Pieter Abbeel 

Oh, I do? 

 

Geoff Hinton 

You do have wheels. You just need to think about it right. You have to go over rough ground, right? So you need a wheel with a six-foot diameter, and that's going to be a lot of rim. Okay. So as soon as you know about timesharing, you decide: here's what I'm going to do. I'm going to have a wheel with a six-foot diameter, but I'm actually only going to have two little bits of the rim, and I'm going to alternate between using these two bits of rim. I'm going to use it as a wheel: I'm going to rotate about the hip, which is a very low-energy way of walking. And then I'm suddenly going to switch, because I have to get this bit of rim to go backwards. It has to fly back, all the way round, and then I'm going to use the other leg, the other bit of rim. And there is one other big difference: in a normal wheel, the axle is suspended from the top of the wheel, and there's tension in the side of the wheel to hold it up, so the spokes are in tension. You have to have something like a rubber tire for rough ground. What you have instead of a spoke that's in tension is a spoke that's in compression, just one of them for each bit of rim, but it can bend in the middle. That means you don't need a tire, because you can absorb a lot that way. And you don't have too much unsprung weight, because you only have a bit of the rim. But it's basically a wheel. It's just a time-shared wheel. And there is one other little advantage the time-shared wheel has, which is that you don't have a problem getting nutrients in, because it doesn't go all the way around. It just goes forwards and backwards, so you can have blood vessels go to it more easily. But that's just an energy-supply problem; mechanically, it really is a wheel. You're using it just like a wheel: a little bit of rim, and you're rolling like a wheel does. So you use very little energy, and then you quickly substitute one piece of rim for the other. 
I’m surprised you didn’t know we had wheels.

 

Pieter Abbeel 

So, my bad analogy aside, Geoff, do you think there's any possibility that spiking was just easier to evolve, and that's why we ended up with it?

 

Geoff Hinton 

No, I think no, I think there's a very good reason for using it. 

 

Pieter Abbeel 

Got it. 

 

Geoff Hinton 

But I don't know what it is. I think it's to do with coincidence detection. Next thing you'll be saying is that when we make flying machines, we don't give them feathers. 

 

Pieter Abbeel 

Well, I wasn't going to go there after you so eloquently had told me I have a wheel. 

 

Geoff Hinton 

That's what's wrong with drones, right? If you have a drone and the blade hits something, either it breaks the thing or it breaks the blade. If the blade was made of little bits of velcro that zipped together, then when it hit something it could come apart, and the drone would land, do a bit of preening, zip the velcro back together again, and fly off again. So really we ought to make drones with feathers instead of rigid blades, so that they could hit things without damaging them and without damaging themselves, and have something that would preen the feathers back together again, and off we go. So those are the two classic examples: people don't have wheels, and airplanes don't have feathers. Well, they're both wrong. Drones don't have feathers yet, but I think they will. 

 

Pieter Abbeel 

I’ll be interested to see when that happens. So the next couple of questions are again, quite related so I'm going to ask them in batch. Abdullah Hamdi asked, what's the next paradigm shift in AI after deep learning? Ajay Divakaran, does the current deep learning paradigm suffice for transfer learning like humans? Or does it need to be fundamentally enhanced? And Arun Rao, what are the next milestones for deep learning going from existing foundation models to a long term goal of AGI? And how does Hinton define AGI? 

 

Geoff Hinton 

I try to avoid defining AGI, and I try to avoid working on AGI, because I think there's all sorts of things wrong with the vision of AGI. It envisions an intelligent, human-like android that's as smart as us, and I don't think intelligence is necessarily going to develop like that. I'm hoping it develops more symbiotically. That vision is very individualistic, and we developed in communities. This goes back to what you said in the podcast about ants and so on: I think intelligence develops in societies better than it does individualistically. Maybe we'll get smart computers, but they won't be autonomous in the same way. They may have to be if they're for killing other people, but hopefully that's not where we're going. 

 

Pieter Abbeel 

Yeah. The earlier part was about the next transition: what's next after deep learning? I mean, that's the question. I'm not trying to imply that there is something next, but that's the question. 

 

Geoff Hinton 

Right. So what I believe is this: we're going to stay with the very successful paradigm of tuning a lot of real-valued parameters based on the gradient of some objective function. I think we'll stay with that. But we may well not be using backpropagation to get the gradient, and the objective functions may be far more local and distributed. That's where I think we are headed. 

 

Pieter Abbeel 

The next question is from Dystopia Robotics. Are you familiar with Richard Sutton's The Bitter Lesson? 

 

Geoff Hinton 

Oh, yes.

 

Pieter Abbeel 

And what are your thoughts on it? 

 

Geoff Hinton 

I sort of have it in my lectures that deep learning depends on two things. It depends on doing stochastic gradient descent in big networks, with a lot of data and a lot of compute power. And then on top of that, there are a few ideas that make it work a little bit better. Things like dropout and all the stuff we've worked on all make it work a little bit better. But the crucial thing is lots of compute power, lots of data and stochastic gradient descent. And I agree with it. 

 

Pieter Abbeel 

The next question is from Prabhav Kaula. How do you read research papers? How do you get past the mathematics and get a taste of the core message?

 

Geoff Hinton 

Okay. I don't read many research papers. I basically get my colleagues and my students to explain them to me. I'm hopeless at mathematics. I can do what I have to, to justify something I've already thought out. Like Boltzmann machines: I figured out how they would work and then did the math to show that's the right thing to do. But I'm not very good at math, and I always find it a big barrier reading papers to understand all the notation. So I find it much easier if someone explains it to me: for neuroscience, I get Terry Sejnowski to explain it to me, and for computer science, I get my grad students to explain it to me. 

 

Pieter Abbeel 

A very related question to what you just answered, Geoff, is from Chaitanya Joshi. Many people have shared anecdotes on how Professor Hinton's mind works in an analogical and intuitive manner, with an aversion to mathematics and proofs. Could Prof. Hinton elaborate on the roles of formalism versus intuition when going about research? 

 

Geoff Hinton 

I think there's room for more than one kind of person. I sort of hate formalism; I love intuition. I love tinkering about on my Mac to see what works and what doesn't. I think it's very important to have foundational work, to really understand the mathematical foundations of things. It's not what I do. It's good to have proofs. It's not what I do. I have a little test I give people. Suppose there were two talks going on at NIPS at the same time and you had to decide which one to go to. One talk was about a really clever and elegant, totally new way of proving known results. And the other talk was about a new learning algorithm that seems to do amazing things, but nobody understands why. Now, I know which talk I’d go to. And I know that it would be easier to get the first paper accepted than the second one. But other people would really like to know new ways of proving things, because that's what they think is really interesting. I'm not like that at all. I actually think nearly all the progress in neural nets has come not from doing the math right, but from intuitive ideas. Later on, people do the math. 

 

Pieter Abbeel 

That definitely resonates with me. Guillermo Martinez Villar asks, how did you transition from a background in psychology to the field of AI? And what would you suggest to young people considering doing the same? 

 

Geoff Hinton 

Okay, so there's an interesting issue there. When I was teaching at the U of T, if you looked at the undergraduates, there were a lot of computer science undergraduates who were very good. There were also cognitive science undergraduates who did a minor in computer science but were really cognitive scientists. They typically weren't quite as good at the technical stuff, but they've gone on to do much better things because they had an interest in the issues. They really wanted to understand how cognition worked. I'm thinking of people like Blake Richards and Tim Lillicrap, who've gone on to do great things, because they knew what questions they wanted answered, whereas most computer scientists didn't. And for some reason I thought that was relevant to the question. Could you say the question again? 

 

Pieter Abbeel 

Oh, yeah, it is very relevant to the question. Let me tee it up again. How did you transition from a background in psychology to the field of AI? And what would you suggest to young people considering doing the same? 

 

Geoff Hinton 

I don't know. It's very hard to generalize. I had a very weird career, where I started off doing physics and physiology in my first year at university. In fact, I was the only student at Cambridge that year doing both physics and physiology. And then my math wasn't good enough for physics. And I wanted to know the meaning of life, so I did philosophy, and developed strong antibodies to that, so then I did psychology. But I did have some quantitative background, having done physics and physiology. So retrospectively it was an interesting background. It didn't happen with any design; it just kind of happened. But I think you need to have questions that you're driven by, and not just techniques. It's more important to have questions that really excite you, that you'd do anything to find the answer to, than to just be very good at some technique. However, I wish I'd learned more math when I was young. I wish I didn't find linear algebra complicated. 

 

Pieter Abbeel 

Next question is from Khalid Saifullah. How conscious do you think, if at all, are today's neural networks? 

 

Geoff Hinton 

I guess Ilya would say just a little bit, and get lots of flak for saying that. I have a view about consciousness. About a hundred years ago, if you asked people what distinguishes living things from dead things, they’d say, well, living things have vital force and dead things don't. And if you asked, what’s vital force? They’d say, well, it's what living things have. And then we developed biochemistry and we understood how biochemical processes work. But since then, people haven't talked about vital force. It's not that we don't have vital force. We still have vital force, if we ever had it. It's just not a useful concept anymore, because we understand in detail how things work at the biochemical level, and we understand that the organs break down when they don't get enough oxygen, and then you're dead, and then it all decays. It's not like some vital force left the body and went to heaven; it’s that the biochemistry just packed up on you. So I think the same is going to be true of consciousness. I think consciousness is a pre-scientific concept, and I think that's why people are very bad at defining it and everybody disagrees. And I don't have any use for it. There are many related concepts, like: are you aware of what's going on in your surroundings? If Muhammad Ali hits you on the chin, you're not aware of what's going on in your surroundings, and we use the word unconscious for that. That's one meaning of unconscious. But if I'm driving without thinking about what I'm doing, that's another meaning of unconscious. We have all these different meanings. My view of consciousness is that it's a kind of primitive attempt to deal with what's going on in the mind by giving it a name and assuming it's some essence that explains everything. Here’s a similar analogy for cars. If you don't understand much about cars, I can tell you how cars work. Cars have oomph, and some cars have more oomph than others. 
Like one of these Teslas with big batteries has a lot of oomph, and a little Mini doesn't have much, especially if it's old. And that's how cars work: some have more oomph than others. And obviously, if you want to understand cars, it's really important to understand oomph. Now, as soon as you get down to understanding oomph, you start understanding how engines work, and torque and energy and how it's converted and all that stuff. But as soon as you start understanding that, you stop using the word oomph. And I think it's going to be like that for us with consciousness. 

 

Pieter Abbeel 

I love this explanation, Geoff. Farzana Mary Azad asks, ML once started with roots in human psychology. Do you see ML advancements today having the capacity to help us better understand human psychology in the future? Like seeing people as neural networks or classifiers, and their cognitive distortions as similar to under- or overfitting, and so forth. 

 

Geoff Hinton 

Yes, I do. I strongly believe that. I strongly believe that when we eventually understand how the brain works, that's going to give us lots of psychological insight too. Just as understanding chemistry at the atomic level, understanding how molecules bump into each other and what happens, gives us lots of insight into the gas laws. The fine-level understanding is important, and it does give rise to understanding what's going on at higher levels. And I think it's going to be very hard to get satisfactory explanations of a lot of things at the higher levels, things like schizophrenia, for example, without understanding the details of how it all works. 

 

Pieter Abbeel 

Well, thank you, Geoff, for making the time for the additional Q&A section with questions from our audience. Wow. What a way to wrap up season two. Thanks so much for all the great questions for Geoff. And thanks for listening to the podcast. If you enjoy the show, please consider giving us a rating. And please recommend us to your friends and colleagues.