Alex Kendall on The Robot Brains Season 2 Episode 8
Transcript edited for clarity
Pieter Abbeel: Today's guest is Alex Kendall, the co-founder and CEO of Wayve, the London-based company pioneering technology to enable autonomous vehicles to drive in complex, never-seen-before environments. Alex is a world expert in deep learning and computer vision. Before founding Wayve, Alex was a research fellow at Cambridge University, where he earned his Ph.D. in Computer Vision and Robotics. His research received numerous awards for scientific impact and made significant contributions to the field of computer vision and A.I. Wayve is building AV 2.0, a next-generation autonomous driving system that can quickly and safely adapt to new driving domains anywhere in the world. We'll talk more about what AV 2.0 means exactly and the technical significance of it in our chat. Personally, I'm a big fan of Alex's work and very thankful to have had the opportunity to be a small investor in his company. Alex, so good to have you. Welcome to the show.
Alex Kendall: Thank you. I'm really excited to be here.
Pieter: So like so many great startups, Wayve was started in a garage, a garage in Cambridge specifically. Can you say a bit more? This is 2017. You're starting this company. Every company is trying to hire anyone in AI, as is still very much the case today, and you could have taken any job anywhere. You decide to start your own company. How do you come to that decision?
Alex: It was an exciting time. I had just finished an amazing Ph.D. experience where I got to work on some of the early stages of computer vision, teaching robots to see what was around them so they could understand their environment in real time. At the same time, we were just seeing the work that yourselves and others had been doing in reinforcement learning that let machines make some really advanced decisions in problems that, a few years ago, were thought to be maybe decades away from being solved. It was the intersection of these two ideas, being able to make decisions in real time and being able to understand the world around us, that led to this leap of thought that maybe we can build autonomous vehicles that can really make their own decisions and understand how they move around. And of course, we started in very modest ways. We did get a house, we got a garage. We had some very fun times building out computer equipment. The small bedroom had a four kilowatt server with 50 kWh, and that was our heating for the whole house. We had cables going out the window to the garage, and our neighbors got very used to seeing our car drive around the block and slowly learn to drive. But it was bringing all these things together, and ultimately the highlight of that seed stage of starting our team and our idea was being able to get a system live on the car that could learn on its own how to lane follow. At the time, the reinforcement learning ideas we'd seen in simulation needed tens of millions of episodes to learn to make decisions.
For us, it was getting a car on a quiet country road in Cambridge and having it initialized randomly, so it had no knowledge about the world, and setting it off driving down this road. Every time it made a mistake and drove off the road, a safety driver sitting in the front seat would grab the steering wheel and correct it. It wouldn't know what it did wrong. From just 10 examples of that correction, it could figure out the patterns in the data that let it lane follow. And with just 10 examples over a couple of minutes of training, it could lane follow and drive up and down this road as many times as we liked. Seeing a physical robot in front of us learn to drive in the physical world, not in simulation like we'd seen in the video game realm, but actually embodied in the real world, for me that was really exciting and gave us a glimpse of the future.
Pieter: And so you are on the roads in Cambridge, correcting this car periodically, and it's learning to drive and follow lanes at the same time. I mean, this is very exciting, of course, but at the same time, it's 2017. Waymo has been around for a while. Aurora has been around for a while. We actually had Chris on the show in season one. Tesla has been at it for a while, and we had Andrej Karpathy on the show earlier. And so what made you think that there is something that is not being done yet that you could go do here, given these big efforts that were already under way?
Alex: Yeah, I was fortunate enough to spend some time in the San Francisco Bay Area, and I was familiar with all of the work that was going on there. I'd read about and was inspired by the DARPA Grand Challenges that kicked off this first generation of autonomous vehicles. But I think it was a number of things that really led us to believe that we could build something that could do more advanced things and behave in more intelligent ways than those systems could. It was this idea of having a system that was intelligent enough. As you say, the recruitment processes at the time were insane, and I was able to see and demo the experiences of these traditional, classical self-driving cars. But they weren't what I call embodied and intelligent. They weren't making decisions in the way that was the promise of autonomy. They weren't coexisting with other human drivers and doing things in a natural way. When I go and drive around London, there are so many different environments where these cars have to interact with each other, naturally merge, engage, make unprotected turns, even nudge their way through cyclists and pedestrians. All of these kinds of behaviors need a level of intelligence that lets you understand the dynamics and the complexity of the world around you. I couldn't see a route for that traditional approach to be able to bridge that gap. The more rules you hand-code into the system, not to mention the cost and the complexity of engineering time it requires, I just couldn't see it. I had tried in a previous period to create rule-based systems that could do that, and I couldn't see that happening. I was fascinated and amazed with what the machine learning world was doing.
And someone very close to me once said, and I really believe this, that in your career you need to be basing your decisions not on what's possible today, but on where the trends are going. And everything I was reading and seeing and immersing myself in just pointed to the fact that this will be the future. So I wanted to be part of creating it.
Pieter: As I'm hearing this, Alex, it's very interesting because a lot of people would say a more pure learning approach is going to be too hard. It's not going to work. And that's exactly why many of these efforts have learning for the vision system, for the perception system, but then non-learning for other parts of the system. And I remember you and your co-founder talking with me, I think it was at NeurIPS in Montreal at the time. You were just getting started, saying, "We think we can learn all the way, end-to-end learning." Can you say a bit more about that? Can you unpack what it means to do end-to-end learning for self-driving versus the typical approach followed in other places right now?
Alex: Yeah, sure. So the typical approach for autonomous driving is one where you break it down into a number of different components. You have perception, where you extract the state of the world: where the road lanes are, where other vehicles are, and you might track them. Then you have planning, where you plan how you want to interact with it, and then control, where you actually control the vehicle. You have other components like high definition maps that feed in prior knowledge about the world. Essentially, you have all of these different components that, summed together, create an autonomous driving system. The challenge there is that, while an individual component might use machine learning, the way they interface and are put together is largely hand-designed through human-designed interfaces. That has a number of challenges, because you need to put large teams, maybe 30 or 50 people, around each component. It also introduces a lot of errors that essentially compound as you go through the stack. What machine learning is able to do, and what the end-to-end approach is, is replace the entire architecture with one single giant neural network. And the beauty of this is you let data do the hard work. By using data, you can train this to build representations and uncover behaviors that are more complex than we can hand-code as humans. More specifically, what it means is that we have a trainable neural network that goes from sensor input, so we have a calibrated, you know, an accelerated sensor platform that feeds in sensor input that we then pass through this neural network, and it gives us a motion plan as an output. We take that motion plan and we can then execute it on a vehicle in a way that's agnostic and scalable to many different use cases or vehicles or driving domains.
But it's building that driving intelligence into a neural network, one that has the flexibility to really drive and understand its environment, like you might when getting in a new vehicle or a new city or a new domain, and training that in a system that is powerful enough to learn the function that maps from sensory input to motion plan output. That's what we've spent the last four years building.
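As a rough illustration of the "sensor input in, motion plan out" idea described above, here is a minimal sketch in plain Python. It is not Wayve's architecture: the class name `EndToEndDriver`, the layer sizes, and the choice of a list of (x, y) waypoints as the motion plan are all assumptions made for the example, and the weights are random rather than trained.

```python
import math
import random
from typing import List, Tuple

Waypoint = Tuple[float, float]  # hypothetical (x, y) offsets in the vehicle frame


class EndToEndDriver:
    """Toy stand-in for a single network mapping sensor input to a motion plan."""

    def __init__(self, n_inputs: int, n_hidden: int, n_waypoints: int, seed: int = 0):
        rng = random.Random(seed)
        self.n_waypoints = n_waypoints
        # Randomly initialised weights; in a real system these are learned from data.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                   for _ in range(2 * n_waypoints)]

    def plan(self, sensors: List[float]) -> List[Waypoint]:
        # Hidden layer: an internal representation, with no hand-designed
        # perception/planning interface in between.
        hidden = [math.tanh(sum(w * s for w, s in zip(row, sensors)))
                  for row in self.w1]
        # Output layer: a flat vector of 2 * n_waypoints numbers.
        flat = [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]
        # Reshape the flat output into (x, y) waypoints.
        return [(flat[2 * i], flat[2 * i + 1]) for i in range(self.n_waypoints)]


driver = EndToEndDriver(n_inputs=8, n_hidden=16, n_waypoints=5)
motion_plan = driver.plan([0.5] * 8)  # made-up flattened sensor reading
```

The point of the sketch is only the shape of the function: one trainable mapping from raw input to plan, which a separate controller would then execute on the vehicle.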
Pieter: And it's so interesting to me that this is what you set out to do in 2017. And just a few weeks ago, you released a paper describing AV 2.0 and how self-driving needs to be rethought, and when I'm reading through it, it actually matches very closely with the vision you already had in 2017 of what AV should be. And so I'm kind of curious, you've been thinking this for a while, what triggered you to write this up? And what exactly is this AV 2.0 vision that you're now communicating with the wider world in a very explicit way?
Alex: I think that's been one of the most interesting things over the years: we have had this very consistent vision for a long time. A lot of startups pivot, change and adjust. And we've certainly learned things along the way. But what we've done each year is come back with more and more evidence that this is the approach that will scale. At the same time, we've seen more and more evidence that the traditional approach isn't scaling. It's giving us some pilots and some very limited deployment areas, but it's not that future that we've been promised from autonomy. So I guess the reason for starting to talk more about it is really a function of the maturity of the technology. What we were doing in the first couple of years was really exploring many different approaches for how this might work. Everything from model-free to model-based reinforcement learning to imitation learning, a bunch of different families of algorithms that can produce such a system, AV 2.0. We tried all of them and learned what didn't work well and what might scale. The next couple of years was really building out the infrastructure. One of the things that I have learned is that machine learning is really fundamentally a data engineering problem. It's programming by data, and building the infrastructure to be able to train the system has required a really sustained effort. I'm very, very thankful for the amazing team I get to work with that has been able to build this out. It's not just a case of writing a few lines of code and training a neural network like you might do in a research setting. It's producing petabyte-scale access to the video data that we have to train the system, and running it through large distributed computing nodes to train it. That's what we then built out.
Now we're in a position where we have what I'd say is a very mature prototype of the system, one that can demonstrate the full suite of driving behaviors that you need to drive in an urban environment: traffic lights, roundabouts, stop signs, lane changing, all of these kinds of behaviors. And we're really excited that we're now in a position to actually trial this. We'll be on the roads next year, in early 2022, doing last mile grocery delivery in London, and you'll be able to get your groceries delivered by one of our autonomous vans. With that, we see the need to go and actually communicate and share what we're doing. I think it's really important that we are able to engage with both the scientific and the technical communities, share what we're building, get feedback, and also leverage the collective intelligence of our entire research field to be able to push this forward the fastest, because this truly is, I think, one of the biggest opportunities and problems of our generation. And I just really want to see it realized very quickly.
Pieter: Now you said there that people might see their groceries delivered by a Wayve A.I. system driving a van. Can you say more about that? What exactly is going on there? Because it seems for a lot of self-driving cars, it's pure R&D. And here it seems like we can watch it in action. And where do we order groceries to see you or your car deliver it to us?
Alex: Yeah, we're working with some wonderful partners here in the U.K. and essentially integrating our autonomy platform into their vehicle operations. This will allow us to start testing the ability for them to deliver groceries in a last mile setting. So the problem is really driving from a fulfillment center to your home to deliver groceries. I think this is a really interesting first use case for our system. But of course, the way that we're building it and the way that we're constructing it is as a general learning framework that can really learn from all kinds of data that it gets access to and build up the intelligence to be deployed on many different platforms. Now the other fantastic thing about working with some of these large grocery retailers is that they have absolutely gigantic fleets of tens of thousands of vehicles. By working with these vehicles, we can also get access to the training data we need. I think the recipe that we've really seen work again and again in machine learning is: how can you get large-scale quantities of data, sufficient compute, and a large enough neural network that has the right structure to capture and model the problem that you're trying to solve? Putting these things together with the data that we're able to capture from these fleets, I think this is what's most interesting here. It's probably worth mentioning the nuances of the data, though, because yes, you can create a bunch of supervised data, where you have humans that actually teach the car how to drive directly. Or you might, like the example I gave before, have safety drivers that correct it every time it makes a mistake. But the way to really scale these systems, I think, is with self-supervised learning, where you're able to learn from just very large-scale, unstructured data of the domain you're operating in.
And that's what we do with these fleet partners we work with. We deploy, in small amounts, essentially our sensing platform on their vehicles and collect the data that they experience with their human-driven fleet. What that gives us is very large-scale access to data where we can observe how they drive and learn how the dynamics of the world work, how they drive around, and all the cases they experience every day, where we see weird and wonderful things on the road that our own fleet might not see itself. Having access to that scale of data, of edge cases and experience, I think that's what really gives us the diversity and the magnitude of data to be able to train such a system.
Pieter: Now I'm curious, when you say self-supervised, of course, I'm wondering what kind of self-supervision are you putting in there?
Alex: Yeah, I think the key signals that we can get, and these are things where I think the computer vision community has really come of age over the last few years, are concepts of geometry, motion and future prediction. Self-supervised learning is the concept of essentially using patterns in the data itself to learn representations that are ultimately useful for an end task. The ones that are really powerful in the driving domain use some of the structure of the world around us to understand motion and geometry, as well as what actually plays out in these driving experiences, to be able to teach our system to predict the future. So understanding the motion of things and how they move and interact is really what this data is well-placed for.
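To make the future-prediction signal concrete, here is a minimal sketch of how unlabeled driving logs can be turned into training pairs with no human annotation: the "label" for a window of past frames is simply a frame that was recorded later. The function name `future_prediction_pairs` and the parameters `context` and `horizon` are hypothetical, and the frames are placeholder strings rather than real sensor data.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def future_prediction_pairs(
    frames: Sequence[Frame], context: int, horizon: int
) -> List[Tuple[List[Frame], Frame]]:
    """Slice a recorded sequence into (past window, future frame) pairs.

    context: number of past frames the model gets to see.
    horizon: how many steps after the window the target frame lies.
    """
    if context < 1 or horizon < 1:
        raise ValueError("context and horizon must be at least 1")
    pairs: List[Tuple[List[Frame], Frame]] = []
    last_start = len(frames) - context - horizon
    for start in range(last_start + 1):
        past = list(frames[start:start + context])
        # The supervision comes from the data itself: what actually happened next.
        future = frames[start + context + horizon - 1]
        pairs.append((past, future))
    return pairs


log = ["f0", "f1", "f2", "f3", "f4"]  # placeholder frames from one drive
pairs = future_prediction_pairs(log, context=2, horizon=1)
```

A model trained to map each past window to its future frame has to pick up on motion and scene structure to do well, which is the kind of representation described in the answer above.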
Pieter: I think it's very interesting that you're starting with grocery deliveries. What we've seen in the self-driving space, at least from where I'm sitting, and I'm curious what you think, is that for a long time it was an R&D thing. And then in the last few years, different companies have claimed different first adoption cases as the one that's in some sense the right one to start with. So some companies will say highway trucking is the thing that will be commercialized first, and that's the way to bootstrap everything else. Others have said, "Hey, we're going to deliver passengers," like Waymo. They're driving passengers around, right? Others, like Nuro, have said, "We're going to build a very small vehicle that's less likely to harm anyone because it's so small and it maybe moves more slowly." It's not a full car and definitely not a full truck. And so you specifically chose grocery deliveries, but with a van, not with a small vehicle. What thinking went into that?
Alex: All of these different use cases, of course, autonomy is going to address all of them and, I think, really enhance each of them. But for me, the most impactful and interesting use case, and the one that's really going to have the largest impact on society, is operating in urban domains. That's where most of the population centers are throughout the world. That's where I think most of the energy and transportation really exists, and where we can have the biggest impact on availability, safety and accessibility of transportation solutions, as well as sustainability and making this whole system more efficient. So it's really a desire to tackle that hardest problem and, more fundamentally, to move away from the approach we've seen in autonomous driving, which, for the first generation approaches, is really trying to brute force it and get something to work once. You pick the simplest environment, going to Chandler in Arizona, and see whether you can get something to work once in a place where the sun always shines and you've got a very structured road environment. The problem I have with that is that you end up cutting corners and you end up building technology that fundamentally doesn't scale. And so for us, going straight to ride hailing and last mile delivery within urban environments really forces us to think very deeply about the problem. How can we design a system that really will scale? How can we build a business and a solution that will really impact and provide value to the world? So it's really from an assessment of what trajectory the technology is on. We firmly believe that we've got a pathway that will lead us to address this, and we're also making sure that we have a system that will scale. Every decision we make as a team is based around how we will get to scale, how we get to 100 cities first.
And that's why we've started in the urban ride hailing and last mile delivery spaces.
Pieter: I think it's so intriguing, Alex, because I think if, five years ago, you had asked people, "Where do you think self-driving efforts will launch their first services?", I don't think too many people would have said the center of London, that that's where it's going to be happening first, because, I mean, it's so much harder. But here you are saying that actually it being harder is a good thing. The harder it is, the better for your system to be developed in that kind of space.
Alex: Yeah, I mean, the data you get and the diversity you get is so much more interesting. Whereas on highways you have to drive hundreds or thousands of miles to get an interesting scenario, we get edge cases every hundred meters. I mean, you take a road out of our office and immediately there's a roadworks scenario there. Then you have a roundabout, an interesting merge scenario. All of these things are just on our doorstep. And for me, the most exciting thing is when I get out in the vehicle every couple of weeks and see a new behavior for the first time, like I remember the first time I experienced it complete an unprotected turn. Just to put this in context, this system is driving on roads it's never seen before, in scenarios where we haven't hand-coded how it should behave. We've just shown it a lot of data, and the prevailing pattern in that data is that it should stop at red lights or it should give way to other vehicles. But we haven't told it how and what to do. It has just uncovered those patterns from the data itself. And then to get in the car and see it embody those behaviors and experience it in the real world, you know, overtaking a double-parked car or doing an unprotected turn or navigating onto the wrong side of the road to get around a roadworks scenario when it's safe, and for it to understand all of these complex things. And most importantly, for such a small team to be able to program the full suite of driving behaviors and do that through machine learning. For me, those moments are the most special, and it's really exciting to see around London.
Pieter: And you shared a glimpse of that with us a few weeks ago. You had a post on Twitter showing 15 minutes of driving along streets where the car hadn't been before. And what I thought was really amazing here is that, well, first, the car hadn't been there before, so it couldn't rely on having pre-mapped in detail what's there. And second, that you released a video that is a continuous stream, no cuts. Not just the highlights, just a stream of what happens, which to me is something we also do at Covariant. Showing uncut video of systems in action is really a way for people to understand what it can do, or cannot do at times, of course. I mean, it gives an honest understanding of where things are at versus some highlights, and I thought it was really intriguing because it was navigating pretty complicated traffic. And it just did it, just like you described. I'm curious, how do you measure progress? Because, I mean, those videos are great, but you need to have some more quantitative metrics too, of course. How do you go about that?
Alex: Yeah, having the right measurement in place is so important. If you think about what we've built, it's this giant fleet learning system, a system that can take in data from our data collection vehicles or our fleet operations, ingest that into the cloud, train on it with a curriculum that's actively evolving, validate it and then put it back on the road. But we need an objective function. In machine learning, you need a loss function that you're trying to optimize for. Now, in autonomous driving, many people tend to use miles or kilometers per intervention, but the challenge with that metric is that, like any reward function or objective function, it can be gamed. And there are many, many good examples of systems that game objective functions. One of the ones I like: if you think about the concept of a robot vacuum cleaner and you give it a reward for collecting dust, it might collect all of the dust it can and get the maximum reward. Then what does it do? It could reward hack and decide to knock down walls to create more dust, therefore getting more reward. But you've put something wrong in the system. You can think of many of these kinds of examples, but with driving, it's so important to get that right, so that you create the right objective function that produces a system that can intelligently behave within society. For us, that's got to be the right combination of user feedback, making sure that we actually address what the humans that these machines serve are trying to do, but also looking at things on a per-scenario basis. So when you think about your driving scenarios, you can break down what you call your operational design domain, how you think about driving, into different factors: the static scenario, the dynamic scenario and the environmental conditions.
And ultimately, there are about 14,000 factors that you can break this down into if you go very granular. Now what we've built is a system such that, when our system arbitrarily drives, and it can drive anywhere, it's got the intelligence to drive to places it hasn't seen before, we're able to classify, at every time step along that drive, where in that 14,000-dimensional matrix it is at that point in time. Is it going through a roundabout in the rain with a cyclist beside it? We can do this through computer vision and through other databases that we have access to about things like weather, road environments and things like that. The ability to automatically classify, at different times, where it is in that operational design domain gives us the ability to understand performance at a very granular level and in a way that allows us to compare like for like. So we can compare the difficulty of routes, factor out the difficulty and actually compare. Because one of the things in machine learning that you can get very used to is training and testing on a static dataset. In my Ph.D., I worked on a lot of computer vision datasets where you would train a model, whether it's stereo vision, semantic segmentation or classification, and you'd have this static dataset where you just train and test on the same thing and you'd continually optimize that number to go up. With autonomous driving, when we go and train our system to drive, we might test one model one day and then another model the next day, and we want to know which one's better. But each time you drive, you get an entirely different set of scenarios on the road. Even if you drive the same road twice, the weather is different, the traffic will be different, there might be roadworks, all of this kind of stuff.
And so by being able to classify every time step as we go along into this, we can then start to compare like for like how it performs in similar environments. If you test over about 100 kilometers for a new model, then you can actually start to get signal and compare things. Not only that, but we can target deployments to actually go out and find those scenarios. Our safety driver team are really like driving instructors for our robots. We get them to go out and find the interventions, find those examples, so that we can understand how to improve and we can learn from that signal. And so ultimately, to go back to your original question on how we measure performance: we look at performance within a target deployment domain. We look at how successfully we can drive autonomously, in a way that's both safe and trustworthy, through a particular part of that operational design domain. What percentage of the time do we do that to an acceptable level?
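The like-for-like comparison described above can be sketched as bucketing each driven segment by its scenario factors and comparing two models only on buckets both have visited. This is a toy illustration, not Wayve's evaluation system: the `Segment` fields, the three-factor bucket key, the helper names and the example data are all assumptions, and a real operational design domain would have far more factors.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Segment(NamedTuple):
    static: str       # e.g. "roundabout"
    dynamic: str      # e.g. "cyclist_beside"
    environment: str  # e.g. "rain"
    success: bool     # driven autonomously to an acceptable level


Bucket = Tuple[str, str, str]


def success_by_bucket(segments: Iterable[Segment]) -> Dict[Bucket, float]:
    """Fraction of successful segments per (static, dynamic, environment) bucket."""
    totals: Dict[Bucket, int] = defaultdict(int)
    wins: Dict[Bucket, int] = defaultdict(int)
    for seg in segments:
        key = (seg.static, seg.dynamic, seg.environment)
        totals[key] += 1
        wins[key] += int(seg.success)
    return {key: wins[key] / totals[key] for key in totals}


def compare_like_for_like(
    model_a: Iterable[Segment], model_b: Iterable[Segment]
) -> Dict[Bucket, float]:
    """Success-rate change (B minus A), restricted to buckets both models drove."""
    rate_a, rate_b = success_by_bucket(model_a), success_by_bucket(model_b)
    shared = rate_a.keys() & rate_b.keys()
    return {key: rate_b[key] - rate_a[key] for key in shared}


# Made-up test drives on two different days with two different models.
drive_a = [
    Segment("roundabout", "cyclist_beside", "rain", False),
    Segment("roundabout", "cyclist_beside", "rain", True),
    Segment("straight", "no_traffic", "sun", True),
]
drive_b = [
    Segment("roundabout", "cyclist_beside", "rain", True),
    Segment("roundabout", "cyclist_beside", "rain", True),
    Segment("junction", "queue", "sun", False),
]
delta = compare_like_for_like(drive_a, drive_b)
```

Buckets that only one drive happened to encounter are left out of the comparison, so an easy or hard route on a given day does not distort the result.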
Pieter: And that's really intriguing. And I mean, the contrast would be maybe the standard metric that's often been reported in the past, which is the number of miles driven on average per human intervention. And that's what, I believe, you're required to report in California if you're testing your vehicles there, where a lot of the AV efforts are testing. And this seems a much broader spectrum of evaluation that you're looking at than miles driven per intervention.
Alex: Yeah, that's right. Because miles driven per intervention are not equivalent. For example, let's say you're on a highway in Arizona and it's sunny and you can see everything. You can do many miles per intervention with the same level of driving intelligence compared to London, where you have rain and you've got busy traffic everywhere. You might go a couple of blocks and hit a scenario that your system can't drive through. So comparing them by this metric really isn't a fair comparison. And that's why I think you need to look at things on a per-scenario basis. It also covers up things like tele-operation or tele-assistance, where many systems will just dial in and have a human remotely drive the vehicle. I think that also covers up the driving intelligence that we're trying to design. To really unlock autonomy, you need a system that you can trust, that is safe and is intelligent enough to really enhance our lives as humanity. And I don't know, if we're having a system that every tens of kilometers needs a tele-assistance system to dial in, then that doesn't achieve that for me.
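A small arithmetic sketch shows why the aggregate number can mislead. Below, two hypothetical test logs have exactly the same per-domain performance, and differ only in how much of their driving was on a sunny highway versus a rainy city. All domain names and figures are invented for illustration.

```python
from typing import Dict, Tuple

# domain -> (kilometres driven, interventions); every number here is made up
Log = Dict[str, Tuple[float, int]]


def km_per_intervention(log: Log) -> float:
    """Aggregate kilometres per intervention across all domains."""
    km = sum(distance for distance, _ in log.values())
    interventions = sum(count for _, count in log.values())
    return km / interventions if interventions else float("inf")


def km_per_intervention_by_domain(log: Log) -> Dict[str, float]:
    """The same metric, reported separately for each domain."""
    return {
        domain: (distance / count if count else float("inf"))
        for domain, (distance, count) in log.items()
    }


# Same driving ability in both logs: 100 km per intervention on the highway,
# 10 km per intervention in the city. Only the mix of kilometres differs.
highway_heavy: Log = {"sunny_highway": (900.0, 9), "rainy_city": (100.0, 10)}
city_heavy: Log = {"sunny_highway": (100.0, 1), "rainy_city": (900.0, 90)}

aggregate_highway_heavy = km_per_intervention(highway_heavy)  # 1000 / 19
aggregate_city_heavy = km_per_intervention(city_heavy)        # 1000 / 91
```

The headline figure differs by more than a factor of four even though the per-domain numbers are identical, which is the sense in which the comparison is not fair without breaking it down by scenario.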
Pieter: Now talking about full autonomy, you said that Wayve will power some of the deliveries that are going to happen now. When we see such a Wayve-powered delivery van, is there still going to be a person in there?
Alex: Yeah. To begin with, we will have safety drivers, who to begin with will help teach our system and then monitor the system while we validate that it is reliable and acceptably safe to launch. But ultimately, these will be vehicles that won't have humans in them. And so that also brings up another big question: what is the human-robot interaction? How do we, as a society, interact with these robots? I'm hugely excited about that, and I think there are massive questions to answer there and to learn about. Why I think this is one of the most important and pivotal problems for society today is that it's the first time machines and artificial intelligence are going to have a physical interface with society, one that's not opt-in. For example, if you wear augmented reality glasses or if you buy a vacuum cleaning robot, those are opt-in, enclosed environments where you subscribe to those experiences or services. Autonomous driving is something that is going to be driving on the roads, and as a pedestrian on the footpath, you're not going to explicitly consent to it being there. It's just something that inherently you have to interact with and trust. And that's a huge moment in the history of technology. This is a time where we're going to have robots really going around our urban environments and society in general, and adding value to our lives in a way that we trust and that enhances what we're doing. So I think that's a huge responsibility, a huge moment in history and one that really requires a massive set of capabilities in terms of intelligence and behavior. But most importantly, the way we interact with these vehicles, we've got to get that right.
There are going to be new behavioral norms. For example, while these vehicles will be very safe, we need to figure out how they actually interact with people: how to load and unload groceries or parcels, or how to let people in or out of those vehicles. If people want to get out of the vehicle at any time, how do they stop safely and how do we facilitate that experience? And even, how do you communicate? I think it won't be too long until we see a natural language interface to communicate with these vehicles. How can you backseat drive them? How can you tell them where and how you want to drive? Maybe you want to get somewhere more assertively, or maybe you're more relaxed and you want to take a scenic route? I think all of these kinds of things will come together with technology much quicker than we expect. And that, for me, will produce the future that we've been promised and that we'll see with autonomy.
Pieter: In fact, recently you wrote an op-ed in Wired commenting on, you know, how this transition is going to happen in the foreseeable future and what it'll entail to bring autonomous vehicles into our worlds. And the thing that really stands out to me is the question of how to build the trust, how to have a pedestrian on the street feel just as comfortable or ideally more comfortable with the self-driving car than with a car with a driver in it? And how do you see that path to start building that trust?
Alex: I think trust is really important for autonomy. If I think about people I trust in my life, I think trust is built over a long period of time of joint experiences where you meet or exceed expectations. So if I break down how I feel when I trust others, what does that mean? I think that means that we need to be very transparent and clear about how we expect this technology to behave and what we should expect it to do for us as a society. I think it means that the performance needs to be very high and we need to be accurate with what we do. I think there are a number of assumptions people usually make when it comes to trust for autonomy. One of them is that we should be able to causally reason about what the system does and be transparent about why it is making the decision it is. And while there are many benefits to that, I don't think it is strictly necessary for trust in these systems, because I think you can get it by setting the right expectations and performing accurately to those. I think that there needs to be accountability, though: if, unfortunately, something were to go wrong, then there needs to be some accountability and recourse to improving the system. But you can draw many analogies here to other technologies that we trust within our lives. When you get on an airplane, you trust it to take you from A to B, but a lot of people won't understand the actual aerodynamics of the aircraft or the way the avionics control it. It meets expectations and it's accurate and safe, certainly in this day and age. I guess the difference when you think about autonomous driving is that driving is something that a lot of us as humans do, and it's something we're very familiar with. Not many of us are pilots, but many of us do actually drive vehicles.
And so we have a higher understanding of that process and higher expectations. So there, I think it's very important to communicate both the expectations and the value the system brings, and then ultimately be very confident and evidence-driven about how and why the system will behave that way. I think that's a recipe for building a system that ultimately will build up long-term value. The failure modes are really where you start to overpromise, or when you start to put it in situations that it's perhaps not designed for or that are out of its domain. I think we've got to be very careful as an industry to make sure that we are transparent about the capabilities of the technology and also focused on continually learning, adapting and improving the technology as we learn more about the impact and the needs of society over time.
Pieter: One of the big debates related to self-driving is whether cameras are enough or whether you also need other sensory modalities to have a reliable, trustable self-driving car. And so I'm curious, where do you stand, Alex? And where is Wayve in the current process?
Alex: Both my personal and Wayve’s position on sensing is that we are agnostic to the sensing platform. We will use the safest and most scalable sensing platform that is available, because ultimately the technology we've built can learn to drive based on data. If you change the sensing platform, it's just a matter of getting driving data with that new sensing modality and learning the patterns in that data to drive from that sensor. Having said that, though, it's got to be able to scale. And for me, a camera-first approach has all of the visibility and all of the signals you need to drive. Cameras are passive sensors, ones that are already manufactured at scale today, and sensors we understand very well. So I think from a safety perspective, from an information content perspective and from a scalability perspective, there is nothing better today. That's not to say there won't be new sensing types invented in the future that can go beyond. And of course, adding more sensors strictly gives you more information, as long as your system is very good at being attentive to the right signal and can deal with perhaps a lower signal-to-noise ratio as you add more sensors. So adding more sensors, when it is affordable, scalable and safe to do so, should theoretically improve the system. But for us, cameras give us everything about appearance, motion and geometry, not to mention the fact that there is a great existence proof: humans can do many intelligent things with the visual spectrum as our input alone. Many people also overlook the fact that there are inertial sensors, GPS and microphones in autonomous vehicles, and those are important as well. But the bulk of the intelligent decision making that our vehicle makes, and how I think autonomy can be scaled, is with the visual spectrum.
Pieter: Now, as you look at the trajectory of the company, I heard you had some exciting recent milestones. Can you say a bit about that?
Alex: There have been lots of really exciting things that our team has gone through this year. I could list a bunch of them, but the one you may be referring to relates to one of the core promises of our system, of end-to-end AV technology: that it can learn to drive in a way that really generalizes and really scales. This is something that we set a really bold mission to go and demonstrate this year, not just in theory but in practice. So at the start of the year, we set out a mission to take this technology and demonstrate that it could drive in multiple cities and really scale. What we did is we trained and developed the system on London training data. And then, just in the last couple of months, we've gone out to six different U.K. cities, big cities like Cambridge, Manchester, Leeds and Coventry, and have shown that we can drive in these new cities despite never having been there before, never having seen data from there before, and not having access to an HD map of these cities. This is fantastic, right? Because this even surprised me. I thought we'd have to go to these cities and get a little bit of local training data to adapt to those new distributions. But it turns out that the neural networks that we train are intelligent enough to generalize from London data to these different types of cities. And these cities experience different weather and have different road layouts. For example, Cambridge has many cobblestone streets. It's a university town, so there are lots of cyclists around. Or you go to Manchester in the north and it gets hard around the football stadium there, and you get different multi-agent behaviors with a lot of the fans around Manchester United.
And so there are many different scenarios, and for us to have a system that was intelligent enough to generalize to these new cities and get similar performance metrics in these cities as we see in London was really exciting. And we're talking about long, continuous, uninterrupted episodes of autonomy. We could see it drive for 10 minutes, handling amazingly complex scenarios in the heart of these cities, and do it all in a zero-shot setting without any training data from that city itself.
Pieter: Well, that must have been quite the feeling because that's in some sense a full validation of everything you've been thinking of how it would go. That training in London is the right place to get your training data, and it just worked. What was the first day like?
Alex: It was a really exciting one. Our operations team were absolutely fantastic on the front line with the vehicles. I remember when they took it up to the first city, Cambridge, where we started the company, and went in and drove around and got some early data. I was just clicking refresh on a console that gives us access to the data replay while the data was being ingested, waiting for the videos to come up and looking at the experience and performance. And for me, that was hugely exciting because, you're right, it validates that we can build a system that can learn and drive in London and focus on getting that system to launch. And once we've done that, it will scale horizontally and effortlessly with data globally. For me, that's an absolute validation of our approach and our ability to build autonomy for everyone, everywhere.
Pieter: Now, another big milestone in the driverless space as a whole is that GM's Cruise just got a permit to give driverless rides in San Francisco. And I'm curious about your thoughts on that.
Alex: I think it's a really exciting milestone. I think both Cruise and Waymo have just received their permits, and it's a fantastic one for the industry because, look, we've seen over a hundred billion dollars invested in the space and it's created this entire industry. It's put in place regulation and legislation. It's put in place education and demand for this technology. And we're seeing these first tests of legislative processes to actually enable this technology now. We've explained why the traditional approach to autonomy isn't scaling, but the fact that we're stress testing all of these processes and standing things up from zero to one for the first time is fantastic. And I think there are a lot of really interesting problems to solve outside the technology itself. So I'm really pleased to see the growing maturity in the ecosystem surrounding autonomy, and now we just need a technology that can really behave and drive in complex areas, and to deliver it to the world.
Pieter: Now take us a little back in time, Alex. You weren't born running a self-driving company. As I understand it, you grew up in New Zealand. Is that right?
Alex: Yes. In Christchurch, in the South Island.
Pieter: And growing up in New Zealand, how did you get intrigued by artificial intelligence and how did you end up in the UK working on it?
Alex: Well, this is a long story or a not so long story, depending on how you look at it. But it was a fascination with many different systems and machines, and a childhood of building tree huts, playing with Lego, making computer games and also exploring the amazing natural world that we're lucky enough to have in New Zealand. And I think this sense of adventure that my family, particularly my mum and dad, really instilled in me set me up with the right values and principles to really get excited about growing and building this kind of technology. But it was, I think, early in my university days. I went to the University of Auckland and studied mechatronics engineering. It was an amazing course because it sort of brushed the surface of mechanical, electrical and software engineering and was a really generalist engineering course. And I was fascinated with machines that had all of this complexity in them. For me, that was robotics. It's really the intersection of all of these different disciplines. It's a big multidisciplinary problem, not to mention disciplines outside engineering as well. But building and working on cars is very complex and capital intensive. And so I started off working on drones, and that was technology that I could get my hands on. You could build one on a modest student budget, and I had a lot of fun putting together drones before you could buy them off the shelf. We had to put together the parts and program them with a basic PID controller to keep them stable. I put one together, and I used to love going to our farm in Christchurch and chasing sheep and other things around with these drones. But yeah, putting these machines together, I think it was just the fascination with all the challenges. And I think it was seeing them do things that were just inspiring or amazing to me as a human that led me down the path of excitement about robotics. But it wasn't until, well, I was fortunate enough.
There's an amazing scholarship scheme in New Zealand that sends people to Cambridge University called the Woolf Fisher Scholarship, which I was fortunate enough to receive. It sent me to Cambridge, which was another eye-opening experience, across the world and in a completely different culture that I really relished. I went there to study control theory and originally wrote papers which explicitly modeled the world using analytical equations. It wasn't until I started to read about machine learning, in particular in the computer vision space, and got inspired by some of the people there, particularly my supervisor Roberto Cipolla's work, that I started to see that machine learning and statistics could understand things that were far more complex than what could be hardcoded. And I think it was really coming into that field afresh. I'd never studied machine learning before, and coming into it without preconceived biases of how things were done before really gave me a head start from the get-go. And I think one thing I've been referencing throughout here is this theme of exploration. Whether it was in the mountains of New Zealand or going to new places, that was really core to my life. And in particular, looking at the architecture around Cambridge, I was absolutely fascinated with the geometry and the structures of these beautiful buildings, especially in Trinity College where I studied. And so I actually got into computer vision because I found structure-from-motion technology, with which we can create these amazing 3D models of these spaces. I walked around the city with my phone filming monocular video and ran it on university servers over the weekends to create these big structure-from-motion models. It was my excitement about that, and about the problem of simultaneous localization and mapping, that really led me down into computer vision and robotics.
And I should probably pause there, but it just brings back some wonderful memories about the people I got to meet and work with at that stage. From there was when I stepped into machine learning and really discovered how we could use learning to accelerate what we can understand and develop by orders of magnitude.
Pieter: Now from there you started Wayve. It’s going really, really well, you're building the future of mobility, but what does it look like when you look into the future? Five, 10, 20 years from now, what do cities look like? What does the world look like if you are successful?
Alex: I think embodied autonomy, embodied intelligence, autonomous mobile robotics is going to be the next wave of computing. This is going to be as transformative as perhaps the personal computer or the iPhone was. It's a new paradigm where we have these mobile robots that move around the world and that are connected. They have the sensing on board to be able to make intelligent decisions and interact with us in a way that is safe, sustainable and reliable, and that ultimately lets us do things more effectively. This is a really interesting world because it provides so much opportunity. It provides opportunity for us to be able to move people and goods around more effortlessly and more sustainably, but also to absolutely improve the safety of our transportation and society. But in general, I think it provides us this platform to be able to build levels of intelligence that can really allow us to uncover and understand more about what we do as a society and how this works. I can't wait to see what this uncovers. I think it will be absolutely transformative.
Pieter: Well, Alex, thank you for the wonderful conversation. It's so nice to catch up. I can't wait for my chance, hopefully in the near future, to make it out to London and maybe try a ride in one of the latest cars that you have there. Thanks again for being on.
Alex: Thank you so much, Pieter. I can't wait for you and for the whole world to ride the Wayve soon.