Cloud OnAir: Robot Games: How to Build a System to See, Understand, & React to the World

[MUSIC PLAYING] DAVE ELLIOTT: Welcome to Cloud OnAir, live webinars from Google Cloud We host these webinars every Tuesday So welcome Today, we are going to talk through robot games, how to use TensorFlow to build a system that can see, understand, and react to the world As a reminder, you can ask questions anytime on the platform And we’ve got Googlers standing by to answer the questions So let’s talk about robot games, how to build this system to see, understand, and react to the real world So to start off, I’m Dave Elliott, and I work in Google’s Cloud And I spend most of my time working on machine learning and artificial intelligence NOAH NEGREY: As do I And I also work in Cloud I’m a developer programs engineer working with the Cloud AI teams DAVE ELLIOTT: So, great Let’s go ahead and jump straight in So this is what we built. This is a project that we worked on for a few months earlier this year And this is what it looked like in its final manifestation So what you’re looking at here is four arenas of what we called AI in Motion, which is these robot games And let’s take a closer look at what this looked like You can see the individuals standing around this arena This is one of the arenas You can see the robots– the balls themselves– you can see the obstacles, and people interacting and having lots of fun with them And so what we wanted to be able to do was to take this project that Noah and I and a few other folks worked on, and make it available for anybody to go and build yourselves, and go and have some fun with your kids or your significant other So let’s go ahead and take a look at what this looks like in action itself So this is what the AI system sees And this is how it’s seeing, understanding, and reacting to the world This probably won’t make a lot of sense right now But hopefully by the end of this webinar, you’ll be able to understand what’s happening here, and again, to be able to go and do this yourself Maybe you can get some of these robots, 
and go have some fun at the holiday season OK so let’s talk about what we’re going to– actually, right up front, what I wanted to be able to share is that all of these resources are in fact available for you today So you can go to GitHub and look up Cloud Platform Next, AI in Motion And all the source code is available for you I also wanted to quickly call out the hardware It’s this relatively inexpensive, easy-to-find hardware So we used the Sphero Robotic Balls, the SPRK+ in fact And then we used Pixel and Huawei phones We actually showed this demo in London, San Francisco, and Tokyo And so you can use both the Pixel and the Huawei P20 OK, so here’s what we’re going to walk through today First, the design and architecture decisions we made And then Noah’s going to walk through the model selection and training, and the on-device deployment And then we’ll finish with Q & A And again, you can ask questions anytime We’ve got Googlers standing by to answer And Noah and I will answer what we can NOAH NEGREY: Yeah, at the end DAVE ELLIOTT: All right, so let’s go ahead and jump into the design architecture So the design goals– we had four basic design goals when we started this project The first was we wanted to be able to leverage the cloud and the real world So we want to be able to take the cloud and use the cloud for complex training– intensive and complex– but we also want to be able to deploy this in the real world, actually onto robots themselves, not just in a game on the internet or something The next is we want to leverage existing tools So great things are built on the shoulders of giants We wanted to be able to leverage those tools so that we could move quickly and get results quickly We also wanted to use as much automation as possible so that we could sort of lower the cost of training, which is one of the most significant costs of machine learning And then finally, we wanted to have fun We wanted this to be as much fun as it was educational And that’s why we 
created these robotic games The key architecture decisions we made through the process– first, I mentioned earlier, cloud training So we wanted to be able to use the cloud for rapid iteration in order to improve our models quickly, to be able to lower the cost of using the latest cutting-edge hardware– TPUs and GPUs, for instance, and to really only pay for what we use The second architectural decision we made was we wanted to make sure that we focused on optimizing the models So what that meant ultimately was we ended up with two separate models– one model to be able to see and understand the world, which Noah will walk through in a minute, for object detection And the second is we wanted to have

a model that gave the right path for direction for the gameplay And that ended up– we called it a commander model NOAH NEGREY: So Dave, we mentioned at the beginning that we both are on the Cloud team So why are we focusing on-device prediction? DAVE ELLIOTT: So it’s a good question But to me, I think it’s because of the emerging use cases that are really interesting, the disconnected world of devices, like a warehouse where you’re doing inventory management Or a manufacturing facility, a plant, where you want to look at defects in what’s being built, or security So you’ve got these disconnected, not-always-on devices, and you want to be able to do the prediction in realtime with these disconnected devices So low latency, disconnected predictions, to mimic the real world The hardware decisions were pretty straightforward We ended up looking at a bunch of really inexpensive and fun robots We selected the Sphero robots They’re really easy to work with, a very simple set of APIs And we also used the Pixel 2 and the 2XL And I mentioned the Huawei P20 phone Really, it could have been any Android devices These are just ones that were easily available and easy to work with The machine learning process– this is where I think Noah will walk through the second half of this webinar If you’re not familiar with machine learning, I highly recommend taking a look at “AI Adventures” from Yufeng Guo, “The 7 Steps of Machine Learning.” I’ve summarized the process we went through into four steps We first gathered and prepared the data Remember, machine learning is really about using examples– or in this case, examples are the data– in order to make predictions So we simulated some data And Noah will walk through that again Choosing a model– and as I mentioned before, we ended up with two models, one for object detection, and one for the commander, for making decisions on directions The third step is really the heart of it– the training, the evaluation, and the tuning of the 
model And then finally, this on-device prediction, actually being able to use these inexpensive, easily available, disconnected devices to make predictions So this is robot games And we looked at a lot of different games that we could have played, and we settled on these four games And they really are sort of a snapshot of chaser games We created these four games in particular The first was Bot Freeze Tag And this is where, working together, humans compete in a game of freeze tag against an individual AI robot An individual AI-controlled robot would touch each player, freeze them, and individual humans would have to run around and unfreeze each other Then we switched it around, and we created Human Freeze Tag, where there was an individual human trying to freeze the AI-controlled robots, and AI-controlled robots would unfreeze each other The third one is what we called Zombie Apocalypse And this was a little bit like the freeze tag, except in this case– actually, this is one of the most popular ones– when you’re tagged, the human becomes a part of the model that chases or pursues the other human players And then finally we played a traditional game of Capture the Flag And you scored based upon how quickly you’re able to capture the other flag by running over designated points in the game OK, so those were the architecture decisions we made, the game design And at this stage, I’m going to hand it over to Noah, who’s going to walk through model selection and training NOAH NEGREY: All right, sweet Let’s jump in So as Dave was saying, the first step is the data gathering and preparation process And so the key thing for us is, we didn’t have a large data set already of Sphero robots, with images and labels So we needed to generate that data set And going through, we could do it by hand, taking pictures, going back, manually labeling them But that would take forever So we wanted to generate our data set And so what we have here is, on that top image, 
this blank canvas image of our arena floor, and then a cropped image on the right And so we start placing these randomly around the image What you see on the bottom is these randomly placed objects And when we randomly place them with a simple Python script using the Pillow library, we also get the coordinates of where they are and the type that we placed So this is sort of our feature and label data set And so then, now that we’ve generated this data set, we can generate like 1,200 images and convert it to a TensorFlow record And so the idea behind that is that TensorFlow uses these TF records to quickly access data– as well as the labels during training– to help speed up the process So we’ll talk about the conversion script a little bit later, and how we used it
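The generation step Noah describes can be sketched in a few lines of Pillow. This is a minimal sketch, not the project’s actual script: the canvas, the sprite images, and the class names below are stand-ins for the real arena-floor photo and Sphero crops, and the box format is just one reasonable choice.

```python
import random
from PIL import Image

# Hypothetical sizes and labels -- the real arena photos and crops differ
CANVAS_SIZE = (640, 480)
SPRITE_SIZE = (32, 32)
CLASSES = ["sphero_blue", "sphero_red", "obstacle"]

def make_training_example(canvas, sprites, n_objects=4):
    """Paste sprites at random positions; return the image plus its labels."""
    image = canvas.copy()
    labels = []
    for _ in range(n_objects):
        name, sprite = random.choice(sprites)
        w, h = sprite.size
        x = random.randint(0, image.width - w)
        y = random.randint(0, image.height - h)
        image.paste(sprite, (x, y))
        # Because we placed the object ourselves, we get its class and
        # bounding box for free: (class, xmin, ymin, xmax, ymax)
        labels.append((name, x, y, x + w, y + h))
    return image, labels

# Stand-ins for the blank arena-floor image and the cropped object images
canvas = Image.new("RGB", CANVAS_SIZE, "white")
sprites = [(name, Image.new("RGB", SPRITE_SIZE, color))
           for name, color in zip(CLASSES, ["blue", "red", "gray"])]

image, labels = make_training_example(canvas, sprites)
```

Looping this a thousand-plus times yields the kind of automatically labeled data set described above, ready to be converted to TF records.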

But after that, we upload our data up to Google Cloud Storage so that we can store it Because this is 1,200 images– large, several gigabytes And so we need to be able to access it quickly And if we want to run multiple models at once, we want to make sure they can all access that data at the same time DAVE ELLIOTT: So two things I’ll say about that One is– this goes back to what I’ve mentioned before– we want to leverage automation And so these are virtual The 1,200 items that Noah mentioned, these are virtual items It could have been 12,000 It could have been 120,000 So we were trying to leverage automation as much as possible And the second thing is, it might make sense for us to explain what TensorFlow is since TensorFlow is at the heart of what we used here NOAH NEGREY: So TensorFlow is a machine learning framework So there’s a bunch of different frameworks out there– Scikit-learn, PyTorch– but TensorFlow is what we used, also because of TensorFlow Lite, which we’ll dig into later, that has a great use for embedded and mobile devices So that’s sort of a library in Python and other languages– Node.js [INAUDIBLE] JavaScript But let’s jump in now So we’re going to be talking about the two different models we had So we have the object detection model This is our understanding model At the beginning, we said we can see, understand, and react So our object detection model is where we do our understanding And we’re also going to be talking about our commander model in tandem here, about how we react to the environment So when we first started, we kind of looked around, leveraging the resources we had, to see what was out there that we could use And right from the beginning, TensorFlow has, on their GitHub, simple Android applications that go and provide, basically, a simple app that already does object detection with a model in there to do keyboard recognition, general person recognition, maybe like water bottle recognition, very simple things, and shows you how 
to run TensorFlow on an Android app So we grabbed that off of their GitHub repo, and just started with that But the problem with that was we didn’t get our custom sort of prediction of our Sphero robots from that data set that we generated So then we had to decide, hey, we need a custom sort of object detection model, and that– DAVE ELLIOTT: You’re saying that you have millions of pictures of Sphero robots– NOAH NEGREY: No, they didn’t– DAVE ELLIOTT: –already in the model NOAH NEGREY: Oddly enough, no So we had to build our own model But the nice thing was, we could take– TensorFlow has this thing, what they call the Object Detection API And that’s going to be that bottom resource, the MobileNets Open-Source Models for Efficient Vision On-Device And so what that is is basically these pre-trained models, frozen at checkpoints, that do object detection very well But what they don’t have is very specific object detection And so we can take our data set with that model, run it through training by modifying just some of the configuration files, and get a brand new model out that is now trained to do object detection on our own objects And the code for that can sound intimidating You’re like whoa, I don’t know how to do this object detection But let’s look at the next slide, and just see sort of what we have to modify And so what we take here is we see, modifying just a few lines in the configuration file of the input path and the label map path And what’s happening there is we’re just pointing to that TF record that we uploaded to Google Cloud Storage, and adding those paths in And then we run the scripts that we got as part of this model to run the training job So next, let’s talk about some of those decisions we made for the commander model That’s the reacting model And so the idea behind this model is we’ve got the bounding boxes that you see in that image there We’ve figured out where things are on the image We’ve understood our environment But now let’s make a 
decision And that’s where this model’s coming in, built more from scratch, to make a decision on what the robot should do And so this model is using reinforcement learning And this is a technique where you’re basically playing a bunch of games over and over, and then giving it a scoring system to see how well it did, and just learning from that scoring system Because we don’t really know what the optimal path is in every situation when the balls are actually out there moving around And so that’s how we used the commander model DAVE ELLIOTT: So we’ve got the object detection to see the world, to understand the world, these objects and the obstacles that are out there And then the commander model is the one that actually plays the game, the one that knows the rules of the game and makes the determination as to what the right path is in order for the [INAUDIBLE] NOAH NEGREY: Yeah, well, not quite It doesn’t really need to know the rules of the game It just needs to know– we can teach it the rules of the game later, with reinforcement learning, to play using just– DAVE ELLIOTT: It doesn’t know the rules of the game, but it knows what’s a successful path NOAH NEGREY: Yes We give it back that scoring while training We’re telling it what the correct path is So now the next part is actually training this model, and running it, and learning from it, and evaluating And so the key thing to that is we can take advantage of cloud resources, using Google Cloud Machine Learning Engine for training of both models And so the idea behind this is, when we’re training these models, we need a lot of compute resources And we also have to access the data And so this is the nice thing that Google offers, is we have all these resources and compute

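As a concrete sketch of the TF record format those cloud training jobs read, here is how one generated image and its labels might be serialized and parsed back. This is an illustration only: the feature names and the tiny schema below are hypothetical, simpler than the Object Detection API’s real schema, and it assumes TensorFlow 2.x is installed.

```python
import tensorflow as tf

# Hypothetical feature names -- the real pipeline follows the
# Object Detection API's schema, which carries more fields
def encode_example(image_bytes, xmin, ymin, xmax, ymax, label):
    feature = {
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "box": tf.train.Feature(
            float_list=tf.train.FloatList(value=[xmin, ymin, xmax, ymax])),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

# Write a tiny record file, then read it back the way a training job would
path = "demo.tfrecord"
with tf.io.TFRecordWriter(path) as writer:
    writer.write(encode_example(b"fake-jpeg-bytes", 0.1, 0.2, 0.3, 0.4, 1)
                 .SerializeToString())

schema = {
    "image/encoded": tf.io.FixedLenFeature([], tf.string),
    "box": tf.io.FixedLenFeature([4], tf.float32),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, schema))
parsed = next(iter(dataset))
```

Uploading a file like this to Cloud Storage then lets every training job stream the same examples, which is the access pattern described here.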
We can run many simulations at once, many training jobs, and be able to access that data out of the TensorFlow record And so the key thing to this is there are simple one-time setups for this cloud training stuff And you can follow the links to those GitHubs DAVE ELLIOTT: About how long did it take to do this? NOAH NEGREY: The simple one-time setup? Oh, 15, maybe 20 minutes And we’ll talk in the next slide about a resource we used that basically tells you how to do this whole process with a slightly more specific model, in about 30 minutes, for object detection DAVE ELLIOTT: OK, great NOAH NEGREY: So let’s look at that We’ve got our sample application that we just need to get a model for We’re going to train our custom object detection model And now we just need to sort of start that training job And so there’s actually three amazing resources out there, these three blog posts– Sara Robinson is the author of the first and third one, Dat Tran is the author of the second– that basically walk through these exact steps with very simple specific resources on doing object detection As you can see on the bottom one, it’s getting started in 30 minutes So you’re up and running with a mobile application, doing object detection on your phone in 30 minutes with a cloud training job, using Cloud TPUs– kind of amazing DAVE ELLIOTT: And this is another example of using resources that exist in the TensorFlow community and the broader community to be able to shorten the time to get to actually the gameplay itself NOAH NEGREY: Yeah, definitely So then what we talked about is our second model, the commander model, being run in simulation So what you see right there is our actual simulation happening So our model is using simulations so that we can run it faster, run multiple training jobs It’s less expensive We don’t need to actually run multiple games in the physical environment, keep logging the data, and then go back and sort of upload that, and train from that No, we 
can use this simulation data to get near realtime sort of what happens in the real world And the idea behind this is that the red ball is the one doing reinforcement learning So its sort of objective is to try to move toward that target point, the yellow ball, which is stationary And it’s trying to move around and get to that And then the blue ball is our chaser, sort of just doing a simple scripted thing to go directly after the red ball And the red ball is trying to learn from that And the idea of this is it’s a “target point” and “point to avoid” structure of initial model simulations, so that we can create our chaser and runner from the same model Our experimentation led to this “peak and valley” model, which is more efficient and sort of more realistic for our runner than other things we tried And so the idea behind this is, imagine that the blue ball is this peak of a mountain which moves And this yellow is this pit that we can either keep stationary or move around And red is sort of being pulled towards the yellow pit DAVE ELLIOTT: Kind of like gravity NOAH NEGREY: Yeah, gravity, sort of falling away from the blue ball And this is sort of a common-ish technique in robot pathfinding And to make this a little bit more realistic, we also added an aggressiveness rating And the idea behind this was we would sort of weight how much we wanted to fall away from blue, or sort of push more into yellow And so this allowed us to create two behaviors When used as the chaser, you can move the target point around to the nearest Sphero robot that you’re trying to chase after, and the nearest obstacle being one of those white blocks you want to avoid And when used as the runner, you can kind of keep this stationary general area you want to move towards, with the primary objective being to run away from the nearest blue ball or the nearest chasing robot DAVE ELLIOTT: And so this is really the heart of the commander model, which is really the heart of the engine that learned to 
play these games, right? NOAH NEGREY: Yeah DAVE ELLIOTT: And on top of that, this also points out again the example of– or the use of automation, that we want to be able to– if we’d have done this in the real world, you need lots and lots of examples to be able to train your models This would have taken weeks or months, and cost hundreds of thousands, maybe, to have the many, many iterations of this What you’ve done here is essentially simulate those games NOAH NEGREY: Yeah And one thing you might be thinking is like, OK, for chasing, why not just use a search algorithm like A* or some optimized search And that’s true You can totally do that We started with that in our prototype, and it works really well But then when we wanted to make that running away decision, you have to know where you want to go And that’s that optimal decision we don’t exactly know the answer to, which is where we decided to create this model and use it in both cases of chasing and running, which created some very interesting behaviors we saw out on the actual arena floor So let’s kind of do a little recap of all the different tools we used in this training, and sort of decision-making process So first up, going back to that beginning, Cloud Storage,

using Google Cloud Storage to hold our TensorFlow record of 1,200 images or so in our data set, and slowly increasing throughout time as we generate more data And then TensorFlow, that machine learning framework, again, that we’re using because the TensorFlow Object Detection API has all these vast resources out there to use, as well as TensorFlow Lite, which we’ll be talking about shortly Cloud Machine Learning Engine– we use that to run many simulation jobs, speed up our training using GPUs and TPUs And the cool thing with that is, by doing that, we can speed up our training to sort of cut down our training time, from running in about eight hours down to about an hour and a half to two hours with Cloud TPU, so that we can iterate multiple times in a day instead of just once a day DAVE ELLIOTT: That’s from GPU to TPU NOAH NEGREY: Yes DAVE ELLIOTT: And TPUs are Tensor Processing Units These are custom-built ASICs that are specifically designed to run in Google’s Cloud to do the types of math that’s required– the matrix type of math that’s required to do the training Training is the most intensive part of the machine learning process, at least from a compute perspective And so having these custom-built ASICs that really increase the speed and lower the cost of training is a really important part of being able to do this in a quick and inexpensive way NOAH NEGREY: Yeah And then we were able to use Google Compute Engine to run our simulations on that commander model, by running multiple simulations at once so we aren’t just focusing all our resources locally So next up, let’s talk about the on-device deployment So we have our models trained, selected, and we’re ready to use them We just need to put them into action And so that’s where we finally get to TensorFlow Lite And so TensorFlow Lite is a solution for running machine learning models on mobile and embedded devices And the reason for that is TensorFlow Lite has this cool quantization 
feature And the idea behind that is that, for these deep learning models, these models that are very large, you can shrink the model size by reducing the numerical precision of the weights And you’re trading off accuracy to get smaller models that run faster And this is really important in mobile And the reason for that is mobile applications are typically maybe 30 megabytes in size And if you add on a 30 megabyte model size, that’s a bad thing It doubles your size And I believe there are size requirements on these applications So by cutting them down to about 2 megabytes, you lose maybe a percent or two of accuracy, but what you get is a significantly faster model, cutting, potentially, your inference time– your prediction time, what it takes to do the object detection– in half, with a smaller model And so this enables machine learning inference with low latency and a small binary size, which can run on Android, iOS, and other operating systems And for converting to TensorFlow Lite, there are already existing scripts to take your TensorFlow model as is and convert it to TensorFlow Lite And so that’s again where we point to that blog post on training and serving a realtime mobile object detector in 30 minutes with Cloud TPUs It will help you run that conversion script DAVE ELLIOTT: And this is another one of the– as you mentioned before, one of the key reasons that we chose TensorFlow Not only is TensorFlow wildly popular, so there’s lots of resources– I think it was open-sourced in October of 2015, and at this stage, it’s by far the most popular open source machine learning library out there– but also the fact that TensorFlow Lite was designed for use cases just like this NOAH NEGREY: Yes Yeah, for mobile or embedded devices So now we kind of have this diagram flowing We have our training in the cloud, those resources that we went over, and then we’re converting it to TensorFlow Lite Now we just need to put it on the Android application And so this we’re going to be serving on-device and 
talking to our Sphero robots So we’re talking to the Sphero robots over Bluetooth with their SDK And so the idea of this model deployment is there’s sort of a loop happening What happens is the object detection model informs the commander model So when it goes and does the object detection, it’s understanding where things are and then sending that information out to the commander model The commander model is then taking that information and making a prediction It’s saying, which angle should I go, what direction And the key thing is, the Sphero robot’s SDK allows you to input an angle from 0 to 360 degrees And our commander model is making a decision in 20-degree increments So it’s saying, cool, go this way, go that way And so that information then gets sent to our robots And so then this loop again repeats We do the object detection, we observe– DAVE ELLIOTT: And the 20 degrees was just arbitrary– NOAH NEGREY: Yeah, relatively arbitrary We just wanted to slim it down because we didn’t quite need full 360-degree support 20-degree increments were a small enough step to get good behavior out DAVE ELLIOTT: Somebody else can do it in different increments if they need more precision NOAH NEGREY: Yeah DAVE ELLIOTT: Right, OK NOAH NEGREY: Again, part of the iterating and sort of process of trying something, seeing

how it works once it’s out in the world, and then bringing it back in And so in general, this whole loop of running object detection and commander model takes about 60 to 80 milliseconds That’s the inference time slash decision-making time And that is near-ish realtime to what’s happening on the arena, where, as the balls are moving, we understand where they are, as we saw in that initial video– where they’re moving around, and then making a decision based on that information DAVE ELLIOTT: So that’s the seeing, the understanding, and the reacting, 60 to 80 milliseconds NOAH NEGREY: Yeah, all of them DAVE ELLIOTT: Which again, seems like it would be fantastic in the physical world If it’s an online game or something like that, that might be problematic But here, it’s actually seeing the direction and the obstacles, and making determinations in 60-80 milliseconds NOAH NEGREY: Yeah So now we can see that whole diagram coming together But then the last piece that’s missing is that multiplayer aspect that we said, where people– DAVE ELLIOTT: The humans NOAH NEGREY: –yeah, the humans are coming in and interacting, and playing with these AI robots So that’s where we added this second part using Firebase And Firebase is Google’s sort of mobile platform for cloud And the idea behind this is we can have our devices communicating And we use the Firebase Realtime Database And the idea behind this is that that realtime database is holding our game state and information of who is “it,” who is allowed to move, if you got tagged and are frozen, or if you turned into a zombie And so we’re using that realtime database with low latency to be able to update that game state as things are happening, so that the central phone running the object detection and commander model is actually also running sort of the general game state, and sending information off to Firebase And the other devices that the humans walk up to and start playing with to interact with our robots get that update in 
realtime DAVE ELLIOTT: And in this case, the human players are using phones as well NOAH NEGREY: Yes DAVE ELLIOTT: With a simple app on it And they could’ve been using joysticks or any other input device And I guess it probably makes sense to leave this up just for a minute longer Because this is really the snapshot of the whole system On the far left side, you’ve got the training in the cloud that I mentioned before as one of the key tenets of what we wanted to do You can see the storage and the training using TensorFlow, Google Cloud Storage, Compute Engine, TPUs The conversion– that happens in the middle, TensorFlow and TensorFlow Lite And then off to the right side, serving on-device with both of our models– the object detection, the seeing, and the commander models– really the right model for the right job at hand And then the Sphero robots themselves, that are both being seen by object detection and being controlled through the commander model And then the bottom being just the gameplay, the human interaction component of it NOAH NEGREY: Yeah, exactly DAVE ELLIOTT: Great NOAH NEGREY: So let’s jump back to that initial video and sort of walk through it again now that we have the whole picture So the idea behind this is I am the green ball, moving around I’m about to go tag the AI bots that are controlled elsewhere And we’re seeing that object detection happen in realtime And so I’ve tagged orange, pink, and red And blue is still moving around and making the decision to go untag its frozen teammates and start running away again, and then, as it loops around that top block, to go back and untag the other teammates DAVE ELLIOTT: And the white boxes are just obstacles that are [INAUDIBLE] scene, you try to avoid? 
NOAH NEGREY: Trying to add a challenging factor to the game DAVE ELLIOTT: All right So to summarize, we built this system this year It wasn’t a huge investment in terms of time or resources But it was a really fun project See, understand, and react to the real world, leverage existing resources to be able to build this quickly The key things are you’re able to train a complex model in the cloud Leverage all the benefits of Cloud– since we’re wearing the T-shirts here– but then deploy them in the real world, deploy them, in this case, on just sort of fun Sphero robots to play these fun little games But again, it could be a robot in a warehouse, or a manufacturing plant, or in security, or a whole host of other use cases The other thing I’ll point out, in summary, is that all of these tools are available today TensorFlow and TensorFlow Lite are open source Cloud Machine Learning Engine and GCS in the cloud are easy to use and relatively inexpensive You only pay for what you use In this case, part of it is the training And all of these examples, all this sample code and these blog posts are really readily available You can go out and use them today, build them, play with them over the holidays, and then modify them, and build something even better NOAH NEGREY: Yeah And as we pointed out, all of the resources,

specifically the source code on that link we put at the beginning, of the GitHub, Next 18 AI in Motion, has all the source code, links to blog posts that go even more in-depth, as well as the other resources that we used to build this So definitely go and check it out DAVE ELLIOTT: OK, great So please stay tuned for live Q & A You can still enter those questions, and we’ll be back in less than a minute to answer your questions OK, welcome back It looks like we’ve got some questions from the live audience Let’s go ahead and start with the first one Can you explain why you used TensorFlow Lite? So going back to what we mentioned before, really it’s a question of, I think, first, why we want to use an embedded solution, a mobile solution That really goes down to, again, I think, a really unique and interesting emerging use case We’re still really early in the game As compute resources continue to become less expensive, as these mobile devices become more and more powerful, we’ll be able to do more and more of the inference, more and more of the prediction on these devices And I think that’s the first question, is that there are really interesting use cases that TensorFlow Lite can address And in terms of why we used TensorFlow Lite, I think it’s fair to say– well, Noah, you made that decision NOAH NEGREY: Yeah, so I think it goes back to the– you can run TensorFlow and TensorFlow Lite on Android But the key difference with TensorFlow Lite is you get that quantization step, which speeds up and shrinks your model size without losing much accuracy And so by doing that, you have your low-latency fast model without even having to go up to the cloud So you’ve got your trained model on-device, so you’re not sending any data out to the cloud, and you’re not incurring any network latency or usage cost So that’s sort of the main key thing there, is that you’re keeping it locally on-device for faster things, and you’re not costing your users data when they’re running 
it DAVE ELLIOTT: I think that's fair I would also add that, generally speaking, anything you do in TensorFlow now is becoming easier and easier because there's such a robust ecosystem There's so much information out there that can help you do what we've done, which is deliver this pretty quickly Second question– what were the biggest obstacles in doing this? Where did you hit your roadblocks? Ooh, we hit a fair number of roadblocks NOAH NEGREY: Well, I'd say the first big obstacle with machine learning in general is being overwhelmed, either by the task and the problem itself or by the amount of resources out there that you have to sift through I think that's the initial burden, of like, oh, wow, this is a huge space It's daunting At least it was for me when we started this But by going out, digging through, and finding these different resources, you can see, oh, wow, they do a really good job of explaining, walking you through the steps, and so forth I think the biggest roadblock we really hit was actually with our commander model When we deployed it– we had our simulations, they looked great, but some of the behaviors that we then deployed, we learned, oh dear, this doesn't really work in the real world, or it learns a bad behavior So with the poor reward system in our initial commander

models, the Sphero robot just learned, hey, if I move to the wall, stay there, and don't move, that's the best decision I can make, because of the reward our system [INTERPOSING VOICES] was giving it back We gave it a poor reward system where that was the best score it could achieve And so by modifying that, and moving to that target-and-point approach, with the mountain falling away and gravity, we got a better simulation, which then also worked really well in the real world DAVE ELLIOTT: And an interesting thing on this is that some of the biggest obstacles you might hit are, ironically, the physical parts– the mat, the walls, the camera being suspended over the arena None of it is terribly difficult, but it does take some effort to set up an environment like that Initially, we used 2 by 4's and LEGO bricks to hold the camera in place And ultimately, what you saw in the images earlier was put together by a professional interactive agency and hardened so it could go around to trade shows But the physical development was one of the interesting challenges NOAH NEGREY: But the videos were done with our prototype– 2 by 4's, we went to Home Depot, bought some plywood, put it together, done DAVE ELLIOTT: $50 or less worth of equipment All right, can you apply this to other common scenarios aside from gaming?
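To make the reward-system discussion above concrete, here is a minimal sketch of the difference between a sparse reward (under which "park by the wall and do nothing" is optimal) and a distance-shaped reward that keeps pulling the agent toward a target point, in the spirit of the "mountain falling away" fix. All names, coordinates, and penalty values here are hypothetical illustration, not the project's actual code.

```python
# Sketch (not the project's code) of the reward-shaping idea:
# a sparse reward makes idling by the wall a best strategy,
# while a distance-based reward makes moving closer strictly better.
import math

def sparse_reward(pos, target, collided):
    # Poor reward: only collisions are penalized, so sitting still
    # anywhere scores 0 forever -- a degenerate optimum.
    return -1.0 if collided else 0.0

def shaped_reward(pos, target, collided):
    # Shaped reward: score falls off with distance from the target,
    # like height on a slope falling away toward the target point.
    if collided:
        return -1.0
    return -math.dist(pos, target)

# Under the sparse reward, moving closer gains nothing...
print(sparse_reward((5, 5), (0, 0), False))  # 0.0
print(sparse_reward((1, 1), (0, 0), False))  # 0.0
# ...but under the shaped reward, closer is strictly better.
print(shaped_reward((5, 5), (0, 0), False) < shaped_reward((1, 1), (0, 0), False))  # True
```

In the real project a reward like this drove the reinforcement-learning simulation; the point of the sketch is only that the optimum of the shaped function moves off the wall and onto the target.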
Yes, gaming is fun But as I mentioned a few times, to me, one of the most evident use cases is in a warehouse, in inventory management You can set up simple cameras to look at a warehouse, to track things as they move around, to see what's happening in that warehouse And then to understand that, oh, we're short on this, we've got an overabundance of that, and then to make decisions based upon what it sees and take corrective action And then, ultimately, through maybe warehouse robotics, to take that action– go and automatically order a resupply, have it delivered, and restock those shelves That's a simple real-world example NOAH NEGREY: On top of that, you can also do defect detection So if you have an assembly line going by, and you're checking your products for defects, you can monitor that as well with this same technology We've done something similar, using the same object detection resources we had, to detect if there was an error in, say, one of the objects moving by, and say, hey, this is a good object, this is a bad one And then you can use a second model to make a decision and react to that– either pull that object aside, or raise an alert and have one of your staff come up, grab it, move it, and do some manual inspection from there So there's a lot you can do It's just a matter of coming up with the idea of what to use this object detection for DAVE ELLIOTT: Well, I talk a lot about machine learning And it's really gotten to the stage where you need to think of the world this way– you have systems now that can understand the world, whether it's understanding somebody speaking or understanding what's happening in a video or in physical pictures It can see the world and understand what's happening in it And once you realize that you can take that and put it on disconnected mobile devices, I think the litany of
things you can do really becomes interesting And again, we're really early in this, and this is just an easy way of showing how it can be done I did want to touch on the last question, which is, can you review the GitHub repositories? So those should be at the bottom of what you're looking at on your screen I think we included just about all of the links, including the Google Cloud Platform AI Next– NOAH NEGREY: The repository DAVE ELLIOTT: –link NOAH NEGREY: That's the main one, and it will at least start getting you out to the other ones that we used DAVE ELLIOTT: So hopefully we've answered most of your questions If we didn't, we've got people who are directly answering them online So at this stage, we're going to go ahead and wrap up Stay tuned for the next session, Building Chatbots with Dialogflow I've been looking forward to this one, because we're out in Mountain View, but it's live from New York with Whitney, Building Chatbots with Dialogflow Thanks for joining us [MUSIC PLAYING]