Welcome to Introduction to On-Device AI, built in partnership with Qualcomm and taught by Krishna Sridhar. A modest smartphone may have 10 to 30 teraflops of compute power. When you take a picture, that smartphone may be running dozens of AI models simultaneously for real-time semantic segmentation and scene understanding. In this course, you'll learn how to create AI applications that run on-device. These techniques apply not only to making your app run on the roughly 7 billion smartphones out there, but also on potentially billions of other devices, including cameras, robots, drones, VR headsets, and many more. Despite the differences in hardware and operating systems among all of these devices, the key technical steps for deploying on-device are actually quite similar across many of them.

Given a model that has already been trained, perhaps in the cloud, the first step in deploying it on-device is model conversion, which means converting your model from, say, PyTorch or TensorFlow into a format compatible with the on-device runtime. In this step, the model is frozen into a neural network graph, which is then compiled into an executable for the device.

Smartphones and other edge devices often contain a mix of processing units, including CPUs, GPUs, and neural processing units, or NPUs. Knowing the exact devices your app runs on allows optimizations that can dramatically enhance performance, sometimes making models run up to ten times faster. You'll also learn tools to help you accomplish this across many different devices. This is important because there are a lot of different smartphone brands and models, and your mobile app may run on over 300 different smartphone types. It's also important to ensure that your model performs consistently across these many devices. This might mean validating numerical correctness on-device across a broad range of hardware to prevent cases where a model operates correctly on one device but not on another due to hardware differences. You'll learn how to do all of this.

Lastly, quantization is also a common step when running models on-device. As you'll see in the real-time segmentation app in this course, quantization can make your app run several times faster while also producing a much smaller model. In our case, it was about four times faster with a roughly four times smaller model size.

Our instructor, Krishna Sridhar, is senior director of engineering at Qualcomm. He's been doing on-device AI for about a decade and has built critical on-device deployment infrastructure that might well be running on your smartphone right now. Krishna has directly helped deploy over a thousand models on devices, and over 100,000 applications have used the technology he and his team have built.

Thanks, Andrew. In this course, you'll first learn how to deploy an on-device model in order to reduce latency, improve privacy, and improve efficiency. You will deploy your first model on-device with just a few lines of code; the model will do real-time segmentation from your camera stream. You will learn four key concepts as part of this course: first, how to capture your model as a graph that is portable and runnable on a device; second, how that graph is compiled for a specific device; third, how the model is hardware-accelerated so it runs efficiently on-device; and fourth, how to validate the model for numerical correctness on-device.
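To make the graph-capture and conversion idea concrete, here is a minimal sketch using plain PyTorch and ONNX. This illustrates the general concept rather than the specific tooling used in the course; the small torchvision classifier and the file name "model.onnx" are stand-ins chosen for the example.

```python
import torch
import torchvision

# Load a small pretrained model in evaluation mode as a stand-in for the
# course's real-time segmentation network.
model = torchvision.models.mobilenet_v3_small(weights="DEFAULT").eval()

# Graph capture: trace the model with a representative input so it is
# frozen into a static neural network graph.
example_input = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_input)

# Conversion: export the frozen graph to ONNX, a common interchange format
# that an on-device runtime can then compile into an executable for the
# target hardware (CPU, GPU, or NPU).
torch.onnx.export(traced_model, example_input, "model.onnx", opset_version=17)
```

In practice, the compilation and hardware-acceleration steps are handled by device-specific toolchains, which take a captured graph like this one as their input.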
Finally, you will learn how to quantize a model so you can improve performance by nearly 4x while also reducing the model's footprint. Lastly, we will integrate this model into an Android application that you can play around with. Many people have worked to create this course. From Qualcomm, I'd like to thank Kory Watson, Gustav Larsson, and Siddhika Nevrekar. Esmaeil Gargari and Geoff Ladwig from DeepLearning.AI also contributed to this course. On-device deployment of AI models is taking off and opens up a lot of exciting capabilities for building AI systems. Let's go on to the next video to get started.
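As a rough illustration of where the roughly 4x smaller footprint from quantization comes from, here is a minimal sketch using PyTorch's post-training dynamic quantization. The tiny model and temporary file name below are hypothetical stand-ins, and the course's quantization workflow may use different techniques and tools.

```python
import os
import torch
import torch.nn as nn

# A tiny stand-in model; the course's segmentation network is much larger.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# Post-training dynamic quantization: Linear weights are stored as int8
# instead of float32, which is where most of the ~4x size reduction comes from.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    # Serialize to disk to compare the storage footprint before and after.
    torch.save(m.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return size

print(f"float32 model: {size_mb(model):.2f} MB")
print(f"int8 model:    {size_mb(quantized):.2f} MB")
```

Speedups on a real device also depend on whether the target processor, such as an NPU, has fast integer arithmetic, which is why the hardware-aware compilation step matters.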