for LLM Applications, built in partnership with WideApps. When building an LLM-powered app, you often want to use metrics to ensure it can handle inappropriate outputs and to ensure the quality and safety of its outputs. What I've seen in many countries is that the LLM app proof-of-concept can be quick to build. Maybe you can throw something together in days or weeks, but the process of then understanding if it's safe to deploy, then Hoseup is getting into actual usage. This short course goes over the most common ways an LLM application can go wrong. You hear about prompt injections, hallucinations, data leakage, and toxicity, plus tools to mitigate the risk. I'm delighted to introduce the instructor for this course, Bernice Rundin, who is Senior Data Scientist at YNABS. Bernice has worked for the last six years on evaluation and metrics for AI systems, and I've had the pleasure of collaborating with her a few times already, since YLabs is a portfolio company of my team AI fund. Thanks, Andrew. I've been seeing a lot of LLM safety and quality issues across a lot of companies, and I'm excited to share best practices from the field. In this course, you learn to look for data leakage, where personal information such as names and email addresses might appear in either the input prompts or the output responses of Yellow. You also learn to detect prompt injections, where a prompt attempts to get an LLM to output a response that it is supposed to refuse, for example, reviewing instructions for causing harm. One such method that you use is an implicit toxicity model. Implicit toxicity models go beyond identifying toxic words and can detect more subtle forms of toxicity, where the words may sound innocent, but the meaning is not. You also identify when responses are more likely to be hallucinations using the self-check GPT framework, which trumps NLM multiple times, to check for consistency to determine if it's really confident about something it's saying. Bernice will go through how to detect, measure, and mitigate these issues using open-source Python packages, lang codes, and ylogs, as well as some HuggingFix tools. Practitioners and researchers have been experimenting with countless LLM applications that could benefit society, but measuring how well the system works is a necessary step to the development process. In fact, even after a system is deployed, ensuring quality and safety of your AI application will continue to be an ongoing process. Ensuring your system works long-term requires techniques that work at scale, and in this course, you'll see some of these techniques that will make LLM-powered apps safer. Many people have worked to make this course possible. I'd like to thank, on the YLAB side, Maria Karayanova, Kelsey O'Neill, Felipe Adachi, and Alicia Bicznek. From DeepLearning.ai, Eli Hsu and Diala Ezzedine have also contributed to this course. The first lesson will give you a hands-on overview of methods and tools that you'll see throughout the course to help you detect data leakage, jailbreaks, and hallucinations. That sounds great. Let's go on to the next video and get started.