Welcome to this short course, Building Multi-Modal Search and RAG, built in partnership with Weaviate. RAG, or Retrieval Augmented Generation, systems provide an LLM with context that includes information about your proprietary data and ask the LLM to use that context when generating its response. A common way to build RAG applications uses a vector database to store your text data with embeddings. Then, given a query, you retrieve relevant information from the vector database and add that as text context to your prompt. But what if the context you want includes an image from a presentation, or an audio clip, or maybe even a video? This course teaches you the technical details behind implementing RAG with such multi-modal data.

The first step is to find a way to compute embeddings so that data on related topics embeds similarly, independently of modality. For example, a text about a lion, an image that shows a lion, and a video or audio clip of a lion roaring should all be embedded close to each other, so that a query about lions can retrieve all of this data. In other words, we want the embedding of concepts to be modality independent. You'll learn how this is done, through a process called contrastive learning, in the next video. After building such a multi-modal retrieval model, you're going to use it to retrieve the context related to a user's query, so you can then build a multi-modal search app where the image of a lion can be used to retrieve video, audio, and text related to that image. Then, if you have a generative model that supports multi-modal inputs, you can take the retrieved results as context and provide them to the model, asking it to respond to the query based on the relevant multi-modal contextual information.

I am thrilled that the instructor for this course, Sebastian, is here to explain how multi-modal apps work under the hood. Sebastian is Head of Developer Relations at Weaviate, and he's an expert in vector databases who has worked in developer relations for over a decade. In fact, his full-time job is to help developers like you build successfully with vector databases.

Thanks, Andrew. I'm really excited to work with you on this course. In this course, you'll first learn how to teach a computer the concept of understanding multi-modal data. Then, you'll build text-to-any as well as any-to-any search. In the next step, you'll learn how to combine language and multi-modal models into vision language models that understand images as well as text. Next, you'll focus on multi-modal RAG, by mixing multi-modal search together with multi-modal generation and reasoning. And as a final step, you'll learn how multi-modality is used in industry, by implementing different real-life examples that include analyzing invoices and flowcharts.

Many people have worked to create this course. I'd like to thank Zain Hasan from Weaviate, as well as Esmaeil Gargari from DeepLearning.AI, who contributed to this course. So, lots of exciting topics. Let's go on to the next video to get started.
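To make the modality-independent retrieval idea concrete, here is a minimal sketch in plain Python of the search step described above. It uses hand-written toy vectors standing in for the output of a shared multi-modal embedding model; the store layout, vector values, and function names are all illustrative assumptions, not any real library's API.

```python
import math

# Toy multi-modal store: each item keeps its raw content, its modality,
# and an embedding. In a real app, one shared multi-modal model (trained
# with contrastive learning) would produce these vectors; the numbers
# below are made up for illustration.
store = [
    {"modality": "text",  "content": "Lions live in prides.", "embedding": [0.9, 0.1, 0.0]},
    {"modality": "image", "content": "lion.jpg",              "embedding": [0.8, 0.2, 0.1]},
    {"modality": "audio", "content": "lion_roar.wav",         "embedding": [0.7, 0.3, 0.0]},
    {"modality": "text",  "content": "A recipe for pancakes.","embedding": [0.0, 0.2, 0.9]},
]

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, items, top_k=3):
    """Return the top_k items closest to the query, regardless of modality."""
    ranked = sorted(
        items,
        key=lambda it: cosine_similarity(query_embedding, it["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]

# A query about lions, already embedded into the same shared space.
query_embedding = [0.85, 0.15, 0.05]

# Because text, image, and audio about lions all sit close together in
# the shared space, a single query retrieves context across modalities.
for item in retrieve(query_embedding, store):
    print(item["modality"], "->", item["content"])
```

In a full multi-modal RAG app, the retrieved text, image, and audio would then be passed, along with the user's question, to a generative model that accepts multi-modal inputs, which is the pattern the rest of the course builds up step by step.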