sparse and dense search. We'll cover techniques for implementing both and discuss the advantages and disadvantages of each. We'll then introduce the practical and popular methodology of combining the two using hybrid search. Hybrid search lets you make the most of both search techniques by fusing their ranked results. Let's roll.

So, let's see what the difference is between dense search and sparse search, and why you would want one over the other. Dense search uses a vector embedding representation of the data to perform the search, so it relies on the meaning of the data in order to run the query. For example, if we search for baby dogs, we may get back content about puppies. However, this has its limitations. If the model we are using was trained on a completely different domain, the accuracy of our queries would be rather poor. It's very much like going to a doctor and asking them how to fix a car engine; the doctor probably wouldn't have a good answer for you. Another example is when we're dealing with things like serial numbers, seemingly random strings of text. There isn't a lot of meaning in a code like BB43300, so if you ask a semantic engine to find content matching it, you won't get high-quality results back. This is why, in situations like this, we need to go in a different direction and reach for keyword search, also known as sparse search.

Sparse search lets you use keyword matching across all of your content. One example is bag of words. The idea behind bag of words is that for every passage of text in your data, you grab all the words and keep adding them to your table of available words, expanding the vocabulary as you go, just like you see below. In this case, we can see that words like "extremely" and "cute" appear once in this sentence, while the word "eat" appears twice. That's how we construct the sparse embedding for this object (you'll find a small code sketch of this idea right after this overview). It's called a sparse embedding because, across all of your data, there are so many words that the vector representing any one passage has a slot for counting each individual word, but in reality a passage might contain maybe 1% of the available words. So you'd have a lot of zeros in your data.

A good example of a keyword-based algorithm is Best Matching 25, also known as BM25, and it performs really well when searching across many, many keywords. The idea behind it is that it counts the occurrences of the words in the phrase you pass in; words that appear often across the data are weighted as less important when a match occurs, while matches on rare words score a lot higher. And as you see in this example, the sentence we provided at the bottom results in quite a lot of zeros, which is why we call it sparse vector search.

But we don't have to choose between one or the other. This is where hybrid search comes in. Hybrid search is basically a way of running both a sparse and a dense vector search in one query. Each one gives us its own scores and results, and then we can combine those scores into a single score, re-rank all of our results, and return them to the user. So, let's see how all of this works in action in our code. We are going to use exactly the same data as in the previous lesson.
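Before the walkthrough, here is a minimal sketch of the bag-of-words idea described above. The example sentences and the helper name bag_of_words are made up for illustration; they are not the lesson's actual data.

```python
from collections import Counter

# Tiny illustrative corpus: each string stands in for one passage of text.
docs = [
    "extremely cute puppies eat and eat all day",
    "a crocodile is a large reptile that lives near rivers",
]

# The vocabulary is every distinct word seen across all documents.
vocab = sorted({word for doc in docs for word in doc.lower().split()})

def bag_of_words(text):
    """Return a sparse vector: one count per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts.get(word, 0) for word in vocab]

print(vocab)
print(bag_of_words(docs[0]))
# "eat" gets a count of 2, "extremely" and "cute" get 1,
# and every slot for a word from the other document stays at 0.
```

With a realistic corpus the vocabulary grows to thousands of words, so almost every slot in a given passage's vector is zero, which is exactly why these are called sparse vectors.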
So, I'm not going to go over it too much. Let's just quickly load the data, create a new instance of Weaviate, create our collection, import our data, and we are good to go.

Let's start with a query you're already familiar with: we use with near text and search for the concept animal. We can see that semantically we match things like mammals and crocodile, and, well, an exact match on animal itself.

Now, let's try to run the same query, but with a keyword search. I am going to add with BM25, and our query again is animal. Let's see what kind of results we get. You see that this time we only get one object back, which matches exactly on the keyword animal.

And now we're getting to the juicy part, where we can actually execute the hybrid search. So, let's go with hybrid, our query again is animal, and we have this special parameter called alpha, which basically tells the engine which method to favor: alpha closer to one means we are favoring the scores from the dense vector search, while alpha closer to zero means we are favoring the keyword search. Let's run this. Now we can see that we're getting very similar results as before, but the interesting thing is that the object containing the keyword animal is ranked as the top result, which brings it all the way up to our attention so we can return it to our users.

Now, let's grab this again and try one more time with a different alpha. If I set alpha to zero, then we should get only the results driven by the keyword search, so none of the pure vector search results are returned. And if we try the same thing with alpha set to, let's say, one, this will basically be a pure dense vector search.

And voila, this is the power of dense and sparse vector search, and of combining both of them through hybrid search. In the next lesson, we'll dive into multilingual search and also Retrieval Augmented Generation.
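For reference, here is a minimal sketch of what the three queries in this walkthrough might look like with the Weaviate Python client (v3-style fluent API). The collection name MyCollection, the property text, and the local endpoint are placeholder assumptions, not necessarily the names used in the lesson.

```python
import weaviate

# Assumes a locally running Weaviate instance; adjust the URL for your setup.
client = weaviate.Client("http://localhost:8080")

# 1. Dense (semantic) search: match on meaning via nearText.
dense = (
    client.query.get("MyCollection", ["text"])
    .with_near_text({"concepts": ["animal"]})
    .with_limit(3)
    .do()
)

# 2. Sparse (keyword) search: exact keyword matching via BM25.
sparse = (
    client.query.get("MyCollection", ["text"])
    .with_bm25(query="animal")
    .with_limit(3)
    .do()
)

# 3. Hybrid search: fuse both rankings; alpha=1 is pure dense, alpha=0 is pure keyword.
hybrid = (
    client.query.get("MyCollection", ["text"])
    .with_hybrid(query="animal", alpha=0.5)
    .with_limit(3)
    .do()
)

print(dense, sparse, hybrid)
```

Setting alpha to 0 or 1 in the hybrid query reproduces the two experiments at the end of the walkthrough: pure keyword search and pure dense vector search, respectively.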