database, and discuss how you can use it to perform semantic search and how it supports CRUD operations: Create, Read, Update, and Delete. You'll also inspect the objects and vectors stored in the database. This lesson covers the fundamentals of getting started with a vector database, plus some more advanced topics such as filtered search. Let's have some fun.
In this project, we'll use a sample dataset that contains a set of Jeopardy questions. Each entry has a category, a question, and an answer. We'll load the data into a vector database and run semantic search queries on it.
Just like in the previous lesson, we'll set up an embedded instance of Weaviate. One difference is that this time we'll use OpenAI to generate our vector embeddings, and for this we need to load an OpenAI API key. If you are running this project at home in your own environment, you'll probably have to replace this with your own API key, but for the purposes of this tutorial you can keep it as it is. And don't worry if you see these kinds of warnings; they are purely informational, and everything works just fine.
If you're curious what is available inside an embedded instance, Weaviate offers a modular system that allows you to use something like generative search with OpenAI, or to run text vectorization with Cohere, Hugging Face, or OpenAI. That is the power behind it: you can skip manual vectorization and let the database take care of it for you. This is what I want to show you in the next few steps.
Just like in the previous lesson, we start by creating a new collection. We'll call it Question, and this time we'll use the text2vec-openai vectorizer.
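A minimal sketch of the collection definition described above, assuming the Weaviate Python client v3 API, where a collection is described by a plain dictionary and created with `client.schema.create_class`. The lesson's notebook code is not reproduced here: the helper name `create_question_collection` is hypothetical, and `client` stands for an already-connected Weaviate client.

```python
# Collection definition for the lesson's Question collection.
# Assumption: Weaviate Python client v3 schema format.
QUESTION_CLASS = {
    "class": "Question",              # collection name
    "vectorizer": "text2vec-openai",  # let Weaviate ask OpenAI for embeddings
}


def create_question_collection(client, class_obj=QUESTION_CLASS):
    """Create the collection on an already-connected client (hypothetical helper)."""
    client.schema.create_class(class_obj)
    return class_obj["class"]
```

Because the vectorizer is declared on the collection, later import and query code does not need to supply vectors itself.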
This is a pretty powerful module: it automatically generates vector embeddings as you import the data, and at query time the vector database takes the query input and sends it to OpenAI to vectorize. As a reminder, let's print one of the data objects so that we know its structure.
Then we can take the data objects and import them into our Question collection, and this is what we're going to do here. We import the data in batches of five. For each object, we print a message saying that we're importing this question, construct an object with the answer, question, and category, and pass it to the database with the collection name Question. Notice that we are not passing a vector embedding this time, because that is the job of the text2vec-openai module: it generates a vector embedding for every object. So, if you run this, the import and the vectorization complete.
To verify, we can run a quick aggregate query on the Question collection, and we can see that we do have 10 objects inside. Now, let's grab one of the objects from our Question collection and see what category, question, and answer it has. But more importantly, let's look at the vector embedding that was generated for that specific object. If we run this, we can see the whole vector embedding, which is pretty long. It should have about 1,500 dimensions.
Now, let's run a vector query using semantic search. We'll use the with_near_text operator and pass in our query as concepts; the query itself is "biology", so that's what we're looking for. To add some extra info, let's also display an additional property, the distance. If we run this query, we should get two objects that match the query for biology. And this is our result.
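The import and query steps above can be sketched as follows, assuming the Weaviate Python client v3 API (`client.batch.configure`, `batch.add_data_object`, and the `client.query.get(...)` builder). The helper names, the capitalized keys of the raw dataset entries, and the default limit of 2 are assumptions made for illustration, not the lesson's exact code.

```python
def to_properties(item):
    """Map one raw dataset entry to the properties stored in the collection.

    Assumption: raw entries use the keys "Answer", "Question", "Category".
    """
    return {
        "answer": item["Answer"],
        "question": item["Question"],
        "category": item["Category"],
    }


def import_questions(client, data, batch_size=5):
    """Import entries in batches; no vector is passed, the vectorizer creates it."""
    with client.batch.configure(batch_size=batch_size) as batch:
        for item in data:
            batch.add_data_object(
                data_object=to_properties(item),
                class_name="Question",
            )


def near_text_args(concept, max_distance=None):
    """Build the near-text argument; `distance` is an optional upper bound."""
    args = {"concepts": [concept]}
    if max_distance is not None:
        args["distance"] = max_distance
    return args


def search_questions(client, concept, limit=2, max_distance=None):
    """Run a semantic search and ask for each hit's distance to the query."""
    return (
        client.query
        .get("Question", ["question", "answer", "category"])
        .with_near_text(near_text_args(concept, max_distance))
        .with_limit(limit)
        .with_additional(["distance"])
        .do()
    )
```

With a connected client, `search_questions(client, "biology")` would correspond to the biology query in this lesson. The optional `max_distance` argument maps to the near-text `distance` key, which rejects results farther away than the given value.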
Since the model uses cosine distance, smaller numbers indicate better matches. In this case, distances of 0.19 and 0.2 indicate a pretty strong match to our biology query. We can also run a query that returns all the objects in the database and look at all the distances. As you go down the list, the distances increase: with a cosine distance metric, the best matches are at the top and the worst ones are at the bottom.
Now, let's run the same query again. The thing is that we don't always know how many objects are good matches, so one thing we can do is accept anything within a specific distance. This time, I can say that the distance should be at most 0.24, and anything above that distance should be rejected. This is a really good way of saying: I have certain requirements for the quality of my results, and anything beyond that should be ignored. As you can see here, the final result was cut off at 0.23.
Since we are working with a vector database, we can also perform the CRUD operations: create, read, update, and delete. To create a single object, all we need to do is call client.data_object.create, pass in the data object, and provide the name of the collection to insert it into. Again, the text2vec-openai module generates the vector embedding for this object. So, let's add the object; now we can print its UUID.
Now, let's see a read example. We read the object that we just created in the previous block by fetching it by its object ID, and if we print it, this is our object.
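A minimal sketch of the create and read steps, assuming the Weaviate Python client v3 calls `client.data_object.create` and `client.data_object.get_by_id`. The helper names are hypothetical. In the sample object, only the answer "Italy" comes from the lesson; the question text and category are made up for illustration.

```python
def create_question(client, properties, class_name="Question"):
    """Insert one object and return its UUID; the vectorizer creates the embedding."""
    return client.data_object.create(
        data_object=properties,
        class_name=class_name,
    )


def read_question(client, uuid, class_name="Question"):
    """Fetch a single object by its UUID."""
    return client.data_object.get_by_id(uuid, class_name=class_name)


# Illustrative object: the answer is from the lesson, the rest is invented.
SAMPLE_OBJECT = {
    "question": "Which country is this sample question about?",
    "answer": "Italy",
    "category": "Sample",
}
```

With a connected client, `uuid = create_question(client, SAMPLE_OBJECT)` followed by `read_question(client, uuid)` mirrors the two steps described above.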
If you're curious to see the vector embedding that was generated for it, all we have to do is add with_vector=True; running that gives us the object with all its information and its vector embedding.
Now, let's grab that object and update it. Previously the answer was set to just Italy, so let's set it to Florence in Italy. If we run this, the object gets updated, and then we can fetch it again by its ID and see that the answer did indeed get updated.
Finally, we get to the stage where we want to delete our sample object. First, we check how many objects we have before the deletion. Then we delete the object based on its ID, and finally we print the aggregate again to verify that we have one object fewer: before we had 11, and now we're back to 10.
This concludes the lesson. You learned how to use a vector database to automatically vectorize all your data with OpenAI, and how to use the same mechanism to vectorize your queries and perform various searches, including vector search and filtered search. We also went over the CRUD operations, so that you can maintain your data throughout the life cycle of your applications. In the next lesson, we'll introduce the concept of sparse and dense vectors, and we'll also look at hybrid search, which combines both methods to provide better results.
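As a recap of the update, delete, and count steps in this lesson, here is a minimal sketch assuming the Weaviate Python client v3 calls `client.data_object.update`, `client.data_object.delete`, and `client.query.aggregate(...).with_meta_count()`. The helper names are hypothetical, and `parse_count` assumes the usual shape of an aggregate response.

```python
def update_answer(client, uuid, new_answer, class_name="Question"):
    """Change only the answer property of an existing object."""
    client.data_object.update(
        data_object={"answer": new_answer},
        class_name=class_name,
        uuid=uuid,
    )


def delete_question(client, uuid, class_name="Question"):
    """Delete one object by its UUID."""
    client.data_object.delete(uuid=uuid, class_name=class_name)


def parse_count(response, class_name="Question"):
    """Extract the object count from an aggregate response (assumed shape)."""
    return response["data"]["Aggregate"][class_name][0]["meta"]["count"]


def count_questions(client, class_name="Question"):
    """Return how many objects the collection currently holds."""
    response = client.query.aggregate(class_name).with_meta_count().do()
    return parse_count(response, class_name)
```

Calling `count_questions` before and after `delete_question` is one way to confirm that the collection shrinks by one object, as in the lesson.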