database. Let's see how you'll now integrate one into your retrieval flow. So in this lesson, we'll continue walking through the steps of a flow to set up Retrieval Augmented Generation. As a refresher, here are the steps we previously outlined. First, we'll load the documents from a source, like Andrew Ng's CS229 PDF transcript. Then we'll split those documents into chunks small enough to fit into an LLM's context window and avoid distraction, using our text splitters. And now we're on to embedding those chunks and storing them in a vector store to allow for later retrieval based on an input query.

To do that, we'll use what's called a text embedding model. We use OpenAI's hosted model in this demo, but again, you can swap it out for any embedding provider of your choice. A vector store is a specialized type of database with natural language search capabilities. When the user comes in with a query, we'll search the vector store for embeddings, which we'll get to in just a moment, that are similar to the query they're asking, and then return relevant chunks for the LLM's final generation. And we'll show how to embed our previously split document chunks so that we can take advantage of these capabilities.

The first step of adding documents to the vector store is called ingestion. Again, it uses an embeddings model, which is a specialized type of machine learning model, to convert document contents into a representation called a vector, which our vector store can then search over. We'll be using OpenAI's hosted embeddings, so we'll need to import our environment variables to get our key. And for the demo, we'll use an in-memory vector store, which you wouldn't want to use in production, but which you can easily swap out for a provider of your choice. So let's import our embeddings model, instantiate it, and then try embedding a small query to see what it looks like.

Cool. The result is a vector in the form of an array of numbers. You can think of these generated numbers as capturing various abstract features of the embedded text that we can later search over to find closely related chunks. To show what this search process looks like concretely, we can use a JavaScript library to compare the similarity between some different embeddings. So let's take one vector that we'll create by embedding the query "What are vectors useful for in machine learning?", and then an unrelated query that we'll search against to show the score. Now let's calculate the score between them, and you can see we get a score of 0.69621. The metric we're using here is called cosine similarity, and it's one of many ways you can compare the similarity between two vectors.

Now let's try this with a more closely related vector so we can see the difference in score. In this case, we'll embed a more similar piece of text, "A vector is a representation of information," and see how it scores against our initial vector for "What are vectors useful for in machine learning?" We'll use the same cosine similarity function, this time with the similar vector, and we can see the result is significantly higher than our dissimilar comparison, since both texts contain similar information.
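Here's a rough sketch of that embedding-and-similarity comparison in code. Treat it as illustrative rather than the exact course notebook: it assumes the ml-distance package for the cosine similarity function, an OPENAI_API_KEY loaded from a .env file via dotenv, and a made-up unrelated sentence; import paths can also differ between LangChain.js versions.

```ts
// Sketch only: embed a query plus two other texts and compare cosine similarity.
import "dotenv/config"; // assumes OPENAI_API_KEY is set in a .env file
import { OpenAIEmbeddings } from "@langchain/openai";
import { similarity } from "ml-distance"; // assumed helper library for cosine similarity

const embeddings = new OpenAIEmbeddings();

// The original query from the lesson.
const queryVector = await embeddings.embedQuery(
  "What are vectors useful for in machine learning?"
);

// A hypothetical unrelated sentence, just to produce a low score.
const unrelatedVector = await embeddings.embedQuery(
  "A group of parrots is called a pandemonium."
);

// The more closely related text from the lesson.
const similarVector = await embeddings.embedQuery(
  "A vector is a representation of information."
);

// Higher cosine similarity means the texts are more closely related.
console.log(similarity.cosine(queryVector, unrelatedVector)); // lower score
console.log(similarity.cosine(queryVector, similarVector)); // noticeably higher score
```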
Great. So now we'll prepare our documents using the techniques covered in the previous lesson, and let's set the chunk size small for demo purposes. You can see here we're using the same PDF loader that we used previously, with the same peer dependency, and then the text splitter we imported in the previous lesson, with a chunk size of 128. And let's split some documents.

Great. Now let's initialize our vector store. Again, we're going to use an in-memory vector store for the demo, but in production you can use whichever implementation you'd like. We'll initialize it with our embeddings model already built in. You'll notice we're passing the embeddings model in on initialization, and that's because the LangChain vector store implementation will use this embeddings model to generate those arrays of numbers, the vectors you saw earlier, for each added document's contents. Now let's add our split documents to the vector store. Oh, I didn't run this. Let's run that. And now let's add the documents, and this time it should work. There we go. We've got a populated, searchable vector store.

Because LangChain vector stores expose an interface for searching directly with a natural language query, we can immediately try it and see what results we get. So let's use the similarity search method with the query "What is deep learning?", return four documents, and then make the page content of each document a little easier to read using a simple formatting function. What we would expect to see here is what we get: four small chunks with content related to deep learning, machine learning, and learning algorithms. So pretty reasonable.

Now let's talk about retrievers. We've just shown how to return documents from a vector store using similarity search directly, but that's actually just one of many ways to fetch data for an LLM. LangChain encapsulates this distinction with a broader retriever abstraction that returns documents related to a given natural language query. Conveniently, we can instantiate a retriever from an existing vector store with a simple asRetriever method call, just like that. One nice trait of retrievers is that, unlike vector stores, they implement the invoke method and are themselves Expression Language runnables, which we learned about in the first lesson, and can therefore be chained with other modules. To show this, we'll run the retriever we just instantiated with the same query as before, using the invoke method. You'll see the results we get are the same documents around deep learning, but this time we'll be able to use the retriever within a chain with other modules like LLMs, output parsers, and prompts, which we'll use to great effect in the next lesson on constructing a retrieval chain.
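To recap the ingestion and search steps in code, here's a rough sketch under a few assumptions: the PDF filename is hypothetical, the import paths match one particular LangChain.js version and may differ in yours, and pdf-parse is installed as the loader's peer dependency.

```ts
// Sketch of the ingestion flow: load, split, embed, store, then search.
import { PDFLoader } from "langchain/document_loaders/fs/pdf"; // needs the pdf-parse peer dependency
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

// Hypothetical path to the lecture transcript PDF.
const loader = new PDFLoader("./data/lecture_transcript.pdf");
const rawDocs = await loader.load();

// Split into small chunks for demo purposes.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 128,
  chunkOverlap: 0,
});
const splitDocs = await splitter.splitDocuments(rawDocs);

// Initialize the in-memory vector store with the embeddings model built in,
// then embed and add the split documents.
const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());
await vectorStore.addDocuments(splitDocs);

// Search directly with a natural language query, returning the top four chunks.
const results = await vectorStore.similaritySearch("What is deep learning?", 4);
console.log(results.map((doc) => doc.pageContent));
```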
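And here's a small sketch of the retriever step. The two sample documents stand in for the split PDF chunks purely for illustration, and the invoke method is only available in LangChain.js versions where retrievers are Expression Language runnables.

```ts
// Sketch: wrap a vector store in the retriever interface and query it with invoke().
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";

const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());

// Hypothetical stand-ins for the split PDF chunks.
await vectorStore.addDocuments([
  new Document({ pageContent: "Deep learning uses multi-layer neural networks." }),
  new Document({ pageContent: "CS229 covers supervised and unsupervised learning." }),
]);

// asRetriever() returns a runnable retriever, so it can be chained with
// prompts, LLMs, and output parsers in a larger chain.
const retriever = vectorStore.asRetriever();
const relatedDocs = await retriever.invoke("What is deep learning?");
console.log(relatedDocs.map((doc) => doc.pageContent));
```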