In this lesson, you'll learn where the vector embeddings at the heart of vector databases come from. You'll start off by looking at how neural networks can be used to represent and embed data as numbers, and you'll build an autoencoder architecture to embed images into vectors. You'll then go over what it means for data objects to be similar or dissimilar, and how to quantify this using the vector representations of the data. Let's get into it.

Here's the autoencoder. To illustrate how it works, we'll use the MNIST handwritten digits dataset. If you pass in an image of a digit like this one, which is 28 by 28 pixels, or 784 dimensions in total, the encoder will compress it and the decoder will decompress it, and we end up with another image. You can already see that the two images don't exactly match. This is why we have to run the model through multiple training iterations: each time we run it, the internal weights get adjusted, and each time we get a better and better match, until eventually the model has been trained and we can be quite happy with how the outputs match the inputs. The important thing to notice here is that the output is generated using only the vector in the middle, so that vector contains the meaning of the image. We call that vector the embedding.

We'll go and code it in a minute, but this is how the model looks on the inside. Here we can see a group of dense layers, and as we pass the image through them, it gets compressed through 256 and then 128 dimensions until we reach two dimensions. Likewise, the decoder takes the vector embedding from two dimensions back through 128 and 256 dimensions until we reach the final output. The reason we chose two dimensions for the vector embedding in the middle is purely to make it easier to visualize during the lessons; in practice, vector embeddings very often have far more dimensions than two, often reaching a thousand or more.

And here's a nice example of how we can take any kind of data: we could take an image and convert it into a machine-understandable vector embedding, or we could take a whole piece of text and generate a vector embedding from it as well. I cannot stress this enough: the vector embedding captures the meaning of the underlying data. You can think of vector embeddings as a machine-understandable format of the data. Cool. So, let's go and see how it all works in code.

All right, as the first step we need to load in some libraries, and we'll be using TensorFlow to make it all work. Very nice! Just like I mentioned earlier, we'll be using MNIST to load our dataset, so let's go and do that. What this gives us is a training set and a test set, which we can use later on. Next, we need to normalize this data, and in reality what we are doing is taking each 28 by 28 image and turning it into a flattened structure. If we print out the shapes before and after, we'll see that we had 60,000 training objects, which before were 28 by 28 and now are 784 dimensions, and the same goes for the test data. And now, let's set up some parameters for our model. The batch size here is going to be 100 objects, and training will run across 50 epochs. The idea is that we'll start with 256 dimensions for the hidden state, and the objective is to generate vector embeddings of two dimensions.
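The notebook's exact code isn't shown in this transcript, but as a rough sketch, the data loading, normalization, and parameter setup might look something like this (variable names are illustrative assumptions):

```python
import tensorflow as tf

# Load MNIST: 60,000 training and 10,000 test images, each 28 x 28 pixels.
# The labels are kept so we can color the embedding plot later on.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to [0, 1] and flatten each image to 784 dimensions.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = x_train.reshape((len(x_train), 28 * 28))
x_test = x_test.reshape((len(x_test), 28 * 28))

print(x_train.shape, x_test.shape)  # (60000, 784) (10000, 784)

# Parameters described in the lesson.
batch_size = 100   # objects per training batch
epochs = 50        # passes over the training set
hidden_dim = 256   # size of the first hidden (dense) layer
latent_dim = 2     # dimensionality of the vector embedding
```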
Now, let's have a look at an example of an input image, and you can see that this one looks very much like a zero, which is great. Next, we need to construct a sampling function, which will allow us to sample points from the latent space during the training phase. Then we need to construct an encoder, and just like I mentioned earlier on, it will have two dense layers: the first one with 256 dimensions and the next one with 128 dimensions, each followed by a normalization step. Now, let's build a matching decoder in a very similar way, this time starting from two dimensions and moving into 128 and then 256 dimensions over here, and eventually we can create a decoder function. And this is our loss function to train the autoencoder, which in this case is a variational autoencoder; the idea is that the model is optimized during training so that the outputs match the inputs as closely as possible. Now that we have all the pieces, we can begin training, which will run across 50 epochs, each time training in batches of 100 objects. This will take a few minutes, so we'll speed it up in post.

Now that the training is done, we can go and visualize our data. Let's start by building a flat encoder, and then we can add a piece of code to plot our vector embeddings onto a graph. You can see here that similar vectors are clustered together within the vector embedding space: the zeros are close together here, the nines are close together over here, and the whole space is displayed in two dimensions. Those two dimensions are exactly the two dimensions inside the vector embedding.

So now we can get into the phase of comparing vector embeddings. Let's open a new section and grab three different images: one zero, which is this one, another zero, this one here, and one image that represents the digit one. With these three objects we can call our function to generate vector embeddings, so zero A, zero B, and one will contain the vector embedding values that we need. If we print them, you can see the vectors as follows, and you can already see that the two zeros are kind of similar to each other, while the vector that represents the digit one is quite different.

We can also do something very similar with text embeddings. If we grab a sentence transformer like this, and then grab a few sentences like this, we can generate vector embeddings for each of those sentences over here, which will have values like this. To illustrate the shapes of each of those vectors, let's quickly print the embedding shapes, and we can see that we have three vectors of 384 dimensions each. Now let's try to represent those vectors in a visual way: I'm going to plot each of those vectors as a very nice barcode, or a series of barcodes, and you can already see that the first two vectors are kind of similar-ish to each other, while the third vector is quite different.
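To make the text-embedding step concrete, here's a minimal sketch using the sentence-transformers library. The transcript doesn't name the exact model or sentences, so both are assumptions; all-MiniLM-L6-v2 is a common choice that produces 384-dimensional vectors, matching the shapes mentioned above:

```python
from sentence_transformers import SentenceTransformer

# Assumed model: all-MiniLM-L6-v2 outputs 384-dimensional embeddings,
# which matches the (3, 384) shape shown in the lesson.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Placeholder sentences; the notebook's actual sentences are not shown here.
sentences = [
    "The cat sat on the mat.",
    "A cat was resting on a rug.",
    "Vector databases store embeddings for fast similarity search.",
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)
```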
And now we are going to talk about distance metrics and how you can actually calculate the distance between different images or sentences, or rather between the vector embeddings that represent them. We'll look at four different methods: Euclidean distance, Manhattan distance, dot product, and cosine distance.

Let's start with the Euclidean distance, which calculates the shortest straight-line distance between two points. Here's the calculation for the Euclidean distance between 0A and 0B, which comes to around 0.6. There are also built-in methods inside NumPy, so we can calculate the same distance that way. And if we calculate all of the distances together, we can see that the two zeros have the smallest distance between them, while both 0A and 0B are quite far from the digit 1. That's how we can actually show mathematically that the two zeros are very similar.

Now, let's have a look at the Manhattan distance, which is the distance between two points if you could only travel along one axis at a time. If we calculate the distance between 0A and 0B again, we get this value, and once more NumPy comes to the rescue with a method that calculates it in a very short way. If we compare all the distances according to the Manhattan distance, we again see that 0A and 0B are very close to each other but quite far from the digit 1.

Next, let's have a look at the dot product, which measures the magnitude of the projection of one vector onto the other. Here we'll go straight to NumPy for the calculation: the dot product between 0A and 0B is 3.6, and if we do the same thing across all three vectors, the values for the zeros compared to the one come out negative. Unlike the previous examples, where a lower distance meant a better match, with the dot product higher values mean a better match, while negative values usually mean the vectors are quite far apart.

And finally, let's have a look at the cosine distance. The idea behind cosine distance is that similar vectors have a very small angle between them, and we can calculate it as follows. In the case of the two zeros, the cosine distance is very close to zero, which indicates a very strong match. If we divide 0A by 0B element by element, we can see that the resulting ratios are very close to each other, which means the two vectors point in very similar directions. So let's encapsulate our cosine distance calculation into a function like this, and if we grab all the distances again as before, we can see that there is a very small angle between 0A and 0B, but quite a high cosine distance from the zeros to the one, which again proves our point.

So now, let's go back to the example with our sentence embeddings. The interesting thing to know is that, of all these distance metrics, the dot product and cosine distance are quite commonly used in the field of natural language processing. For example, we could compare all those vectors to each other using the dot product, and we can see that the dot product between the first two sentences is very high. What this shows is that the first two sentences are very similar, while the other pairs are not that similar. If we do the same calculation with the cosine distance, we can again see that the angle between the first two sentences is fairly small, while the other pairs are further apart. But at the same time, the first two sentences are not a perfect match either.

And this concludes this lesson. We'll take everything you've learned here into the next lesson, where you'll learn how to search across multiple vectors.
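As a compact recap of the four metrics covered in this lesson, here's a minimal NumPy sketch. The variable names and the small two-dimensional example vectors are placeholders just so the snippet runs on its own; in the notebook these would be the embedding vectors for 0A, 0B, and 1 produced by the encoder:

```python
import numpy as np

def euclidean_distance(a, b):
    # Straight-line distance between the two points.
    return np.linalg.norm(a - b)

def manhattan_distance(a, b):
    # Distance if you could only travel along one axis at a time.
    return np.sum(np.abs(a - b))

def dot_product(a, b):
    # Projection-based score: higher positive values mean a better match.
    return np.dot(a, b)

def cosine_distance(a, b):
    # 1 - cosine similarity: values near 0 mean a very small angle.
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Placeholder 2-D vectors standing in for the embeddings of 0A, 0B, and 1.
zero_a = np.array([1.0, 2.0])
zero_b = np.array([1.2, 1.8])
one = np.array([-2.0, 0.5])

for name, fn in [("euclidean", euclidean_distance),
                 ("manhattan", manhattan_distance),
                 ("dot product", dot_product),
                 ("cosine", cosine_distance)]:
    print(f"{name}: 0A vs 0B = {fn(zero_a, zero_b):.3f}, 0A vs 1 = {fn(zero_a, one):.3f}")
```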