In this video, you'll learn how to control the model and what it generates. For many, this is the most exciting piece, because you get to tell the model what you want and it gets to imagine it for you.

When it comes to controlling these models, we actually want to use embeddings. Embeddings, which we looked at a little bit in previous videos with the time embedding and the context embedding, are vectors: numbers that are able to capture meaning. And here it's capturing the meaning of this sentence, or this joke perhaps, about diffusion models: "Brownians often bump into each other". It encodes that into this embedding, which is this set of numbers in a vector. And what's special about embeddings is that because they capture semantic meaning, text with similar content will have similar vectors. One of the kind of magical things about embeddings is that you can almost do vector arithmetic with them. So "Paris" minus "France" plus "England" equals the "London" embedding, for example (there's a quick sketch of this below).

Okay, so how do these embeddings actually become context to the model during training? Well, here you have an avocado image, which you want the neural network to learn, and you also have a caption for it, "a ripe avocado". You can pass that caption through, get an embedding, and input that into the neural network to then predict the noise that was added to this avocado image, compute the loss, and do the same thing as before (this training step is also sketched below). And you can do this across a lot of different images with captions. So here is "a comfy armchair": you can pass its caption through to get an embedding, pass it into the model, and have that be part of training.

Now, the magic of this section is that while you were able to scrape these images of avocados and armchairs off the internet with those captions, at sample time you're able to generate things that the model has never seen before. And that could be an "avocado armchair". The magic of this is that you can embed the words "avocado armchair" into an embedding that has, you know, a bit of avocado in there and a bit of armchair in there, put that through the neural network, have it predict noise, subtract that noise out, and get, lo and behold, an "avocado armchair".

So more broadly, context is a vector that can control generation. Context can be, just as we have seen now, the text embedding of that "avocado armchair", which is very long. But context doesn't have to be that long. Context can also be a set of categories that's just five in length, you know, five different dimensions: hero or non-hero, like these objects of a fireball and a mushroom; food items, you know, apple, orange, watermelon; spells and weapons, like this bow and arrow or this candle; and finally, whether these sprites are side-facing or not. So now let's take a look at adding context to your model in the next lab.

On to our lab: we can just run the setup here, setting up the same things as before. And then down here in context, I want to instantiate our neural network again. Again, we're not training, but I'm going to call out a few places where we do add the context. So when we load the data here, we now iterate through both the data point and the context vector associated with it. And the context that we do have are these one-hot encoded vectors of hero, non-hero, food, spells and weapons, and side-facing.
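On that vector-arithmetic point, here's a minimal sketch using pretrained GloVe word vectors loaded through gensim. This isn't part of the course notebook, just an illustration, and the exact nearest neighbors will vary a bit depending on which embedding model you load.

```python
import gensim.downloader

# Load small pretrained GloVe word vectors (a ~66 MB download on first run).
vectors = gensim.downloader.load("glove-wiki-gigaword-50")

# "paris" - "france" + "england" should land near "london".
print(vectors.most_similar(positive=["paris", "england"],
                           negative=["france"], topn=3))
```

The top match is typically "london", which is exactly the Paris-to-London intuition above.

And here's a sketch of that conditioned training step, reusing names from the earlier labs (`nn_model`, the cumulative noise schedule `ab_t`, `timesteps`). Treat it as a minimal sketch of the idea under those assumptions, not the notebook's exact code.

```python
import torch
import torch.nn.functional as F

def train_step(nn_model, x, ctx, ab_t, timesteps, optimizer):
    # `nn_model`, `ab_t` (cumulative alpha-bar schedule, length timesteps + 1)
    # and `timesteps` are assumed to match the earlier labs; this is a sketch.
    noise = torch.randn_like(x)                        # target the network must predict
    t = torch.randint(1, timesteps + 1, (x.size(0),))  # a random timestep per image

    # Noise the clean image x up to level t, exactly as in unconditional training.
    x_pert = (ab_t.sqrt()[t, None, None, None] * x
              + (1 - ab_t[t, None, None, None]).sqrt() * noise)

    # The only new ingredient: the context vector `ctx` (a caption embedding,
    # or a one-hot category vector) is passed to the model alongside the image.
    pred_noise = nn_model(x_pert, t / timesteps, c=ctx)

    loss = F.mse_loss(pred_noise, noise)  # how far off was the noise prediction?
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```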
We create a context mask, and what's important here is that, with some randomness, we actually completely mask out the context, so that the model is also able to learn generally what a sprite is, without any conditioning. This is pretty common for diffusion models (there's a sketch of this masking at the end of this section). And then we add context when we call the neural network right here.

So let's load a checkpoint where we did train the model with context: just loading that model here and running that. Then we run our sampling code again, so we have that for this notebook. And here you can see that when you run this, it's actually choosing completely random contexts, right here, completely randomly, and you can see the different types of outputs: objects and people.

And now, controlling it a bit, you can actually define the context here. So here I just defined a couple of heroes, the first two. So these two are heroes. The next two are side-facing, so each vector is one-hot with this last value here set for side-facing. The next two are non-heroes, so kind of beasts; they look very blobby here. And the last two are food items: this one kind of looks like an apple, and this one kind of looks like a pear.

And now, getting into the "avocado armchair" vibe, we can actually mix and match these a bit. So while we trained it on one-hot encoded vectors, we can also provide it with float values between 0 and 1 to get a mix of things (also sketched at the end of this section). So here, the second one is a hero and partially food, and now it looks like a potato man. The third one is also a little bit quirky: it's part food and part spell, so it kind of looks like this potion. But yeah, you can play with these yourself. You can try to put in something contradictory, like it's supposed to be a hero but also side-facing, both front-facing and side-facing at once. So this is good fun. Feel free to stop, pause, and play with this a few times and start changing these values up.

So now that you can create all these samples and control them, in the next video you'll explore speeding up the sampling process so that you don't have to wait so long to see these amazing samples.
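Before you move on, here's what that random context masking can look like: a minimal sketch assuming a batch of 5-dimensional one-hot context vectors, not necessarily the notebook's exact line.

```python
import torch
import torch.nn.functional as F

# Stand-in batch: 32 random one-hot context vectors over the 5 categories.
c = F.one_hot(torch.randint(0, 5, (32,)), num_classes=5).float()

# Keep each sample's context with probability 0.9; otherwise zero it out
# entirely, so the model also learns what a sprite is without any conditioning.
keep_mask = torch.bernoulli(torch.full((c.shape[0],), 0.9))
c = c * keep_mask.unsqueeze(-1)
```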
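And here's a sketch of steering sampling with one-hot and mixed context vectors, along the lines of what the notebook does. `sample_ddpm_context` stands in for the notebook's context-aware sampler, so the exact call and return values are assumptions.

```python
import torch

# Columns: hero, non-hero, food, spells & weapons, side-facing.
ctx = torch.tensor([
    [1, 0, 0, 0, 0],   # hero
    [1, 0, 0, 0, 0],   # hero
    [0, 0, 0, 0, 1],   # side-facing
    [0, 0, 0, 0, 1],   # side-facing
    [0, 1, 0, 0, 0],   # non-hero
    [0, 1, 0, 0, 0],   # non-hero
    [0, 0, 1, 0, 0],   # food
    [0, 0, 1, 0, 0],   # food
]).float()

# Mixing: floats between 0 and 1 blend categories, even though
# training only ever saw one-hot vectors.
mixed = torch.tensor([
    [1.0, 0.0, 0.6, 0.0, 0.0],   # hero and partially food ("potato man")
    [0.0, 0.0, 0.6, 0.4, 0.0],   # part food, part spell (potion-like)
]).float()

# Sample conditioned on the contexts; `sample_ddpm_context` is the
# notebook's context-aware sampler, assumed to be defined there.
samples, _ = sample_ddpm_context(ctx.shape[0], ctx)
```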