In this first lesson, I'll show you how to instrument Weights & Biases in your machine learning training code. As we train machine learning models, many things can go wrong. "wandb" will help us monitor, debug, and evaluate our pipelines. Let's dive in! With just a few lines of code, you can monitor your metrics, CPU, and GPU usage in real time. You can version control your code, reproduce model checkpoints, and visualize your predictions in a centralized, interactive dashboard. Our users evaluate models, discuss bugs, and demonstrate progress with configurable reports. By the end of this course, you'll be able to do this too.

Let's start by learning how to incorporate Weights & Biases into your training process. First, you'll have to install the Python library using the command pip install "wandb". After installation, we need just a few lines of code. The first step is to import "wandb". Ideally, you already have your hyperparameters structured in an object, such as a Python dictionary; otherwise, put them together in a config. You then need to initiate a "wandb" run. A run in Weights & Biases generally corresponds to a single machine learning experiment. You begin a run by calling "wandb.init", passing the project name and your config object. You carry on with your model training code, and when you reach a point where there are metrics you want to track and visualize, you log them using "wandb.log". If you are using a notebook, it is recommended to call "wandb.finish" at the end.

Now, let's check this out in a notebook. In this training script, we will train a sprite classification model. A sprite is a small 16-by-16-pixel image, and our goal is to categorize each sprite into one of five classes: hero, non-hero, food, spell, or side-facing (which is not pictured). Let's start with all the imports, and make sure that "wandb" is in there. Here we define a simple classifier model with two linear layers. We'll store the parameters we want to track in a simple namespace, which is similar to a Python dictionary. Here we have our training function; let's modify it to add "wandb" logging. First, we'll call "wandb.init" and pass our project name and our config object. Once we have the metrics, we will log them to Weights & Biases, which is done with "wandb.log". We will also record our validation metrics at the end of each epoch. Since we are in a notebook, we explicitly end our Weights & Biases run by calling "wandb.finish". We don't need to change anything in the validation function, so I'll just run this cell.

In this course, we will use the "wandb" cloud platform, and that means we need to log in. There's also an option to install "wandb" locally, but that is a bit more complex, so we will not cover it in this course. "wandb" is free for personal and academic use, and we encourage you to sign up; this way, you can keep the results of your experiment tracking. But you can also run this code in anonymous mode. So now I've run the login, and I'm using my personal account and pasting in my API key, which you can pull up at "wandb.ai/authorize". I'm pressing Enter, the key is accepted, and I'm logged in for this notebook. Finally, let's train the model. Let's run our training code and see the progress. The data is being logged to the "wandb" server, which will save your results. You can see the links that are printed out here when you execute the run, and there are a few different links.
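Before we look at those links, here is a rough sketch that pulls together the instrumentation steps we just walked through. This is not the exact course notebook: the project name "sprite-classification", the layer sizes, the hyperparameter values, and the train_dl / valid_dl dataloaders are placeholders, but the "wandb" calls follow the pattern described above.

```python
# A minimal sketch of the instrumentation described above, not the exact
# course notebook: the layer sizes, hyperparameter values, and the
# train_dl / valid_dl dataloaders are placeholders.
from types import SimpleNamespace

import torch
import torch.nn as nn
import wandb

# Hyperparameters collected in one config object, as recommended.
config = SimpleNamespace(
    epochs=2,
    lr=1e-3,
    batch_size=128,
    dropout=0.5,
    input_size=3 * 16 * 16,   # 16x16 RGB sprites, flattened
    hidden_size=256,
    num_classes=5,            # hero, non-hero, food, spell, side-facing
)

def make_model(cfg):
    # A simple classifier with two linear layers.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(cfg.input_size, cfg.hidden_size),
        nn.ReLU(),
        nn.Dropout(cfg.dropout),
        nn.Linear(cfg.hidden_size, cfg.num_classes),
    )

def train(cfg, train_dl, valid_dl):
    # 1. Start a run; one run per experiment. The config is stored with it.
    wandb.init(project="sprite-classification", config=vars(cfg))

    model = make_model(cfg)
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg.lr)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(cfg.epochs):
        # 2. Log training metrics as they are computed.
        model.train()
        for images, labels in train_dl:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            wandb.log({"train/loss": loss.item(), "epoch": epoch})

        # 3. Log validation metrics at the end of each epoch.
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for images, labels in valid_dl:
                preds = model(images).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
        wandb.log({"val/accuracy": correct / total, "epoch": epoch})

    # 4. In a notebook, end the run explicitly.
    wandb.finish()
```

Calling `wandb.login()` once before `train(...)` covers the API-key step shown in the notebook, and `wandb.login(anonymous="allow")` lets you try this without an account.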
Looking at those links: the first one, syncing this run, is the individual experiment you've just tracked. You can also pull up the project page, which compares across the different runs you're tracking. Let's follow that link and check our workspace. Here in the project page workspace, this is the data from the training run we just ran in the notebook. These charts are small, so I can expand them to make them bigger. This looks good: the training loss is going down over time. I'm checking my validation metrics as well, and validation accuracy looks like it's around 51%. That's not so great, so if I want to keep improving this model, it looks like I can train for longer or increase the learning rate. Let's go back to the notebook and make those updates.

Now that the results have been logged to "wandb", I can go back to that same project page and see the latest result compared to the previous baseline. Great! Here's the new run that just appeared in our workspace, and you can see that this red run, Genial Elevator, is actually doing better than the previous run. That's a great sign; this is what we want to see. Now I'll go back into the notebook and try a few more things, and I encourage you to try different configs as well; a small sketch of one way to script this exercise appears a little further down. How can you tweak the hyperparameters to get better performance from this model? Take a minute to try that, and we'll come back when I have a few more experiments.

Back in the project page, I'm comparing the results of the different hyperparameters we set in those runs. Each of these has a different training curve, and when I hover over it, it's highlighted in the sidebar. This run, "dry-cherry-4", the purple run, seems to be doing the best: it has the lowest training loss, and it looks like it's also getting the best validation accuracy, at around 99%.

Another way to compare experiments is by using the runs table. This shows those same runs in a tabular format, so I can look at the metrics and hyperparameters side by side. Here, I can see when I changed the dropout, epochs, and learning rate across my different runs. A specific metric of interest might be accuracy. To find it, I can go to the Columns section and search for accuracy; when you're logging a whole lot of metrics, this can be especially helpful. I can click the Pin button, and when I close this window, that metric will appear on the side. Now, when I collapse this view, I can see validation accuracy in the sidebar alongside my other metrics.

In this example, I have one run, Generous Salad, that has pretty bad validation accuracy. I can use the filter button to hide any runs below a specific threshold. Here I'm adding a filter, typing accuracy, selecting greater than or equal to, and entering 0.9. Now that filter has been applied, and you can see I only have three runs that match the criteria, which helps me narrow in on just my best runs. Since higher validation accuracy is better, I'm going to sort in descending order to place the most successful runs at the top: I select sort, type accuracy, and pick that column. Now it's in descending order, so my best run, dry-cherry, is at the top. This is especially useful when you have hundreds or thousands of runs.
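For the "try a few more configs" exercise mentioned above, here is one hedged way to script it, reusing the `train()` function and `config` object from the earlier sketch. The specific override values are illustrative, not the ones used in the lesson, and train_dl / valid_dl are the same placeholder dataloaders as before.

```python
from copy import deepcopy

# Hypothetical settings to compare; not the exact values from the lesson.
experiments = [
    {"lr": 1e-3, "epochs": 2},                   # baseline
    {"lr": 1e-2, "epochs": 8},                   # higher learning rate, longer training
    {"lr": 1e-2, "epochs": 8, "dropout": 0.2},   # less regularization
]

for overrides in experiments:
    cfg = deepcopy(config)                 # start from the base config above
    for key, value in overrides.items():
        setattr(cfg, key, value)
    # Each call opens and closes its own run, so every experiment shows up
    # as a separate line in the project workspace.
    train(cfg, train_dl, valid_dl)
```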
So "wandb" automatically picks up the Git repo, so I can easily get back to the code that was used to train this model. It also gets the hash of the latest Git commit. So, I know exactly the state of that repo when this run was created. But, realistically, often I'm making little tweaks in that notebook and don't remember to always commit my changes. So, what do you do if you have uncommitted changes? Fortunately, "wandb" captures the diff patch. So, opening the Files tab, I can see the diff patch has been saved with any uncommitted changes. So, now if I go back to that overview, you,. I know that I can easily get back to the exact state of the code by pulling this git commit and applying the patch. So that makes this run more reproducible. If I sent this to someone, I could also easily communicate with them about the settings I chose for this model. Here in the config, I'm capturing batch size, dropout epics, learning rate, all of the settings that we had in that notebook. And this is an easy summarized format. This information can be helpful when things go well, but it's also very valuable when things go wrong. So we can use this context for debugging, understanding what code was used in an experiment, what the environment was, the dataset, etc. Now in the next lesson, we will look at training a generative AI model, and how these tools can be helpful there.