Many people work with a lot of text data, maybe reading tens of thousands of words per day in emails, articles, their own notes, and so on. Depending on the tasks you're working on, you may want to have AI help you read and process some of these large volumes of text. For example, let's say you're planning that dream vacation and you want to plan what restaurants to go to and what to eat in different cities. As you've maybe been thinking about this vacation for a long time, you might have saved lots of long blog articles about different destinations, and some of these blogs or other articles you've saved might describe restaurants and dishes in different cities. For the example we'll go through in this video, we'll say you want to incorporate some of this information into your food-related itinerary for this upcoming trip. But in this large collection of a thousand articles and blogs, there might be a lot of other documents as well that just don't relate to restaurants or food or where to eat. If you're short on time and don't want to open up and read every single file yourself, one at a time, to see if it's food related, you can use AI to look at all this text and help you process it.
Let's go see how you can accomplish this in Python. We'll use journal entries from food critics that contain recommended restaurants and dishes, and see how you can use a large language model to process this data. Text data like emails, journal entries, and social media posts can vary significantly in style and formatting. Some people will write using bullet points, while others will write a long paragraph. In the code we will go through in this lesson, you'll use Python and large language models to extract relevant information from these text files, such as the names of restaurants and the names of special dishes you might want to try out. "Pickled fish tacos". That sounds delicious.
But prior to extracting restaurant names and names of dishes, you might want to double check if this text document is even relevant. For example, if you have an article about the history of a city, that may not have any of the current restaurant names that we're trying to extract. Let's see how we can use Python code to do all this.
So, please remember to run the code cells in order. I'm going to import the usual functions. And then, similar to what you saw in the last lesson, let's load the Cape Town text file, and this prints the journal on Cape Town. Cape Town is a beautiful city with iconic Table Mountain and also lots of charming beaches. The Cape Town text file is a journal written by a food critic, and this focuses on The Test Kitchen, La Colombe, and so on. Let's look at the Tokyo text file. I should travel to Tokyo a lot, but here's a list of inspiring restaurants in Tokyo. And this is formatted very differently than the Cape Town document, which had paragraphs of text.
Now, before processing text files like these in order to extract restaurant and dish names, we'd like to use a large language model to check a document to see if it is relevant or not relevant to restaurants and their specialties. So here's a prompt. I'm going to then print the LLM response to the prompt. And indeed the Tokyo article is relevant. And in fact, so is the Cape Town one.
I happen to know that I have five files in this directory, which are Cape Town, Madrid, Rio de Janeiro, Sydney, and Tokyo. So, to write Python code to process all five files, do you remember what coding structure we can use? You can use a for loop to iterate through multiple files. And so I'm going to set files to this list of five files. The square brackets here are how we create a list in Python: there's an opening square bracket, my five file names, and then the closing square bracket. And then we'll say for file in files, open and read the file, then close the file as usual.
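The loop just described could be sketched roughly like this. The file names below are assumed for illustration (the lesson's actual names may differ), and the snippet first writes tiny placeholder files into a temporary folder so that it can run on its own.

```python
import os
import tempfile

# Assumed file names for illustration; the lesson's actual names may differ.
files = ["cape_town.txt", "madrid.txt", "rio_de_janeiro.txt",
         "sydney.txt", "tokyo.txt"]

# Write tiny placeholder files into a temporary folder so this sketch
# runs on its own. In the lesson, the real journal files already exist.
folder = tempfile.mkdtemp()
for file in files:
    f = open(os.path.join(folder, file), "w")
    f.write("Placeholder journal text for " + file)
    f.close()

# The for loop: open each file, read it, then close it as usual.
journals = []
for file in files:
    f = open(os.path.join(folder, file), "r")
    journal = f.read()
    f.close()
    journals.append(journal)
    print(file, "has", len(journal), "characters")
```

The square brackets create the list, and the for loop runs the same open, read, and close steps once for each file name in it.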
And then we're going to create a prompt: respond with "Relevant" or "Not relevant", depending on whether or not the journal describes restaurants and their specialties. And then my f-string puts in the journal variable that we just read above. And then let's use the large language model to get the LLM response from the prompt. And I'm going to print this out: print the file name, then the right arrow, and then the LLM response, to print for each file whether or not it is relevant to restaurants and their specialties. Let's run that and see what this says. Hope this works. All right, so Cape Town is relevant, the Madrid text file is not relevant, and the others are relevant. So if you have a collection of 100 or 200 documents, you can use a for loop like this to get Python to automatically read all of them for you and just point out to you which are the ones relevant to food.
The Madrid text file, let's check that one out. So let's read and then print the Madrid text file, and you see that this is a really nice article on Madrid, but it isn't focused on food specifically, which is why the large language model has said the Madrid text file is not relevant.
And by the way, I would encourage you also to try different prompts. So for example, you can say, "Respond with yes or no: the journal describes restaurants and food dishes." There are multiple ways to do this. Something more advanced would be to have it print out only the file names of the ones that are relevant. That would be a pretty advanced exercise if you want to give it a shot. You can also ask the AI chatbot companion for help. And just for fun, if you're curious about the name of what we just did, you can ask: "I'm using AI to determine whether different texts are relevant or not relevant. Does this task have a specific name?" So what we just did here was build a text classification system. So I encourage you to play with this code and modify it to do different things.
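Putting the steps above together, a minimal sketch might look like the following. Two things here are assumptions and not the lesson's actual code: `get_llm_response` is replaced by a simple keyword-matching stand-in so the example runs without a model, and the journal snippets and file names are made up for illustration. With a real model, the answers come from the model itself.

```python
def get_llm_response(prompt):
    """Stand-in for the lesson's LLM helper. This is NOT a real model.

    It only looks for a few food words in the text after "Journal:",
    so the sketch can run offline. Swap in a real LLM call to get
    real answers.
    """
    journal = prompt.split("Journal:", 1)[1].lower()
    food_words = ["restaurant", "dish", "taco", "menu"]
    for word in food_words:
        if word in journal:
            return "Relevant"
    return "Not relevant"


# Made-up journal snippets standing in for the contents of the text files.
journals = {
    "cape_town.txt": "Dinner at a restaurant near the beach; "
                     "the signature dish was pickled fish tacos.",
    "madrid.txt": "A walk through the old town, its plazas, "
                  "and the history of the city.",
    "tokyo.txt": "- Small sushi restaurant\n- Ramen shop with a short menu",
}

relevant_files = []
for file, journal in journals.items():
    # The f-string puts the journal text into the prompt.
    prompt = f"""Respond with "Relevant" or "Not relevant":
the journal describes restaurants and their specialties.

Journal:
{journal}"""
    response = get_llm_response(prompt)
    print(file + " -> " + response)
    # The more advanced variation: keep only the relevant file names.
    if response == "Relevant":
        relevant_files.append(file)

print(relevant_files)
```

One thing to watch for when you use a real model: its reply may include extra punctuation or words, such as "Relevant." with a period, so the exact comparison with `"Relevant"` may need to be loosened.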
Do take the code and try out different prompts and see what different results you get. But I hope you see how a snippet of code like this can read, in this case, five or maybe even more documents for you and quickly help you identify the most relevant ones. The nice thing about coding is you can tell the computer to do any of a huge range of things. So, I hope you had fun with this. And in the next lesson, we'll see how to take one relevant article and extract key information, such as the names of the restaurants and dishes, from that file. I'll see you in the next lesson.