Hi, you now have a broad overview of the types of tasks that you can achieve within the Hugging Face ecosystem. In most cases, for hosting demos and practical applications, it would be nice to have your application running without leaving your computer on. In other words, you want to offload the compute requirements outside your local machine. In this lesson, you will leverage Hugging Face Spaces to deploy your demos and use them as an API. Let's get started.

Hi, so welcome to this lab session about deploying ML models in the Hugging Face ecosystem using Hugging Face Spaces. At the end of this lab, you will know how to host an ML model as an API on the Hugging Face Hub and call that API through a simple command. For this lesson, we will deploy the BLIP model that you covered in the multimodal lab session. Recall that the model has been fine-tuned on several multimodal tasks, making it possible to perform three different tasks: image captioning, visual question answering, and image-text retrieval. In our case, we're going to focus on image captioning. So let's get started.

First of all, you need to create an account on the Hugging Face website, hf.co, and make sure to sign in to the main website. Then you just have to navigate here, on the top right of the window, and go to "New Space" to create a new Space. Let's decide on a name for the Space; we're going to call it BLIP Image Captioning API. We're going to keep the default license, select Gradio as the Space SDK, select the basic hardware, and make it public so that everyone can use it.

All right. So once you have created that Space, you need to create two files: the requirements.txt file, which lists all the libraries you need to run the Space, and the main file, which you need to call app.py. Let's do that right now, first creating the requirements file and putting all our requirements inside that txt file: we'll add transformers, because we need transformers, as well as PyTorch and Gradio.

All right. So before creating the app.py file, let's quickly go back to our lab, create a new cell, and first try out the demo locally before pushing it to the Hub. As seen in the previous lab, we're going to leverage the pipeline object from transformers and build a demo with Gradio by leveraging gradio.Interface. As usual, we load the pipeline, but this time we're going to load the image-to-text pipeline, and we're going to load the blip-image-captioning-base model in order to perform image captioning.

All right. So now that the model has been loaded, as we've been doing so far for all our Gradio demos, we need to define a method, which we'll call launch here, that takes the input, calls the pipeline, and extracts the generated text from the output. Note that here we're using the globally defined pipeline that we loaded above, so that we won't have to instantiate a new pipeline each time we call launch. That way, we define the initialization of the pipeline once at the beginning of the script, so that it's loaded only once, and we use that globally defined pipeline inside the launch function. Then let's define our gradio.Interface, which will look like this, with the input being gradio.Image and the output being text. And we call interface.launch with share=True.
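To make this concrete, here is a minimal sketch of what that app.py could look like. The checkpoint name (Salesforce/blip-image-captioning-base) and the launch function follow the description above, but treat the details as illustrative rather than the exact file from the video.

```python
# app.py -- minimal sketch of the local demo / Space app described above
from transformers import pipeline
import gradio as gr

# Load the image-to-text pipeline once, at import time, so it is not
# re-instantiated on every call to launch.
pipe = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

def launch(input_image):
    # Call the globally defined pipeline and return the generated caption.
    out = pipe(input_image)
    return out[0]["generated_text"]

# Use PIL images so the pipeline can consume the input directly.
iface = gr.Interface(fn=launch, inputs=gr.Image(type="pil"), outputs="text")

iface.launch(share=True)  # drop share=True when running on a Space
```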
All right, so it's looking pretty good. We can quickly try out the demo locally by uploading an image and checking that the generated caption makes sense. But what if I want to deploy a larger model, or what if I don't want to run that model on my local computer because I want to use my computer for something else? This is possible using Spaces, and that's what we're going to see in this lab.

So once we have confirmed that the app works locally, we're going to export that app directly into the Hugging Face Space we just created. We're simply going to copy all these blocks and paste them into a newly created app.py file on the Space. So let's do it right now: we create a new file, app.py, and copy-paste everything. Note that alternatively you can also git clone the Space locally and do everything through git; here we're doing everything through the web UI for simplicity. One thing we change is that we remove share=True, and then we just have to wait for the app to build.

Okay, so if everything went well, you'll end up with something like this: if the app is successfully running, you should see it on your newly created Space. We can also test it out right now to see if it works. All right, so we got the same result as the test we did locally.

Now, how can I use that Space as an external API? If you look at this window, there is a "Use via API" link here that you can click, and we just have to follow the instructions there. First, pip install gradio_client; that's something we already did in our lab. Then we just need to retrieve this link, which is a temporary link that, I think, gets changed every 24 hours. You just have to instantiate the Client object with the correct link, call client.predict with a local path to the image, a URL, or a PIL image object, and make sure to pass api_name='/predict'. Then you can directly print the results. So let's copy this snippet and try it out in our lab.

So let's try that out. Perfect. If we inspect the input image, indeed it's a red bus with some blue stripes on the side. If you want to further inspect what's in your API, you can also call the view_api method directly on the client to get more information: it tells you how to call the API through api_name='/predict', the expected parameters, and also the expected return type.

Perfect. So we were able to call our model as an API outside our local machine; everything is done in the cloud. Feel free to try that Space out, build a new Space for your own use case, or try out new models, and so on.

Before moving forward, there is one more feature that I wanted to show you, just to complete my explanation about the client API. You can also make the Space private, in case you want to host a private model. You just have to make sure that the Space is set to private in the settings here. By making the Space private, you then just have to pass your Hugging Face token when you instantiate the client. In terms of code, it would look something like this: you instantiate your Client with the argument hf_token set to your token, and that way you'll have access to the private Space as an API.
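As a reference, here is a minimal sketch of what that client-side code might look like. The Space URL below is a placeholder for your own Space, and the commented hf_token line only applies if the Space is private; depending on your gradio_client version, file inputs may need to be wrapped with handle_file.

```python
# Minimal sketch of calling the deployed Space as an API with gradio_client.
# The Space URL below is a placeholder; copy the real one from "Use via API".
from gradio_client import Client

client = Client("https://your-username-blip-image-captioning-api.hf.space/")

# For a private Space, pass your Hugging Face token as well:
# client = Client("your-username/blip-image-captioning-api", hf_token="hf_...")

# Pass a local path, a URL, or a PIL image, plus the endpoint name.
# (On recent gradio_client versions, wrap file paths with handle_file.)
result = client.predict("path/to/image.jpg", api_name="/predict")
print(result)

# Inspect the API: available endpoints, expected parameters, return types.
client.view_api()
```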
So how can we push this further? We've been deploying this demo so far on a CPU instance on Hugging Face Spaces. But what if I want to deploy a much larger model that cannot fit on a CPU instance, or what if the CPU instance is a bit slow and I want to deploy the model on a GPU instance? There is a feature you can use within the Hugging Face ecosystem called ZeroGPU Spaces, where you can basically spin up some free GPUs on demand for your Spaces.

So let's see how to do this. You first need to go to the Hugging Face organization called zero-gpu-explorers: just browse to zero-gpu-explorers on the Hugging Face Hub and request to join this organization in order to get access to this feature. Once your request has been accepted, you directly have access to the ZeroGPU feature: whenever you create a new Space, you'll see the ZeroGPU option appearing here, marked as free.

To demonstrate the ZeroGPU feature, I've created a Space on my personal account for the LLaVA model. LLaVA is an image-to-text model which is quite large: the LLaVA 1.5 7B checkpoint deployed on the demonstration Space weighs approximately 16 gigabytes and needs quite a lot of GPU RAM to run. The Space lives here, and as you can see, it says it is running on a Zero device. So let's use that as an API: we just have to click "Use via API" here. Okay, so we can copy this snippet and try it out on our local machine.

Instead of using the default prompt and image, we're going to use a custom image that we have prepared for you. Remember this image of the instructors having dinner all together in Palo Alto? We're going to use this image and ask the model if it can tell what we are all holding on our forks. So let's try that out right now. We're going to prompt the model in such a way that we explicitly ask what we are holding on our forks. We're just going to say: "These people are having dinner in a Mediterranean restaurant", to hint a bit that we're having falafels, "can you guess what they are all holding in their forks?" And recall that you can either pass an image URL or the path to your image; here we're going to pass the path to the image and wait for the results.

Perfect, so let's see what the model has predicted. It says that we're all eating meatballs. At least the model predicted that it's something round, but unfortunately it was not able to detect that it was falafel, despite the hint we gave. I guess that's a bit challenging, maybe even for humans, since it's round and a bit brown. Still, I think the description is quite accurate, and it's quite funny how the model behaved with respect to the prompt and the image.
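If you want to reproduce a call like this one, a rough sketch follows. The Space ID, endpoint name, and argument order here are placeholders, not the exact demonstration Space, so copy the exact snippet from the "Use via API" page of the Space you are calling.

```python
# Sketch of calling a LLaVA ZeroGPU Space with gradio_client.
# The Space ID, endpoint name, and argument order are placeholders --
# take the exact values from the Space's "Use via API" page.
from gradio_client import Client

client = Client("your-username/llava-1.5-7b-demo")  # placeholder Space ID

prompt = (
    "These people are having dinner in a Mediterranean restaurant. "
    "Can you guess what they are all holding in their forks?"
)

result = client.predict(
    "path/to/dinner_in_palo_alto.jpg",  # local path or URL to the image
    prompt,
    api_name="/predict",
)
print(result)
```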
All right, so to wrap up this lesson, I invite you to explore Hugging Face Spaces in order to deploy your custom demos. You can also browse the Spaces web page to check out the Spaces of the week, to get some inspiration from the cool applications and demos that you can build and easily share. Have a look at all the things that we have covered during this whole course, come up with some cool ideas that you can publish easily on Hugging Face Spaces, and share them with your friends and colleagues.

All right, so there is just one more video, where we'll say thank you and wrap up the course. So let's go on to the next lesson.