Congratulations. You've made it. You've learned how to isolate any part of an image using SAM. You've also prompted OWL-ViT in order to segment objects in an image using only text. You combined object detection, image segmentation, and guided diffusion to programmatically create entirely new images. You also fine-tuned Stable Diffusion with the DreamBooth method so that it can generate images of a custom object. If you have access to GPUs, such as through Google Colab, you can try iterating on your own prompts and fine-tuning the models further. Perhaps you can extend these inpainting pipelines to video editing. Now that there are more foundation models, you'll have more opportunities to apply prompt engineering to your computer vision applications. We look forward to seeing what you'll build next.