Knowledge Graphs for RAG - DeepLearning.AI

In this lesson, you'll use the Cypher query language to interact with a fun knowledge graph that contains data about actors and movies. Let's dive in.

Okay. To get this notebook started, we're going to import some packages that we need for getting everything set up to access Neo4j and for getting the environment. Those are classic packages. Also, we're going to be loading from LangChain the Neo4jGraph class itself, which is how we're going to be accessing Neo4j. And of course, we're loading from .env and then setting up some variables here. This very first one, the Neo4j URI, is basically the connection string: where Neo4j is located, at what port, those typical kinds of things. We of course need a username and a password. And so, let's run that as well. Okay, now that we have our environment set up, the notebook is ready to start sending queries.

On the left here, we've got some Person nodes. We know that they acted in some movies. The actor becomes an actor because they've acted in something. If you read that out as a sentence, of course, it's that a person acted in a movie. That's the fundamental pattern that we'll be looking for when we're dealing with this dataset.

For both of those nodes, the persons and the movies, we know that they have properties. The properties that are available for the persons are that they all have names. They also have a born value, which is just the year they were born. The movies have a title and a tagline, both as strings, and they also have a release date, also just an integer for the year the movie was released.

And finally, in the same way that I mentioned that a person acted in a movie, there are more relationships that a person might have with a movie. We know that they've acted in a movie. A person might also have directed that movie. Sometimes, for some movies, we know that a person both acted in and directed the movie. Also, they might have written a movie.
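As a rough sketch of the setup described at the start of this lesson, the snippet below collects the connection settings and hands them to LangChain's Neo4jGraph class. The environment variable names, the default values, and the helper names `load_neo4j_settings` and `connect` are assumptions made for illustration; they are not taken from the lesson. It also assumes the `.env` file has already been loaded into the environment (for example with python-dotenv), and that Neo4jGraph is importable from `langchain_community.graphs`, which may differ in newer LangChain releases.

```python
import os


def load_neo4j_settings(env=None):
    """Collect Neo4j connection settings from environment-style variables.

    The variable names and defaults here are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "url": env.get("NEO4J_URI", "bolt://localhost:7687"),  # where Neo4j lives, and on what port
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", ""),
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }


def connect(settings):
    """Create the LangChain graph wrapper used to send Cypher to Neo4j."""
    # Imported lazily so the settings helper works without LangChain installed.
    from langchain_community.graphs import Neo4jGraph

    return Neo4jGraph(**settings)
```

With a running database, `kg = connect(load_neo4j_settings())` would give an object whose `kg.query(...)` method sends Cypher to Neo4j.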
So, that's all the different relationships between the persons and the movies within this dataset. But then, the persons themselves have a relationship with other persons. Here, the idea is that if somebody has reviewed a movie, somebody else might be a follower of that reviewer. And so, what those persons' roles are, the relationships around them to the movies and to each other, is what really determines the type of person that they are, or the job that they have, or their behaviors within the dataset. It's worth observing, of course, that these are potential relationships. The actual relationships for particular people on particular movies end up being dynamic, based on the data itself.

So now, you have some idea about what the graph looks like. And the querying that we're gonna be doing is based on Cypher. Cypher is Neo4j's query language. It uses pattern matching to find things inside of the graph. And it looks like this. In the result of that query, each of the rows is a dictionary, where the dictionary keys are based on what you've returned from the RETURN clause up here in the Cypher. So here, we've been returning count(n). So we got a key for count(n), and the value of that is 171. That's a little bit friendly, right? Cool. Now that we know that we're getting that back, I can run this as well.

Okay, so with that Cypher MATCH that we just did, we looked for a pattern of all the nodes within the graph and returned a count of them. What if we don't want to find all the nodes in the graph, but just the movies or just the persons? The movies and the persons show up as labels on nodes. So, we'll start with the same Cypher query we had earlier. And the small change we have to make is, instead of this n being all by itself, we're gonna add a colon. And then, we're gonna say Movie. We'll just say it's the number of movies. Running that then of course shows us that the number of movies is 38 within this dataset.
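The counting queries described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the helper names `count_query` and `first_value` and the alias `numberOfMovies` are assumptions, and the queries are built as plain strings that would be passed to something like `kg.query(...)`.

```python
def count_query(label=None, alias=None, var="n"):
    """Build a Cypher query that counts all nodes, or only nodes with `label`."""
    for name in (label, alias, var):
        # Labels and aliases cannot be sent as query parameters, so they are
        # spliced into the text; only plain identifiers are accepted here.
        if name is not None and not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    pattern = f"({var}:{label})" if label else f"({var})"
    returned = f"count({var})"
    if alias:
        returned += f" AS {alias}"
    return f"MATCH {pattern} RETURN {returned}"


def first_value(rows, key):
    """Rows come back as a list of dictionaries keyed by the RETURN items."""
    return rows[0][key]
```

For example, `count_query()` produces `MATCH (n) RETURN count(n)`, and `count_query("Movie", "numberOfMovies", var="m")` produces `MATCH (m:Movie) RETURN count(m) AS numberOfMovies`. Given the result shape reported in the lesson, `first_value([{"count(n)": 171}], "count(n)")` picks out the count.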
Pretty good, just enough for us to play with. That's great. We got the result we expected, and that's really nice. There's one more small change we can make here to improve the readability of the query. We've been using this variable n to capture all of the patterns that match these nodes. But we know that we're grabbing movies, and movies begin with the letter m instead of n. Let's go ahead and change that. We can use m colon Movie, and then return a count of those m's. Running this again gets us the same result. All we've done is change the name of the variable to help the readability of the query. We're again gonna do a count of those nodes, and we'll rename the result to number of people. And let's run that. Cool, perfect.

So now, for any set of nodes within the graph, when you're exploring the graph, you can of course add more match criteria. For instance, looking for a very specific value. Value-based criteria are introduced with object notation inside of curly braces. So for example, if we want to find not just all people, but a very specific person, we'll just make a copy of this and modify it. And we don't need a count of Tom Hanks. Let's just go ahead and return Tom himself. We found Tom Hanks, we can see when he was born, and that's all the properties we have on this particular node.

You can of course do the same thing if you happen to know that there's a movie that you're looking for. Let's copy all of this over and we'll change it. Instead of looking for a person, ... "Everything is connected." Why, it's so close to my heart, of course. The title is Cloud Atlas, and it was released way back in 2012.

For these last couple of queries, we've been looking for specific nodes, based on some exact match criteria. If we didn't want to return the entire node of Cloud Atlas and just, let's say, the release date, we can return just that. Let's make a copy of this query and just modify the return clause.
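The exact-match lookups described above might look like the following sketch. The variable names `tom` and `cloudAtlas`, the property name `released`, and the helper `find_by_property` are assumptions for illustration. The helper is a swapped-in variant that passes the value as a query parameter instead of writing it into the curly braces; `kg` stands for any object with a `query(cypher, params=...)` method, such as LangChain's Neo4jGraph.

```python
# Literal form, as described in the lesson: the criteria sit inside curly braces.
TOM_HANKS = 'MATCH (tom:Person {name: "Tom Hanks"}) RETURN tom'
CLOUD_ATLAS = 'MATCH (cloudAtlas:Movie {title: "Cloud Atlas"}) RETURN cloudAtlas'

# Returning a single property instead of the whole node.
CLOUD_ATLAS_RELEASED = (
    'MATCH (cloudAtlas:Movie {title: "Cloud Atlas"}) RETURN cloudAtlas.released'
)


def find_by_property(kg, label, key, value):
    """Parameterized variant: the value travels as $value, not as query text."""
    if not (label.isidentifier() and key.isidentifier()):
        raise ValueError("label and key must be plain identifiers")
    cypher = f"MATCH (node:{label} {{{key}: $value}}) RETURN node"
    return kg.query(cypher, params={"value": value})
```

Calling `find_by_property(kg, "Person", "name", "Tom Hanks")` sends `MATCH (node:Person {name: $value}) RETURN node` together with the parameter, which avoids quoting problems when the value comes from user input.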
So here, cloudAtlas will be all of the nodes that have this title, Cloud Atlas. We know that they have a title and they also have a release date. And now, we get a dictionary back in this list that has just the release date.

For the patterns we've been describing so far, ... Let's say you want to do this for these movies. This is kind of a classic set of movies we've got here. Let's find all the movies that are from back in the 90s. In total, the Cypher query looks like this. The first part is exactly what we've done before. We have a MATCH clause here with just a single node pattern. And now, we're going to use a WHERE clause to ...

We'll do that with a slightly bigger pattern. The Cypher query itself starts off very similar to what we had before. We've got a MATCH clause. It's gonna match some actors, who we know have the label Person. And now, here's the fun part, the relationships that we're introducing. For the patterns that are matched inside of the database that have those actors and those movies, we're just going to return the actor name and the movie title, and we'll limit the results to just 10. Cool. So now, we have a bunch of actors' names and movies they've been in. And you probably recognize some of these names, but maybe not this first name. I'm not sure that's quite right. Let's come back to that later.

If there's a particular actor we care about, and we want to find out what movies they've been in, we can do that as well, using some of the conditional matching that we looked at before. We're going to say there's a person whose name is Tom Hanks, and that he acted in some Tom Hanks movies. Let's return that person's name, tom.name, and also the tomHanksMovies.title. Fantastic. Now, we see that we've got Tom Hanks and all of his movies.

While we're dealing with Tom Hanks, it might be interesting to extend this pattern a little bit further. The pattern that we have right now is from a person who acted in a movie. We can think about, well, who else acted in that movie?
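The range filter and the relationship patterns walked through above can be sketched as the Cypher strings below. The relationship type `ACTED_IN`, the property name `released`, and the variable names are assumptions about the dataset and are not spelled out in the transcript; `titles_by_actor` is a hypothetical helper showing how the returned rows could be regrouped in Python.

```python
# Movies from the 90s: a single-node pattern narrowed by a WHERE clause.
NINETIES_MOVIES = """
MATCH (nineties:Movie)
WHERE nineties.released >= 1990 AND nineties.released < 2000
RETURN nineties.title
"""

# A bigger pattern: a person, an outgoing relationship, and a movie.
ACTORS_AND_MOVIES = """
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)
RETURN actor.name, movie.title LIMIT 10
"""

# The same pattern, anchored on one particular person.
TOM_HANKS_MOVIES = """
MATCH (tom:Person {name: "Tom Hanks"})-[:ACTED_IN]->(tomHanksMovies:Movie)
RETURN tom.name, tomHanksMovies.title
"""


def titles_by_actor(rows):
    """Group rows keyed by 'actor.name' and 'movie.title' into {actor: [titles]}."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["actor.name"], []).append(row["movie.title"])
    return grouped
```

The row keys mirror the items in the RETURN clause, so the rows from `ACTORS_AND_MOVIES` carry the keys `actor.name` and `movie.title`.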
Tom acted in some movies; we're just gonna call them m. And here, we're not even gonna use the label for the movies. We know that a person who acted in something is gonna end up in some movies. And then, coming in from the other direction, ... And as before, kg.query to run this bit of Cypher. Cool. So, here's all the people that have acted with Tom Hanks in various movies. Maybe not quite Kevin Bacon long, but quite long.

You may recall that earlier we noticed an actor named Emil Eifrem in The Matrix. Emil is not an actor in The Matrix. He's actually the founder of Neo4j. We can leave him in the knowledge graph because he is a person. Let's return Emil's name and the titles of those movies. Okay, so he only has one claim to being in The Matrix. You can take the query we just had, because the matching is perfect, but instead of returning some data, we're gonna use a new clause called DELETE. And DELETE will delete whatever we want from the database. Cool. We didn't return any results from this query, so the result set itself is actually empty. Let's verify that Emil is now gone and is no longer an actor in the movie. Exactly what we want to see.

Let's take a look at creating data. Creating data is very similar to doing matching on single nodes. Let's take a look at, for instance, creating a new person. I'll create just myself. If Emil is going to be in this knowledge graph, I can be there too. Instead of a MATCH clause, we're now going to have a CREATE clause. We're going to create an Andreas. We're going to give it the label Person. And we're going to give Andreas a name property where the value is Andreas. And then, we'll return the node we just created. And just like that, I'm part of the knowledge graph too.

We can take this one more step. Adding relationships is a little bit different than adding nodes, because relationships connect two nodes.
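The co-actor, delete, and create steps described above can be sketched as follows. The relationship type `ACTED_IN` and the variable names are assumptions about the dataset, and `remove_and_verify` is a hypothetical helper; `kg` stands for any object with a `query(cypher)` method, such as LangChain's Neo4jGraph.

```python
# Two relationships meeting at the same (unlabeled) movie node, one from each side.
CO_ACTORS = """
MATCH (tom:Person {name: "Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors)
RETURN coActors.name, m.title
"""

# Find the relationship first, to see what would be affected.
EMIL_MOVIES = """
MATCH (emil:Person {name: "Emil Eifrem"})-[actedIn:ACTED_IN]->(movie:Movie)
RETURN emil.name, movie.title
"""

# Same match, but delete the relationship bound to actedIn; the nodes stay.
DELETE_EMIL_ACTED_IN = """
MATCH (emil:Person {name: "Emil Eifrem"})-[actedIn:ACTED_IN]->(movie:Movie)
DELETE actedIn
"""

# Creating a node looks like a single-node match with CREATE instead of MATCH.
CREATE_ANDREAS = """
CREATE (andreas:Person {name: "Andreas"})
RETURN andreas
"""


def remove_and_verify(kg):
    """Delete the relationship, then confirm the lookup now returns no rows."""
    kg.query(DELETE_EMIL_ACTED_IN)
    return kg.query(EMIL_MOVIES) == []
```

Binding the relationship to a variable (`actedIn`) is what lets DELETE target only that relationship and leave the Person node in place.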
The first step is to have a Cypher query where you find the nodes you want to create a relationship between. This pattern is a little bit different than what we had before. Let's take a look at it more closely. We're going to match an Andreas with the label Person and the name Andreas. And then, we're also going to find an Emil, who is also labeled Person, where the name is Emil Eifrem. Having found those two nodes, we'll then merge a new relationship in. MERGE is very similar to CREATE, except that if the relationship already exists, it won't be created again. Let's return that Andreas, the relationship, and Emil.

Great. So in this lesson, you learned the fundamentals of creating a knowledge graph using the Cypher language. You're working towards building a RAG application, and this requires text embeddings of some sort. In the next lesson, you'll see how to transform any text fields in your graph into vector embeddings and add those to the graph to enable vector similarity search. Let's move on to the next video to get started.
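As a recap of the relationship-creation step in this lesson, here is a minimal sketch. The relationship type `WORKS_WITH` and the variable name `hasRelationship` are hypothetical, since the transcript does not name them. The small Python function is only a toy model of the MERGE behavior described in the lesson (create if absent, otherwise reuse), not how Neo4j implements it.

```python
# Find both nodes first, then MERGE a relationship between them.
MERGE_RELATIONSHIP = """
MATCH (andreas:Person {name: "Andreas"}), (emil:Person {name: "Emil Eifrem"})
MERGE (andreas)-[hasRelationship:WORKS_WITH]->(emil)
RETURN andreas, hasRelationship, emil
"""


def merge_relationship(relationships, start, rel_type, end):
    """Toy model of MERGE: add the triple only if it is not already present."""
    triple = (start, rel_type, end)
    if triple not in relationships:
        relationships.append(triple)
    return triple
```

Running the toy function twice with the same arguments leaves a single entry in the list, which mirrors why re-running the MERGE query does not create a duplicate relationship, whereas CREATE would.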

