Hello. In the previous lecture we discussed the importance of structuring raw data in order to better represent the information on the domain of interest. In this lecture we will see some specific tools to implement such a process. Recall that we have data, namely raw facts, on one side of this pyramid, this arrow. On the other side, we have the knowledge on a specific domain. What we want to do is use this knowledge in order to structure the data, namely to provide information regarding the specific domain.
Okay, today we will focus on the data model, namely on the description of the data, together with the properties and relationships that allow the implementation of such a process. As you can imagine, there are a number of possibilities to implement such a process. We will focus on a specific data model, the so-called RDF, Resource Description Framework. So, RDF is a data model for the description of resources on the web. Recall that our cultural objects will be digitized in order to be available over the network. And so RDF is a good candidate to describe such digital objects.
What is the main idea of RDF? The main idea is that the model is basically made of statements about resources, in the form of subject, predicate, and object. Okay? So you have these triples. As you can see, a triple can be conveniently represented by a graph. Here the subject, the red circle in this particular case, is the book we also used in the previous lecture. So this book has type book: the relationship, that is the predicate, is type, and book is the object. Okay. This book has title Ebla: An Empire Rediscovered. Okay, those are the kinds of statements that you can represent in RDF.
An important concept here is the concept of Uniform Resource Identifier, URI. We need a unique identifier in order to identify the resources on the web.
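Just to make the idea of a triple concrete, here is a minimal sketch in Python of the two statements about the book. It is only an illustration: the identifier `http://example.org/book/ebla` is a made-up placeholder, the predicates are written as plain labels, and the helper name `describe` is hypothetical.

```python
# A minimal sketch: RDF statements as (subject, predicate, object) tuples.
# BOOK is a placeholder identifier invented for this example,
# not a real URI for the book.
BOOK = "http://example.org/book/ebla"

triples = [
    (BOOK, "type", "Book"),
    (BOOK, "title", "Ebla: An Empire Rediscovered"),
]


def describe(subject, statements):
    """Return every (predicate, object) pair stated about one subject."""
    return [(p, o) for s, p, o in statements if s == subject]


book_info = describe(BOOK, triples)
for predicate, obj in book_info:
    print(predicate, "->", obj)
```

Each tuple corresponds to one arrow of the graph: the subject is the node the arrow leaves, the predicate is the label on the arrow, and the object is the node it reaches.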
So, this unique identifier can be abbreviated by a prefix. Let's have a look here at this DC type. What is DC type? DC is the prefix for Dublin Core. So, if you click here, you go to this page and you see that Dublin Core is a vocabulary of properties for use in resource description. And in particular, let's have a look at what the type is. The type, here. Okay: the nature of a resource. Okay? This is a book, so the nature is a book, and here you see again the title, Ebla: An Empire Rediscovered. So if you want to have a look at the possible prefixes available over the network, you can go to this site. You can basically write here one possible prefix. Well, let's say DC, as before. Look up. And you see you have a number of addresses that describe this particular resource, okay? So, in particular, here is again the Dublin Core Initiative.
Okay. Let's see another, more complete example, taken from the Europeana Data Model Primer. We will go into the details of this example, but even if you are just beginning with RDF, it's pretty intuitive, right? You have two distinct subjects. One is the Mona Lisa, the famous painting by Leonardo da Vinci. The other one is Leonardo. Those two concepts are connected by a relationship: indeed, the creator of the Mona Lisa is Leonardo, as you see from the arrow. And then, for Leonardo, as an example, you have the date of birth and the date of death. You have the fact that the Mona Lisa has a title, Portrait of Mona Lisa. The current location is Paris. So, even if you do not know the details of RDF, having a look at this graph should give you some insights about the knowledge. Okay? So, this is the idea: providing you insights about the knowledge on a specific domain.
Okay, but the graph we saw before is understandable by humans, and we need something that makes it also understandable by machines. So, this process of encoding RDF in a format that is machine readable is called serialization. Okay?
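As a rough illustration of prefixes and of serialization, the sketch below writes part of the Mona Lisa graph as triples with prefixed names, expands each prefix into a full URI, and prints one line per triple in the style of the N-Triples format. Some assumptions: `dc` is mapped to the Dublin Core element set namespace, while the `ex` prefix, the resource names, and the property `ex:currentLocation` are invented for this example and are not taken from the primer. The helpers `expand`, `lit`, and `to_line` are hypothetical names.

```python
# Prefix table. "dc" is the Dublin Core element set namespace.
# "ex" is a hypothetical namespace invented for this sketch.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/",
}


def expand(name):
    """Turn a prefixed name such as 'dc:title' into a full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


def lit(text):
    """Mark an object as a literal value rather than a resource."""
    return {"literal": text}


# Part of the Mona Lisa graph: one link between resources, two literals.
triples = [
    ("ex:MonaLisa", "dc:creator", "ex:Leonardo"),
    ("ex:MonaLisa", "dc:title", lit("Portrait of Mona Lisa")),
    ("ex:MonaLisa", "ex:currentLocation", lit("Paris")),
]


def to_line(s, p, o):
    """Serialize one triple as a single text line with full URIs."""
    obj = f'"{o["literal"]}"' if isinstance(o, dict) else f"<{expand(o)}>"
    return f"<{expand(s)}> <{expand(p)}> {obj} ."


lines = [to_line(*t) for t in triples]
print("\n".join(lines))
```

The graph and the printed lines carry the same statements; only the encoding changes, which is the point of serialization.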
There are a number of different serialization formats. As an example, just to give you a rough idea, you have the Turtle family of RDF languages, JSON-LD, RDFa, and RDF/XML. Okay. So, a number of possibilities. However, irrespective of the way you decide to serialize your RDF graph, the triples, namely the knowledge, are exactly the same. And thus all those representations are logically equivalent.
Okay, for the sake of simplicity, and also because it's connected with the content of the next lecture, we will focus on XML. What is XML? XML stands for Extensible Markup Language. So, it is a language that defines a set of rules for encoding documents in a format that is both human and machine readable. Okay, but what is markup? Markup is a way to annotate a document so that you can distinguish data from the description of data. Okay? So, markup is a convenient tool to distinguish between the data and the metadata, and this is an example we already saw, okay? Paolo Matthiae is the data. But we know that Paolo Matthiae is the author of the book titled Ebla: An Empire Rediscovered. And so, to distinguish Paolo Matthiae, the data, from the fact that Paolo Matthiae is the author, we need this syntax. Okay? So, syntactically, we can distinguish between data and metadata.
In your daily life you have other examples of markup languages. Just to clarify this, let's click on this URL. Okay, this is the web page of Sapienza University of Rome. If you click here, you see the HTML. This is the markup language used to describe the content of this web page, okay? So basically, that markup language allows you to represent the content of this web page. Good.
Okay. Just to clarify everything, let's say that, informally, our knowledge is: Bob is a person and is a friend of Alice. Okay. And we want to represent this knowledge in RDF through this graph.
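One possible way to write the Bob and Alice statements in RDF/XML can be sketched with Python's standard `xml.etree.ElementTree` module. This is a sketch under assumptions, not the file shown in the lecture: it uses the FOAF vocabulary (`foaf:Person`, `foaf:knows`) to express "is a person" and "is a friend of", and the two `http://example.org/` identifiers are made up.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"  # assumption: FOAF as the vocabulary

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("foaf", FOAF)

# Placeholder identifiers invented for this example.
BOB = "http://example.org/bob"
ALICE = "http://example.org/alice"

# <rdf:RDF> is the root element of an RDF/XML document.
root = ET.Element(f"{{{RDF}}}RDF")

# Statement 1: Bob is a person (a typed node identified by rdf:about).
person = ET.SubElement(root, f"{{{FOAF}}}Person", {f"{{{RDF}}}about": BOB})

# Statement 2: Bob knows Alice (a property pointing at another resource).
ET.SubElement(person, f"{{{FOAF}}}knows", {f"{{{RDF}}}resource": ALICE})

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In the output, the element and attribute names (`foaf:Person`, `rdf:about`, `foaf:knows`) are the markup, that is the metadata, while the two identifiers are the data being described.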
Now, what we can do is represent that graph in an XML version, so that the content, namely the XML file, can be understood both by humans and by machines. And why is it so important to be understandable by machines? Because all the services that we have in mind, like indexing, searching and so on and so forth, are implemented on machines, and thus we need a representation that machines can process. Good.
Okay. The five-star open data initiative is an initiative that tries to leverage such concepts in order to make data, and knowledge about those data, openly available to everybody. Okay. Why five stars? Because, depending on the stage of the process you are in, you get stars. Let's have a look. To get one star, what do you have to do? Basically, you have to make your data available on the web under an open license, namely everybody can access your data. For two stars, it is not sufficient to allow everybody to access your data; you also need to structure the data, right? We have seen before that this is extremely important. Two stars if you structure your data. Three stars if you use a non-proprietary format to structure your data. Four stars if you use URIs, namely Uniform Resource Identifiers, to denote things, so that people from outside can link to your stuff, okay? Can point to your stuff. Namely, they can use the principle we saw before in RDF. Finally, for five stars, you also link your data to other providers' data, in order to improve the quality of the representation of your data. Those are the five stars.
Okay, this is a picture that shows you how knowledge on completely different domains, such as, as an example, the life sciences, the pink one, and, I don't know, geography, the yellow one, or government, the green one, are connected. This is a pretty interesting picture that allows you to have an understanding of the richness that you can achieve when you connect the data, when you link the data, in an open format.
Okay, concluding.
We have now seen two specific tools, namely RDF and XML, that allow us to effectively represent a data model. In the next slides, we will focus on the Europeana Data Model, namely the data model used to describe cultural objects.