RDF elementary guide part 1: Class and property definition in RDFS

The Resource Description Framework (RDF) is an open standard from W3C for describing digital resources with semantic meaning. Data described in RDF can be exchanged and reused between businesses, industries and countries while retaining a shared conceptual understanding of the resources. This is the first article in a series of guides on getting started with describing data with semantic meaning. To get the most out of the series, the reader is recommended to read the W3C guide RDF 1.1 Primer, which gives a good introduction to what the RDF framework consists of. This first article describes how resources are defined in RDFS. The next article presents a semantic model and explains how resources from Wikidata and other sources can be linked to enrich the description.

RDF-Schema (RDFS)

Resources are described by creating classes and properties, for example with RDF Schema (RDFS). RDFS is one of several languages that offer notation rules for describing resources with semantic meaning, and it is suited to creating simpler vocabularies and taxonomies. Common to all languages within RDF is that resources are described in the form of a statement consisting of a subject, a predicate and an object, which together are called a triple. All resources in RDF are identified using an Internationalized Resource Identifier (IRI), a generalisation of the URI (which is limited to ASCII) that allows Unicode characters, typically encoded as UTF-8. Subsequent guides present other languages for defining more advanced classification systems in the form of ontologies that can describe temporal and dynamic conditions for digital resources.

RDF Format

To define a statement in RDF, there are a number of different formats that serve different purposes. Turtle (.ttl) is a syntax that supports abbreviations and prefixes and is easier for people to read; it is used throughout the article series unless otherwise stated. Other formats in the Turtle family, such as N-Triples and N-Quads, are simpler line-based formats that are easier for machines to parse.
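
To make the difference between the formats concrete, the sketch below writes one statement twice: first in Turtle with prefixes, then in the N-Triples style with full IRIs. Every N-Triples line is also valid Turtle, so both forms can sit in the same file. The ex namespace is an assumption based on the example path used in this article.

```turtle
# Turtle: prefixes keep the statement short and readable.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://www.clearbyte.org/example/> .   # assumed example namespace

ex:Picasso rdf:type ex:Artist .

# N-Triples style: one complete triple per line, full IRIs, no prefixes.
<http://www.clearbyte.org/example/Picasso> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.clearbyte.org/example/Artist> .
```

Both forms describe the same triple, so a parser that loads this file ends up with a single statement in the graph.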

Classes and properties in RDFS

Classes (rdfs:Class) are used to classify resources. An instance of a class is declared using the predicate rdf:type. For example, we can define that Artist is a class and that Picasso is an instance of the Artist class. Note that ex, rdf and rdfs are IRI prefixes: abbreviations that save writing out the entire path to a resource (for example http://www.clearbyte.org/example/Artist).

ex:Artist rdf:type rdfs:Class . 
ex:Picasso rdf:type ex:Artist .
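
A Turtle parser only accepts prefixed names whose prefixes have been declared, so a complete file would start with the declarations. The following is a minimal sketch of such a file; the rdf and rdfs namespaces are the standard W3C ones, while the ex namespace is assumed from the example path mentioned above.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://www.clearbyte.org/example/> .   # assumed example namespace

# Artist is a class, and Picasso is an instance of that class.
ex:Artist  rdf:type rdfs:Class .
ex:Picasso rdf:type ex:Artist .
```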

Properties (rdf:Property) are used to add attributes to resources. Similar to how we define classes, we can define properties and then use them as predicates in statements. In the example from earlier we add the properties ex:name and ex:creatorOf. The name is given as a text string (literal), "Pablo Picasso", and ex:creatorOf points to another resource, ex:Guernica. Note that both the name and the painting are objects in the statements about Picasso: the name is a text string, while the painting is a resource that can itself be the subject of other statements. Those statements could be in the same model, or we could reuse the description of the painting Guernica in Wikidata (a sister project of Wikipedia) through an IRI prefix for the resource, wd:Q175036 (https://www.wikidata.org/wiki/Q175036). Turtle offers the abbreviation a for rdf:type, which makes the syntax shorter and easier to read.

ex:name a rdf:Property .
ex:creatorOf a rdf:Property .

ex:Picasso a ex:Artist ;
    ex:name "Pablo Picasso" ;
    ex:creatorOf ex:Guernica .
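
If the painting is taken from Wikidata instead of the local model, the statement could look like the sketch below. It assumes the wd prefix is bound to Wikidata's entity namespace (http://www.wikidata.org/entity/) rather than to the wiki page address, and that the ex namespace is the example path used earlier.

```turtle
@prefix ex: <http://www.clearbyte.org/example/> .   # assumed example namespace
@prefix wd: <http://www.wikidata.org/entity/> .      # Wikidata entity namespace

ex:Picasso a ex:Artist ;
    ex:name "Pablo Picasso" ;
    ex:creatorOf wd:Q175036 .   # Guernica as identified in Wikidata
```

The subject and the predicate stay in the local vocabulary; only the object now points to a resource that is described elsewhere.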

Predicates connect the subject and the object in a statement, which forms the basis of a graph. Subjects and objects can be seen as nodes and the predicate as a meaningful link that describes the relationship between the nodes.

Note that, by convention, property names start with a lowercase letter and class names with an uppercase letter.

Domain & Range

To semantically describe and derive relationships between subjects and objects, rdfs:domain and rdfs:range are used. The predicate rdfs:domain declares that a property belongs to one or more classes: any resource that is the subject of a statement using the property is an instance of those classes. For example, we can define that property P has the domain D.

P rdfs:domain D .

ex:hasMother rdfs:domain ex:Person .
ex:frank ex:hasMother ex:frances .

From this example it can be derived implicitly that ex:frank belongs to the ex:Person class, because ex:frank is the subject of ex:hasMother, whose domain is ex:Person.
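
The derivation can be spelled out as a small sketch that separates what is written down from what an RDFS-aware reasoner would add. The ex namespace is assumed, and the last triple is not something you write by hand; it is what the domain declaration entails.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://www.clearbyte.org/example/> .   # assumed example namespace

# Asserted statements.
ex:hasMother rdfs:domain ex:Person .
ex:frank ex:hasMother ex:frances .

# Entailed by the domain declaration: the subject is a Person.
ex:frank rdf:type ex:Person .
```

Nothing is derived about ex:frances here, because a domain declaration only concerns the subject of the statement.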

To deduce that the value (object) of a property is an instance of one or more classes, the predicate rdfs:range is used. For example, we can define that the values of P belong to the class R.

P rdfs:range R .

The difference between domain and range is that domain declares which class the subject of a property belongs to, while range declares which class the value (object) of the property belongs to. The following statements illustrate the difference. The example defines two classes, Book and Person, and the author property. The author property has the domain Book, so anything that has an author is a book. Its range is Person, so when a book is instantiated, the value of its author property is an instance of the class Person.

ex:Book a rdfs:Class .
ex:Person a rdfs:Class .
ex:author a rdf:Property .

ex:author rdfs:domain ex:Book .
ex:author rdfs:range ex:Person .
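
To see domain and range working together, the sketch below adds one instance statement to the definitions. The resources ex:someBook and ex:someAuthor are hypothetical names introduced only for illustration, and the ex namespace is assumed.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://www.clearbyte.org/example/> .   # assumed example namespace

ex:author a rdf:Property ;
    rdfs:domain ex:Book ;
    rdfs:range  ex:Person .

# Asserted: a single statement, with no explicit types on either resource.
ex:someBook ex:author ex:someAuthor .

# Entailed: the subject is typed by the domain, the object by the range.
ex:someBook   rdf:type ex:Book .
ex:someAuthor rdf:type ex:Person .
```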

The example below derives that the value of the ex:hasMother property belongs to both the ex:Female and the ex:Person class.

ex:hasMother rdfs:range ex:Female .
ex:hasMother rdfs:range ex:Person .

The following statement describes that Eva is the mother of Pete, and implicitly also that Eva is a woman and a person.

ex:Pete ex:hasMother ex:Eva .
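
As a self-contained sketch of how multiple range declarations combine, the example below declares two ranges for one property and lists the triples a reasoner would derive. Range always types the object of the statement, never the subject. The ex namespace is assumed, and the comma is Turtle shorthand for repeating the same subject and predicate with another object.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://www.clearbyte.org/example/> .   # assumed example namespace

# Asserted: two range declarations for the same property.
ex:hasMother rdfs:range ex:Female , ex:Person .
ex:Pete ex:hasMother ex:Eva .

# Entailed: the object (ex:Eva) is an instance of both classes.
ex:Eva rdf:type ex:Female , ex:Person .
```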

Definitions of properties using rdfs:range can also be used to describe the datatype of the value, such as integers, decimal numbers et cetera, when it is not a plain text string, which is the default.

ex:age rdf:type rdf:Property . 
ex:age rdfs:range xsd:integer .
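
A value that matches this range is written as a typed literal. The sketch below shows the explicit form and Turtle's shorthand for integers; ex:somePerson and ex:otherPerson are hypothetical resources, the ex namespace is assumed, and xsd is the standard XML Schema datatype namespace.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://www.clearbyte.org/example/> .   # assumed example namespace

ex:age rdf:type rdf:Property ;
    rdfs:range xsd:integer .

# Explicit typed literal.
ex:somePerson ex:age "42"^^xsd:integer .

# Turtle shorthand: a bare integer is read as xsd:integer.
ex:otherPerson ex:age 42 .
```

Writing "42" in quotes without a datatype would instead give a plain text string, which does not match the declared range.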

The ability to declare that a property belongs to a specific class (domain), and that its values belong to a specific class (range), makes it possible to draw conclusions (inference) through implicit links between resources. Deriving implicit connections between resources enables logical reasoning and is a powerful feature of the RDF framework. The next RDF elementary guide presents a semantic data model that describes classes and properties of artists and paintings, and links in related descriptions from Wikidata that also describe resources in RDF format.
