GraphSearch on RDF Dataset

Abstract

After configuring the data source as described in GraphSearch Configuration, a knowledge engineer or data architect can define the GraphSearch view of the RDF dataset and the facets used for search filtering.

Facet in GraphSearch

An RDF dataset is a collection of RDF resources, possibly connected by internal and external references, that together form a whole. Search facets are sets of rules and conditions that partition the data along multiple dimensions, allowing users of faceted search to narrow navigation down to a specific fragment. On RDF data, faceting can be implemented simply by categorizing resources by predicate and object values. In GraphSearch, a facet on RDF data comes in three flavors: by class, by relation, and by attribute.

Faceting by Classes

A resource can be an instance of one or more classes. Faceting by classes simply categorizes resources by their classes. For example, given the following RDF data represented in Turtle:

<urn:person1> a <http://schema.org/Person> .
<urn:org1> a <http://schema.org/Organization> .

You can build a facet with two facet members represented by the classes <http://schema.org/Person> and <http://schema.org/Organization>. Only resource <urn:person1> is then presented to the user when facet member <http://schema.org/Person> is selected.
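
The grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how GraphSearch implements it: the triples are held as plain tuples, and the function name facet_by_class is hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # what "a" abbreviates in Turtle

# The two example triples as (subject, predicate, object) tuples.
triples = [
    ("urn:person1", RDF_TYPE, "http://schema.org/Person"),
    ("urn:org1", RDF_TYPE, "http://schema.org/Organization"),
]

def facet_by_class(triples):
    """Hypothetical helper: map each class URI (facet member) to its instances."""
    members = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            members[obj].add(subject)
    return dict(members)

class_facet = facet_by_class(triples)

# Selecting the facet member schema:Person leaves only urn:person1.
print(class_facet["http://schema.org/Person"])  # {'urn:person1'}
```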

Faceting by Relation

A resource can link to another resource via a relation. Faceting by a relation applies to resources that have that relation and categorizes them by the resources they link to, that is, by the objects of the relation.

Example

Given the following RDF data represented in Turtle:

<urn:person1> <http://schema.org/memberOf> <urn:org1> .
<urn:person2> <http://schema.org/memberOf> <urn:org2> .

You can build a facet with two facet members represented by resources urn:org1 and urn:org2. Then only resource <urn:person1> will be presented to the user if facet member urn:org1 is selected.

Faceting by Attribute

A resource can have attribute values. Faceting by an attribute applies to resources that have that attribute and categorizes them by its values. Values can be textual, numeric, temporal, or boolean.

Example

Given the following RDF data represented in Turtle:

<urn:person1> <http://schema.org/sex> "Male" .
<urn:person2> <http://schema.org/sex> "Female" .

You can build a facet with two facet members represented by values "Male" and "Female". Then only resource <urn:person1> will be presented to the user if facet member "Male" is selected.
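
Faceting by relation and faceting by attribute follow the same pattern: pick one predicate and group the subjects by the object found in each matching triple. The Python sketch below illustrates this with the example data from both sections. The helper facet_by_predicate is a hypothetical name, and the code makes no claim about how GraphSearch does this internally.

```python
from collections import defaultdict

MEMBER_OF = "http://schema.org/memberOf"
SEX = "http://schema.org/sex"

# Example triples from the relation and attribute sections, as plain tuples.
triples = [
    ("urn:person1", MEMBER_OF, "urn:org1"),
    ("urn:person2", MEMBER_OF, "urn:org2"),
    ("urn:person1", SEX, "Male"),
    ("urn:person2", SEX, "Female"),
]

def facet_by_predicate(triples, predicate):
    """Hypothetical helper: map each object value (facet member) to its subjects."""
    members = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:  # resources without the predicate are not in the facet
            members[obj].add(subject)
    return dict(members)

relation_facet = facet_by_predicate(triples, MEMBER_OF)
attribute_facet = facet_by_predicate(triples, SEX)

print(relation_facet["urn:org1"])  # {'urn:person1'}
print(attribute_facet["Male"])     # {'urn:person1'}
```

The only difference between the two flavors in this sketch is the kind of object: a resource URI for a relation, a literal value for an attribute.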

Custom Scheme as Facet Definition

In general, an RDF dataset contains multiple classes, relations, and attributes. Facets are defined from them by creating custom schemes in PoolParty. Refer to Custom Scheme & Ontology Management for more information about custom schemes and ontologies, and follow that guide to create a custom ontology. Then follow the instructions in Create Custom Classes, Create Custom Relations, and Create Custom Attributes to model an ontology of the RDF dataset.

The classes, relations, and attributes you create become facets in GraphSearch automatically: all classes together form a single facet, while each relation and each attribute forms a facet of its own. Note that only the classes, relations, and attributes that are to be used as facets should be defined in the custom ontology. After creating the custom ontology, create a custom scheme from it, as explained in Create Custom Schemes from an Ontology.

Configure Facet Definition in GraphSearch

After navigating to the Admin GUI of GraphSearch at http://{SERVER_URL}/GraphSearch/admin, one can verify the connection to the remote database. If the connection is successful, the database will be marked as UP, as shown below:

Figure 1. System info

Once the connection is established, navigate to the Project tab to review the configuration options. To configure facets, click the Project Configuration button, as shown below:

Figure 2. Configuration

In the pop-up dialog, all custom schemes available in PoolParty are listed as candidates. Select the one to be used for facet definition:

Figure 3. Facet configuration

After loading the facet configuration, GraphSearch will display all configured facets in the SearchFields tab:

Figure 4. Facet info

The facets are now active.

Field Mapping Definition

Search results for RDF resources are presented as a list of documents, with title and description fields providing some information about each resource. In the "Mappings" tab (see Figure 5), one can specify the predicate URIs whose values are used as title and description.

For example, given the following RDF data represented in Turtle:

<urn:org1> a <http://schema.org/Organization> .
<urn:org1> <http://schema.org/name> "Semantic Web Company" .
<urn:org1> <http://schema.org/description> "Semantic Web Company is the leading provider of graph-based metadata, search, and analytic solutions." .

If you want to use <http://schema.org/name> as title and <http://schema.org/description> as description, those two predicate URIs must be added to the corresponding fields of the mapping list. For each field, multiple predicates can be provided as an ordered list, and the first valid value will be used.

For example, given the description field in Figure 5 below, the object value of skos:definition is displayed as the description in the search result. If it does not exist, rdfs:comment is used instead. When the dataset contains no object for any of the listed predicates, the field is left empty.

Note

At least one predicate URI has to be specified for each field.

Figure 5. Mapping info
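
The ordered fallback described for the mapping fields can be sketched as follows. This is an illustrative Python sketch only: the function resolve_field and the resources urn:concept1 and urn:concept2 are hypothetical, and "valid" is assumed here to mean a non-empty value, which the text above does not define.

```python
SKOS_DEFINITION = "http://www.w3.org/2004/02/skos/core#definition"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"

# Hypothetical data: concept1 has both predicates, concept2 only rdfs:comment.
triples = [
    ("urn:concept1", SKOS_DEFINITION, "A definition."),
    ("urn:concept1", RDFS_COMMENT, "A comment."),
    ("urn:concept2", RDFS_COMMENT, "Only a comment."),
]

def resolve_field(triples, resource, predicates):
    """Return the first non-empty value found by trying predicates in order."""
    for predicate in predicates:
        for subject, pred, obj in triples:
            if subject == resource and pred == predicate and obj:
                return obj
    return ""  # no predicate matched: the field stays empty

# Ordered list for the description field, as in the example above.
description_predicates = [SKOS_DEFINITION, RDFS_COMMENT]

print(resolve_field(triples, "urn:concept1", description_predicates))  # A definition.
print(resolve_field(triples, "urn:concept2", description_predicates))  # Only a comment.
```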

Last but not least, when a facet based on a relation exists, it is also possible to specify a predicate URI whose value represents the object resource in a more human-readable way. Otherwise, the facet members in the facet list are displayed as the URIs of the object resources. This setting shares the predicate configuration of the title field, so the predicates listed in the title field define two views at the same time: the titles of search results and the labels of facet members.

Example

Given the following RDF data represented in Turtle:

<urn:person1> <http://schema.org/memberOf> <urn:org1> .
<urn:org1> <http://schema.org/name> "Semantic Web Company" .

When relation <http://schema.org/memberOf> is a facet, urn:org1 will be displayed as a facet member by default.

However, if <http://schema.org/name> is added as a predicate for the title field, the value "Semantic Web Company" will be displayed instead of the URI urn:org1, which makes the facet list easier to read.
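
The labeling behavior in this example can be sketched in Python as below. The helper facet_member_label is a hypothetical name, and the second membership triple (urn:person2, urn:org2) is added only to show the fallback to the URI. The sketch illustrates the described behavior and is not GraphSearch's actual code.

```python
MEMBER_OF = "http://schema.org/memberOf"
NAME = "http://schema.org/name"

triples = [
    ("urn:person1", MEMBER_OF, "urn:org1"),
    ("urn:org1", NAME, "Semantic Web Company"),
    ("urn:person2", MEMBER_OF, "urn:org2"),  # illustrative: urn:org2 has no name
]

def facet_member_label(triples, member, title_predicates):
    """Return a readable label for a facet member, or its URI if none is found."""
    for predicate in title_predicates:
        for subject, pred, obj in triples:
            if subject == member and pred == predicate:
                return obj
    return member  # default: display the URI itself

title_predicates = [NAME]  # predicates configured for the title field

print(facet_member_label(triples, "urn:org1", title_predicates))  # Semantic Web Company
print(facet_member_label(triples, "urn:org2", title_predicates))  # urn:org2
```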