UnifiedViews Glossary

Accessible pipeline

A pipeline that a data manager may operate on: either the data manager is the author of the pipeline, or the data manager was granted permission to view or control it.

Configuration Skeleton

Defines the structure and other properties of a DPU's configuration form.

Configuration Template

A default DPU configuration associated with each DPU (previously called "default configuration"). When a DPU is placed on the pipeline canvas, the configuration template is copied, and the copy becomes the instance configuration.
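The copy step described above can be illustrated with a minimal Python sketch. It assumes a configuration can be modelled as a nested dictionary; the keys and the function name `create_instance_configuration` are hypothetical and are not part of UnifiedViews.

```python
import copy

# Hypothetical configuration template; the keys are invented for illustration.
configuration_template = {
    "endpoint_url": "http://example.org/sparql",
    "options": {"timeout_s": 30},
}


def create_instance_configuration(template):
    """Return an independent copy of the template for a newly placed DPU instance."""
    # A deep copy ensures that nested values are not shared with the template.
    return copy.deepcopy(template)


instance_configuration = create_instance_configuration(configuration_template)

# Changing the instance configuration leaves the template untouched.
instance_configuration["options"]["timeout_s"] = 60
```

A deep copy is used here so that later edits to one DPU instance cannot leak back into the template or into other instances created from it.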

Data Processing Unit (DPU)

A persisted entity comprising its properties, a .jar file, and a default configuration. Its primary function depends on its type (see section Types). Generally, it takes input data, processes it, and saves the output to a specified data structure. A DPU acts as a module on the pipeline.

DPU configuration

An associative array of key-value pairs that customizes the functionality of a DPU instance.

DPU subtype

A categorization of DPUs by their function that is more granular than the DPU type, e.g. RDF extractor, HTML extractor, linker.

DPU type

A categorization of DPUs by their function. There are three basic types: Extractor, Transformer, and Loader. Each basic type may have subtypes, e.g. RDF extractor, HTML extractor, linker.
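The relationship between types and subtypes can be sketched in Python as follows. The three type names come from the definition above; the class name `DpuType`, the mapping, and in particular the assignment of "linker" to Transformer are assumptions made for illustration only.

```python
from enum import Enum


class DpuType(Enum):
    """The three basic DPU types named in the glossary."""
    EXTRACTOR = "Extractor"
    TRANSFORMER = "Transformer"
    LOADER = "Loader"


# Hypothetical mapping from subtype to basic type.
SUBTYPE_TO_TYPE = {
    "RDF extractor": DpuType.EXTRACTOR,
    "HTML extractor": DpuType.EXTRACTOR,
    "linker": DpuType.TRANSFORMER,  # assumption: not stated in the glossary
}


def subtypes_of(dpu_type):
    """Return the known subtypes of a basic type, sorted by name."""
    return sorted(name for name, t in SUBTYPE_TO_TYPE.items() if t is dpu_type)
```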

DPU instance

A placement of a DPU on a pipeline. A DPU instance is therefore created when a DPU is placed on the pipeline canvas.

Global graph

A named graph in an external database, or a graph in the staging database, which may be accessed from different pipelines.

Instance Configuration

The configuration of a specific DPU instance.

Knowledge Base

An RDF database containing data that is considered clean and valid. Once a pipeline has processed data and no further processing is intended, the data is usually loaded into the knowledge base.

Local graph

A named graph that can be accessed only by DPUs within the same pipeline execution. It cannot be accessed by a DPU on any other pipeline, nor by a DPU on the same pipeline in a different pipeline execution. A DPU defines its input and output as sets of local graphs, so we distinguish input local graphs from output local graphs. Each local graph is supplemented with information about the DPU that created it and its template ID (e.g. a Linker DPU defines that output graph A contains links with a high probability of being correct, and output graph B contains links that need verification).
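The visibility rule above can be expressed as a small Python sketch. The class `LocalGraph`, its field names, and the function `can_access` are hypothetical; the sketch only assumes that each pipeline execution has a unique identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalGraph:
    name: str
    execution_id: int    # pipeline execution that owns the graph
    created_by_dpu: str  # DPU instance that created the graph
    template_id: str     # e.g. "A" or "B" in the linker example


def can_access(graph, execution_id):
    """A local graph is visible only within the execution that owns it."""
    return graph.execution_id == execution_id


links_to_verify = LocalGraph(
    name="links-to-verify",
    execution_id=7,
    created_by_dpu="linker-1",
    template_id="B",
)
```

With this model, a DPU running in execution 7 may read `links_to_verify`, while a DPU in any other execution, even of the same pipeline, may not.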

Objects accessible to user

Objects, such as accessible pipelines or accessible errors, on which a user can perform actions (view, change, delete, etc.).

Pipeline

A directed acyclic graph in which nodes represent DPU instances and directed edges represent data flow between these instances. The graph must not be empty: it must contain at least one DPU instance. Each DPU instance has an associated configuration. Furthermore, a pipeline has a few additional properties: name, description, permissions, owner, and optionally a parent pipeline. A pipeline is a persisted entity.
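The two structural constraints above (non-empty and acyclic) can be checked with a short Python sketch. The function name `is_valid_pipeline` and the representation of the graph as node identifiers plus (source, target) pairs are assumptions for illustration; the cycle check uses Kahn's topological sorting algorithm, not necessarily the method UnifiedViews itself applies.

```python
from collections import deque


def is_valid_pipeline(nodes, edges):
    """Check that a pipeline graph is non-empty and acyclic.

    nodes: iterable of DPU instance identifiers
    edges: iterable of (source, target) pairs describing data flow
    """
    nodes = set(nodes)
    if not nodes:
        return False  # a pipeline must contain at least one DPU instance

    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for source, target in edges:
        if source not in nodes or target not in nodes:
            return False  # an edge refers to an unknown DPU instance
        successors[source].append(target)
        indegree[target] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # If some nodes were never removed, they lie on a cycle.
    return visited == len(nodes)
```

For example, an extractor feeding a transformer feeding a loader is valid, whereas adding an edge from the loader back to the extractor makes the graph cyclic and therefore invalid.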

Pipeline canvas

An area on the screen used for designing the pipeline graph.

Pipeline execution, also: Pipeline run

An entity representing a single execution of a pipeline. It is created either by a user action (clicking "run" or "debug" on the pipeline) or by the scheduler. In the scheduler, the user may define how often runs are created.
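As a rough illustration of periodic scheduling, the following Python sketch computes when runs would be created for a fixed period. The function name `scheduled_run_times` and its parameters are hypothetical; the glossary does not specify how the scheduler represents periodicity.

```python
from datetime import datetime, timedelta


def scheduled_run_times(first_run, period, count):
    """Return the start times of the first `count` periodic runs."""
    return [first_run + i * period for i in range(count)]


# Hypothetical schedule: a run every six hours, starting at midnight.
runs = scheduled_run_times(datetime(2024, 1, 1, 0, 0), timedelta(hours=6), 3)
```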

Staging database

A database where the data is stored during the pipeline execution.