This section contains a glossary for special UnifiedViews terms.
a database where the data is stored during the pipeline execution.
Objects accessible to user
objects, such as accessible pipelines or accessible errors, on which the user can perform actions (view, change, delete, etc.).
a directed acyclic graph, where nodes represent DPU instances and directed edges represent data flow between these instances. The graph must not be empty: it must contain at least one DPU instance. Each DPU instance has an associated configuration. In addition, a pipeline has a few further properties: name, description, permissions, owner, and optionally a parent pipeline. A pipeline is a persisted entity.
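The two structural constraints on a pipeline graph (non-empty and acyclic) can be checked with a topological sort. The following is only a minimal sketch, not UnifiedViews code: the function name `is_valid_pipeline` and the representation of DPU instances as strings and edges as pairs are assumptions made for illustration.

```python
from collections import deque


def is_valid_pipeline(nodes, edges):
    """Return True if the graph is non-empty and acyclic (Kahn's algorithm).

    nodes: iterable of DPU instance identifiers (hypothetical representation)
    edges: iterable of (source, target) pairs describing data flow
    """
    nodes = set(nodes)
    if not nodes:
        # A pipeline must contain at least one DPU instance.
        return False

    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        if src not in nodes or dst not in nodes:
            # An edge must connect two DPU instances of this pipeline.
            return False
        successors[src].append(dst)
        indegree[dst] += 1

    # Repeatedly remove nodes that have no remaining incoming edges.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # If a cycle exists, its nodes are never removed.
    return visited == len(nodes)
```

For example, a pipeline with the edges extractor → transformer → loader passes the check, while an empty graph or a graph with an edge back from the loader to the extractor does not.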
an area on the screen used for designing the pipeline graph.
Pipeline execution, also: Pipeline run
an entity representing a single execution of a pipeline. It is created either by a user action (clicking "run" or "debug" on a pipeline) or by the scheduler. In the scheduler, the user may define how often runs are created.
a pipeline that a data manager may operate with: either it is their own pipeline (they are its author), or they were granted permission to view or control it.
Data Processing Unit (DPU)
a persisted entity consisting of its properties, a .jar file, and a default configuration. Its primary function depends on its type (see section Types). Generally, it takes input data, processes it, and saves the output to a specified data structure. Informally, a DPU is a module on the pipeline.
a categorization of DPUs by their function. There are three basic types: Extractor, Transformer, and Loader. Each basic type may have subtypes, e.g. RDF extractor, HTML extractor, linker.
a categorization of DPUs by their function that is more granular than the DPU type, e.g. RDF extractor, HTML extractor, linker.
a placement of a DPU on a pipeline. A DPU instance is therefore created when a DPU is placed on the pipeline canvas.
an associative array of key-value pairs that customize the functionality of a DPU instance.
defines the structure and other properties of a DPU's configuration form.
a default DPU configuration associated with each DPU (previously called "default configuration"). When a DPU is placed on the pipeline (canvas), the configuration template is copied and becomes the instance configuration.
the configuration of a specific DPU instance.
a named graph in an external database, or a graph in the staging database, which may be accessed from different pipelines.
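The copy step described above can be illustrated as follows. This is a hedged sketch only: the function `place_dpu`, the configuration keys, and the use of a deep copy are assumptions for illustration; the glossary states only that the template is copied.

```python
import copy


def place_dpu(template_config):
    """Return an instance configuration copied from a configuration template.

    A deep copy is assumed here so that later changes to the instance
    configuration do not leak back into the template.
    """
    return copy.deepcopy(template_config)


# Hypothetical configuration template of a DPU (key-value pairs).
template = {"endpoint": "http://example.org/sparql", "options": {"timeout": 30}}

# Placing the DPU on the canvas yields an independent instance configuration.
instance = place_dpu(template)
instance["options"]["timeout"] = 60
```

After the change, the instance configuration holds the new timeout while the template keeps its original value, so other instances created from the same template are unaffected.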
a named graph that can be accessed only by DPUs within the same pipeline execution. It cannot be accessed by a DPU on any other pipeline, or on the same pipeline in a different pipeline execution. A DPU defines its input and output as sets of local graphs, so we distinguish input and output local graphs. Each local graph is supplemented with information about the DPU that created it and its template ID (e.g. a linker DPU defines that output graph A contains links with a high probability of being correct, and output graph B contains links which need verification).
an RDF database containing data which is considered clean and valid. Once a pipeline has processed data and no further processing is intended, the data is usually loaded into the knowledge base.
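The visibility rule for such graphs can be sketched as a simple check on the pipeline execution. The class `LocalGraph`, its field names, and the function `can_access` are hypothetical and serve only to illustrate the rule; they are not part of UnifiedViews.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalGraph:
    """Hypothetical record of a graph scoped to one pipeline execution."""
    name: str
    execution_id: int   # pipeline execution the graph belongs to
    created_by: str     # DPU instance that created the graph
    template_id: str    # template ID describing the graph's role


def can_access(graph, execution_id):
    """A DPU may access the graph only within the same pipeline execution."""
    return graph.execution_id == execution_id


# Two output graphs of a linker DPU in execution 1, with different roles.
graph_a = LocalGraph("A", 1, "linker", "confident-links")
graph_b = LocalGraph("B", 1, "linker", "links-to-verify")
```

A DPU running in execution 1 can read both graphs, while a DPU in execution 2 of the same pipeline cannot.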