Working with the Relations Data Unit

Abstract

This section is a short guide on how Relational data unit entries (tables) can be read from input Relational data units and written to output Relational data units.

For basic information about data units, please see the basic description of data units.

Reading Tables From Input Relational Data Unit

Please prepare the DPU 'MyDpu' as described in Working with the Relations Data Unit.

To read tables from the input data unit, one has to define the input Relational data unit.

Code 1 - defining input data unit

 @DataUnit.AsInput(name = "input")
 public RelationalDataUnit input;

All data units must be public fields with a proper annotation. The annotation must at least specify a name, which is the name shown in the UnifiedViews administration interface for pipeline developers. The code above goes in the main DPU class.

To work with the input data unit, you typically use the RelationalHelper class (eu.unifiedviews.helpers.dataunit.relational.RelationalHelper in uv-dataunit-helpers).

The RelationalHelper class provides methods that return the tables contained in the input Relational data unit, so that you can operate on them:

  • static Set<RelationalDataUnit.Entry> getTables(RelationalDataUnit relationalDataUnit) throws DataUnitException

    • This method returns the set of entries (Relational tables) in the given relationalDataUnit.

  • static Map<String, RelationalDataUnit.Entry> getTablesMap(RelationalDataUnit relationalDataUnit) throws DataUnitException

    • This method returns a map of entries (Relational tables) in the given relationalDataUnit. The key of each map entry is the symbolic name of the table.
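The practical difference between the two return shapes can be sketched as follows. This is only an illustration: TableEntry is a stand-in for RelationalDataUnit.Entry so that the code compiles without UnifiedViews, and entry, indexBySymbolicName, findInSet and findInMap are hypothetical helpers, not part of RelationalHelper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Stand-in for RelationalDataUnit.Entry (assumption: the real interface
// comes from UnifiedViews and its getters may throw DataUnitException).
interface TableEntry {
    String getSymbolicName();
    String getTableName();
}

final class TableLookupSketch {

    // Hypothetical factory for a fixed-value entry, used only for illustration.
    static TableEntry entry(final String symbolicName, final String tableName) {
        return new TableEntry() {
            @Override
            public String getSymbolicName() { return symbolicName; }

            @Override
            public String getTableName() { return tableName; }
        };
    }

    // Index entries by symbolic name, i.e. the shape getTablesMap is described as returning.
    static Map<String, TableEntry> indexBySymbolicName(Set<TableEntry> tables) {
        Map<String, TableEntry> bySymbolicName = new LinkedHashMap<>();
        for (TableEntry table : tables) {
            bySymbolicName.put(table.getSymbolicName(), table);
        }
        return bySymbolicName;
    }

    // With a set, finding one table means scanning every entry.
    static String findInSet(Set<TableEntry> tables, String symbolicName) {
        for (TableEntry table : tables) {
            if (table.getSymbolicName().equals(symbolicName)) {
                return table.getTableName();
            }
        }
        return null;
    }

    // With a map keyed by symbolic name, the lookup is direct.
    static String findInMap(Map<String, TableEntry> tables, String symbolicName) {
        TableEntry table = tables.get(symbolicName);
        return table == null ? null : table.getTableName();
    }
}
```

The set form suits DPUs that process every input table in turn; the map form is more convenient when a DPU expects a table under a specific symbolic name.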

Code 2 shows how the method for getting tables can be used (the code below goes in the innerExecute() method of the DPU).

  • Line 2 returns the set of table entries.

Code 2 - Getting input tables using RelationalHelper

try {
    Set<RelationalDataUnit.Entry> tableEntries = RelationalHelper.getTables(input);
} catch (DataUnitException ex) {
    throw ContextUtils.dpuException(ctx, ex, "dpuName.error");
}

Once you have the set of tables, you can iterate over it as follows:

Code 3 - Iterating over input tables

try {
    Set<RelationalDataUnit.Entry> tables = RelationalHelper.getTables(input);
    for (RelationalDataUnit.Entry table : tables) {
        // Name of the database table behind this entry
        String tableName = table.getTableName();
    }
} catch (DataUnitException e) {
    throw ContextUtils.dpuException(ctx, e, "dpuName.error");
}

To get data from the input tables, it is best to obtain an SQL connection from the input data unit input:

Code 4 - Getting connection

Connection databaseConnection = input.getDatabaseConnection();
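With a connection and a table name in hand, the rows can be read through plain JDBC. The following is a minimal sketch under stated assumptions: TableReaderSketch, selectAllSql and readAll are hypothetical names, the table name is assumed to come from getTableName() of an input entry (so it is concatenated into the query as-is), and closing the connection is left to the caller.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class TableReaderSketch {

    // Builds the query for one table; the name is assumed to be a
    // data-unit-generated table name, not user input.
    static String selectAllSql(String tableName) {
        return "SELECT * FROM " + tableName;
    }

    // Reads every row of the table into a list of column values.
    static List<List<Object>> readAll(Connection connection, String tableName) throws SQLException {
        List<List<Object>> rows = new ArrayList<>();
        // try-with-resources closes the statement and result set, not the connection
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(selectAllSql(tableName))) {
            int columnCount = resultSet.getMetaData().getColumnCount();
            while (resultSet.next()) {
                List<Object> row = new ArrayList<>(columnCount);
                for (int column = 1; column <= columnCount; column++) { // JDBC columns are 1-based
                    row.add(resultSet.getObject(column));
                }
                rows.add(row);
            }
        }
        return rows;
    }
}
```

Inside a DPU this would be called as readAll(databaseConnection, tableName) for each entry. Loading whole tables into memory is only reasonable for small tables; for larger ones, process each row inside the loop instead.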

Writing Relational Tables to the Output Relational Data Unit

Please prepare DPU 'MyDpu' as described in Working with the Relations Data Unit. To write tables to the output data unit, one has to define the output Relational data unit.

Code 5 - defining output data unit

@DataUnit.AsOutput(name = "output")
public WritableRelationalDataUnit output;

All data units must be public fields with a proper annotation. The annotation must at least specify a name, which is the name shown in the UnifiedViews administration interface for pipeline developers. The code above goes in the main DPU class.

To work with the output data unit, you call the methods of the WritableRelationalDataUnit API directly:

  • String addNewDatabaseTable(String symbolicName) throws DataUnitException;

    • This method creates a new table in the output data unit with the given symbolicName. The name of the created table is generated from the symbolic name provided, so that it is unique. For an explanation of symbolic names and other metadata of entries in data units, please see Basic Concepts for DPU Developers.

  • void addExistingDatabaseTable(String symbolicName, String dbTableName) throws DataUnitException;

    • This method adds an existing table with the given name dbTableName to the output data unit. It automatically creates a new entry in the output data unit with the given symbolicName. For an explanation of symbolic names and other metadata of entries in data units, please see Basic Concepts for DPU Developers. The method checks whether such a table already exists in the output data unit; if it does, an exception is thrown.

Code 6 - new table

output.addNewDatabaseTable("myOutputTable");
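Because the database table name is generated, later SQL should refer to the string returned by addNewDatabaseTable rather than to the symbolic name. The sketch below illustrates this with a stand-in interface so that it compiles without UnifiedViews. OutputTables, OutputTableSketch and the two-column schema are hypothetical, and the sketch assumes the DPU issues the DDL for the generated name itself, which this page does not confirm.

```java
// Stand-in for the relevant part of WritableRelationalDataUnit
// (assumption: the real method also throws DataUnitException).
interface OutputTables {
    String addNewDatabaseTable(String symbolicName);
}

final class OutputTableSketch {

    // Hypothetical schema, chosen only for illustration.
    static String createTableSql(String dbTableName) {
        return "CREATE TABLE " + dbTableName + " (id INT PRIMARY KEY, name VARCHAR(255))";
    }

    // Registers the symbolic name and builds DDL against the generated
    // database table name that the data unit returns.
    static String registerAndBuildDdl(OutputTables output, String symbolicName) {
        String dbTableName = output.addNewDatabaseTable(symbolicName);
        return createTableSql(dbTableName);
    }
}
```

In a real DPU the resulting statement would be executed on a connection obtained from the output data unit, in the same way as for the input data unit.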

Configuration for Relational Data Unit - Database Connection

The Relational data unit uses default database connection settings. If those are not sufficient, you may override them in the config.properties file:

  • database.dataunit.sql.type

    • file (default)

    • inMemory

  • database.dataunit.sql.baseurl

    • jdbc:h2:file: (default)

  • database.dataunit.sql.user

    • filesUser (default)

  • database.dataunit.sql.password

By default, UnifiedViews uses an H2 database in embedded mode, storing the tables in files.
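How the keys and defaults listed above combine can be sketched with java.util.Properties. This is not how UnifiedViews itself reads config.properties; DataUnitDbSettingsSketch and its methods are hypothetical, and only the keys and default values are taken from the list above. The password key is left out because no default is documented for it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class DataUnitDbSettingsSketch {

    static final String TYPE_KEY = "database.dataunit.sql.type";
    static final String BASE_URL_KEY = "database.dataunit.sql.baseurl";
    static final String USER_KEY = "database.dataunit.sql.user";

    // Parses properties text, e.g. the content of config.properties.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // Each getter falls back to the documented default when the key is absent.
    static String type(Properties properties) {
        return properties.getProperty(TYPE_KEY, "file");
    }

    static String baseUrl(Properties properties) {
        return properties.getProperty(BASE_URL_KEY, "jdbc:h2:file:");
    }

    static String user(Properties properties) {
        return properties.getProperty(USER_KEY, "filesUser");
    }
}
```

For example, a file containing only the line database.dataunit.sql.type=inMemory changes the database type and leaves the base URL and user at their defaults.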

Technical Notes

Database file name (file DB mode):

dbFileName.append("dataUnitDb");
dbFileName.append("_");
dbFileName.append(this.executionId);

TODO: RelationalHelper class (eu.unifiedviews.helpers.dataunit.relational.RelationalHelper in uv-dataunit-helpers) for writing tables.