RDF Data Processing Performance Optimization

Abstract

This section contains a short guide on how to optimize the RDF data processing in your custom DPUs.

Note

As a rule of thumb, inputs (input data graphs) to DPUs are read-only: you must not modify them, directly or indirectly.

Nevertheless, to increase the performance of RDF data processing (for example, for SPARQL Update queries), in certain cases you may want to change input data graphs directly, without first copying them to the output data unit.

To do that, as of version 2.1.8 (UnifiedViews-Plugin-devEnv), you can call the following method on your UserExecutionContext (ctx) inside the DPU class. It checks whether you may directly change the input data graphs, that is, whether you can create new entries in the output data unit that reference the original data graphs introduced in the inputRdfDataUnit and then work with or modify those graphs directly.

if (ctx.isPerformanceOptimizationEnabled(inputRdfDataUnit)) {
    // You MAY create new entries in the output data unit that reference the
    // original data graphs introduced in inputRdfDataUnit, and work with or
    // modify those graphs directly.
} else {
    // You MUST NOT directly modify the data graphs within the input data unit.
}

If the DPU can run in the optimistic mode (that is, the call above returns true), you can create new entries in the output data unit that reference the original data graphs introduced in the inputRdfDataUnit, and work with or modify those graphs directly.
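The branching described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the UnifiedViews API: `DataUnit`, `Context`, and `prepareWorkingGraphs` are hypothetical stand-ins invented for this example, and only the method name `isPerformanceOptimizationEnabled` is taken from the text. In the optimistic branch the output entries point at the original input graphs; otherwise each graph gets a separate working name that stands in for a real copy.

```java
import java.util.ArrayList;
import java.util.List;

class OptimisticModeSketch {

    /** Hypothetical stand-in for an RDF data unit: a name plus the URIs of its data graphs. */
    static final class DataUnit {
        final String name;
        final List<String> graphUris = new ArrayList<>();

        DataUnit(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for the execution context; only the check from the text is modelled. */
    interface Context {
        boolean isPerformanceOptimizationEnabled(DataUnit input);
    }

    /**
     * Registers one entry in the output unit per input graph and returns the
     * graph URIs that later processing (for example a SPARQL Update) may modify.
     */
    static List<String> prepareWorkingGraphs(Context ctx, DataUnit input, DataUnit output) {
        boolean optimistic = ctx.isPerformanceOptimizationEnabled(input);
        List<String> working = new ArrayList<>();
        for (String graphUri : input.graphUris) {
            // Optimistic mode: reuse the original graph, no copy is made.
            // Otherwise: use a separate graph name. A real DPU would copy the
            // data into that graph here; this sketch only derives the name.
            String target = optimistic ? graphUri : graphUri + "/copy";
            output.graphUris.add(target);
            working.add(target);
        }
        return working;
    }

    public static void main(String[] args) {
        DataUnit input = new DataUnit("input");
        input.graphUris.add("http://example.org/graph/1");
        DataUnit output = new DataUnit("output");

        // Simulate a context in which the optimistic mode is allowed.
        List<String> working = prepareWorkingGraphs(unit -> true, input, output);
        System.out.println(working);
    }
}
```

The check is evaluated once per input data unit rather than once per graph, so all graphs of a unit are handled consistently. In both branches the input unit's own list of graphs is left untouched; only the output unit receives new entries.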

See t-sparqlUpdate for further use of the optimistic mode.