PoolParty Concept Extractor

Description

PoolParty Concept Extractor (uv-t-poolpartyConceptExtractor):

PoolParty Concept Extractor is a DPU (plugin) for UnifiedViews that consumes the Concept Extraction service provided by PoolParty Extractor. As input it accepts either triples whose string-literal objects hold text, or files containing text. It annotates that text against a thesaurus project in PoolParty and outputs the annotations as RDF triples.

Please refer to the following documentation for more information about PoolParty Extractor.

Configuration Parameters

| Name | Description | Data Type | Example |
|---|---|---|---|
| Host | Resolvable host name or IP address of the target PoolParty server | String | test.poolparty.biz |
| Port | Port number of the PoolParty server | Integer | 80 |
| Extraction service path | Path of the PoolParty Concept Extraction service, relative to the PoolParty service root URL | String | /extractor/api/annotate |
| Project ID | Identifier of the PoolParty thesaurus project to extract against | String | 12345678-1234-1234-1234-ABCDEF123456 |
| Language code | Two-letter ISO 639-1 code of the source language of the texts | String | en |
| Username | Account name of a user on the target PoolParty thesaurus server | String | test |
| Password | Password of a user on the target PoolParty thesaurus server | String | **** |
| Corpus ID | Identifier of a corpus in the project, used to adapt scores with corpus analysis | String | 12345678-1234-1234-1234-ABCDEF123456 |
| Number of terms to return | Maximum number of terms to return | Integer | 0 |
| Number of concepts to return | Maximum number of concepts to return | Integer | 50 |
| useTransitiveBroaderConcepts | Retrieve the transitive broader concepts of the extracted concepts | Boolean | false |
| useTransitiveBroaderTopConcepts | Retrieve the transitive broader top concepts of the extracted concepts | Boolean | false |
| useRelatedConcepts | Retrieve the related concepts of the extracted concepts | Boolean | false |
| filterNestedConcepts | Remove concept matches that are contained within other matches | Boolean | true |
| tfidfScoring | Weight the scores of concepts and terms by the TF-IDF (term frequency-inverse document frequency) formula | Boolean | false |
| useTypes | Retrieve the custom types of concepts | Boolean | false |
| Maximum retry times for failed extraction | Maximum number of retries for a failed extraction | Integer | 3 |
| Use HTTPS | If checked, HTTPS is used to connect to the target PoolParty Extractor (PPX) service (default: false) | Boolean | false |
| Use only symbolic names when creating resulting URIs from input files | If checked, URIs for output resources are formed from symbolic names instead of virtual path metadata | Boolean | false |
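
As an illustration of how the connection parameters might fit together, the following Python sketch combines Host, Port, Extraction service path and Use HTTPS into a single service URL and runs a few basic sanity checks. The `ExtractorConfig` class and its field names are hypothetical and are not part of the DPU. How the DPU itself composes the URL is not documented here, so treat this as an assumption, not the actual implementation.

```python
from dataclasses import dataclass
from urllib.parse import urlunsplit


@dataclass
class ExtractorConfig:
    """Hypothetical container mirroring a few parameters from the table."""

    host: str = "test.poolparty.biz"
    port: int = 80
    service_path: str = "/extractor/api/annotate"
    use_https: bool = False
    language: str = "en"
    max_retries: int = 3

    def validate(self) -> None:
        """Raise ValueError if a value is obviously malformed."""
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")
        lang = self.language
        if len(lang) != 2 or not (lang.isascii() and lang.isalpha()):
            raise ValueError(f"expected a two-letter ISO 639-1 code: {lang!r}")
        if self.max_retries < 0:
            raise ValueError("max_retries must not be negative")

    def service_url(self) -> str:
        """Join scheme, host, port and path (assumed composition rule)."""
        scheme = "https" if self.use_https else "http"
        # Normalise the path so it always starts with exactly one slash.
        path = "/" + self.service_path.lstrip("/")
        return urlunsplit((scheme, f"{self.host}:{self.port}", path, "", ""))


if __name__ == "__main__":
    config = ExtractorConfig()
    config.validate()
    print(config.service_url())
```

With the example values from the table, the sketch yields a URL on port 80 over plain HTTP. Setting `use_https=True` only switches the scheme, so the port would still need to be changed to match the server's HTTPS port.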