PoolParty Concept Extractor


PoolParty Concept Extractor (uv-t-poolpartyConceptExtractor):

PoolParty Concept Extractor is a DPU (plugin) for UnifiedViews that consumes the Concept Extraction service provided by PoolParty Extractor. As input it accepts either RDF triples whose string-literal objects hold the texts, or files containing the texts. The extractor annotates these texts against a thesaurus project in PoolParty and outputs the annotations as RDF triples.
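
To illustrate the triple-based input described above, here is a minimal sketch of picking out the texts to annotate from a set of triples. It is not the DPU's actual code: the `Literal`, `IRI`, and `texts_to_annotate` names and the example.org identifiers are hypothetical, and the rule of skipping empty literals is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A plain string literal used as the object of a triple (hypothetical type)."""
    value: str


@dataclass(frozen=True)
class IRI:
    """A resource identifier; never treated as text to annotate (hypothetical type)."""
    value: str


Triple = Tuple[IRI, IRI, Union[IRI, Literal]]


def texts_to_annotate(triples: Iterable[Triple]) -> Iterator[Tuple[IRI, str]]:
    """Yield (subject, text) for every triple whose object is a non-empty literal."""
    for subject, _predicate, obj in triples:
        # Assumption: only string literals carry text; IRIs are skipped.
        if isinstance(obj, Literal) and obj.value.strip():
            yield subject, obj.value


triples = [
    (IRI("http://example.org/doc/1"), IRI("http://example.org/text"),
     Literal("Vienna is the capital of Austria.")),
    (IRI("http://example.org/doc/1"), IRI("http://example.org/author"),
     IRI("http://example.org/person/7")),
]
selected = list(texts_to_annotate(triples))
```

In this sketch only the first triple is selected, because the object of the second triple is a resource rather than a string literal.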

For more information about PoolParty Extractor, please refer to the PoolParty Extractor documentation.

Configuration Parameters

| Name | Description | Data Type | Example |
|---|---|---|---|
| Host | Resolvable host name or IP address of the target PoolParty server | String | test.poolparty.biz |
| Port | Port number of the PoolParty server | Integer | 80 |
| Extraction service path | PoolParty Concept Extraction service path, relative to the PoolParty service root URL | String | /extractor/api/annotate |
| Project ID | Identifier of the PoolParty thesaurus project to extract against | String | 12345678-1234-1234-1234-ABCDEF123456 |
| Language code | Two-letter ISO 639-1 code of the source language of the texts to be extracted | String | en |
| Username | Account name of a user on the target PoolParty thesaurus server | String | test |
| Password | Password of that user on the target PoolParty thesaurus server | String | **** |
| Corpus ID | Identifier of a corpus in the project used to adapt scores with corpus analysis | String | 12345678-1234-1234-1234-ABCDEF123456 |
| Number of terms to return | Maximum number of terms to return | Integer | 0 |
| Number of concepts to return | Maximum number of concepts to return | Integer | 50 |
| useTransitiveBroaderConcepts | Retrieve transitive broader concepts of the extracted concepts | Boolean | false |
| useTransitiveBroaderTopConcepts | Retrieve transitive broader top concepts of the extracted concepts | Boolean | false |
| useRelatedConcepts | Retrieve related concepts of the extracted concepts | Boolean | false |
| filterNestedConcepts | The nested concept filter removes concept matches that are contained within other matches | Boolean | true |
| tfidfScoring | Weight the scores of concepts and terms by the TF-IDF (term frequency-inverse document frequency) formula | Boolean | false |
| useTypes | Retrieve the custom types for concepts | Boolean | false |
| Maximum retry times for failed extraction | Maximum number of retries for a failed extraction | Integer | 3 |
| Use HTTPS | If checked, HTTPS is used to connect to the target PPX service (default: false) | Boolean | false |
| Use only symbolic names when creating resulting URIs from input files | If checked, symbolic names are used instead of virtual path metadata when forming URIs for output resources | Boolean | false |
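
As a rough illustration of how the parameters above fit together, the following sketch assembles a request URL from the host, port, service path, and HTTPS flag, and adds a few of the options as query parameters. It is a sketch under stated assumptions, not the DPU's implementation: `ExtractorConfig` and `build_request_url` are hypothetical names, the query parameter names (`projectId`, `language`, `numberOfTerms`, `numberOfConcepts`) are assumed rather than taken from this page, and authentication with the username and password is left out.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class ExtractorConfig:
    """Hypothetical container for a subset of the parameters in the table."""
    host: str
    port: int
    service_path: str
    project_id: str
    language: str
    number_of_terms: int = 0
    number_of_concepts: int = 50
    use_transitive_broader_concepts: bool = False
    filter_nested_concepts: bool = True
    use_https: bool = False


def build_request_url(cfg: ExtractorConfig) -> str:
    """Validate the configuration and return the request URL as a string."""
    if len(cfg.language) != 2 or not cfg.language.isalpha():
        raise ValueError("language must be a two-letter ISO 639-1 code")
    if not 1 <= cfg.port <= 65535:
        raise ValueError("port must be in the range 1..65535")

    scheme = "https" if cfg.use_https else "http"
    path = "/" + cfg.service_path.lstrip("/")  # tolerate a missing leading slash
    query = urlencode({
        # Query parameter names below are assumptions, not documented here.
        "projectId": cfg.project_id,
        "language": cfg.language,
        "numberOfTerms": cfg.number_of_terms,
        "numberOfConcepts": cfg.number_of_concepts,
        "useTransitiveBroaderConcepts": str(cfg.use_transitive_broader_concepts).lower(),
        "filterNestedConcepts": str(cfg.filter_nested_concepts).lower(),
    })
    return f"{scheme}://{cfg.host}:{cfg.port}{path}?{query}"


cfg = ExtractorConfig(
    host="test.poolparty.biz",
    port=80,
    service_path="/extractor/api/annotate",
    project_id="12345678-1234-1234-1234-ABCDEF123456",
    language="en",
)
url = build_request_url(cfg)
```

The defaults in `ExtractorConfig` mirror the example values from the table. Validating the language code and port before building the URL means a misconfigured DPU fails early instead of spending its retries on requests that cannot succeed.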