PoolParty Concept Extractor
PoolParty Concept Extractor is a DPU (plugin) for UnifiedViews that consumes the Concept Extraction service provided by PoolParty Extractor. It takes as input either triples whose objects are string literals containing text, or files containing text. It annotates these texts against a thesaurus project in PoolParty and outputs the annotations as RDF triples.
Please refer to the following documentation for more information about PoolParty Extractor.
Name | Description | Data Type | Example |
---|---|---|---|
Host | Resolvable host name or IP address of the target PoolParty server | String | |
Port | Port number of PoolParty server | Integer | 80 |
Extraction service path | PoolParty Concept Extraction service path relative to PoolParty service root URL | String | /extractor/api/annotate |
Project ID | Project identifier of the PoolParty thesaurus project to be extracted against | String | 12345678-1234-1234-1234-ABCDEF123456 |
Language code | Two-letter ISO 639-1 code of the source language of the texts to be extracted | String | en |
Username | Account name of a user for the target PoolParty thesaurus server | String | test |
Password | Password of a user for the target PoolParty thesaurus server | String | **** |
Corpus ID | Identifier of a corpus in the project used to adapt scores with corpus analysis | String | 12345678-1234-1234-1234-ABCDEF123456 |
Number of terms to return | Maximum number of terms to return | Integer | 0 |
Number of concepts to return | Maximum number of concepts to return | Integer | 50 |
useTransitiveBroaderConcepts | Retrieve transitive broader concepts of the extracted concepts | Boolean | false |
useTransitiveBroaderTopConcepts | Retrieve transitive broader top concepts of the extracted concepts | Boolean | false |
useRelatedConcepts | Retrieve related concepts of the extracted concepts | Boolean | false |
filterNestedConcepts | Remove concept matches that are contained within other matches | Boolean | true |
tfidfScoring | Weight the scores of concepts and terms using the TF-IDF (term frequency-inverse document frequency) formula | Boolean | false |
useTypes | Retrieve the custom types for concepts | Boolean | false |
Maximum retry times for failed extraction | Maximum number of times a failed extraction request is retried | Integer | 3 |
Use HTTPS | If checked, HTTPS is used for connecting to target PPX service (by default false) | Boolean | false |
Use only symbolic names when creating resulting URIs from input files | If checked, URIs for output resources are formed from symbolic names instead of virtual path metadata | Boolean | false |
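
To illustrate how the connection settings in the table fit together, the following is a minimal Python sketch that assembles the service URL and a query string from a configuration object. It is not the DPU's actual implementation (the DPU itself is a UnifiedViews plugin). The class `ExtractorConfig`, the functions `build_service_url` and `build_query`, the host `poolparty.example.org`, and the request parameter names (`projectId`, `language`, `numberOfTerms`, `numberOfConcepts`, `filterNestedConcepts`, `text`) are assumptions made for illustration; check them against the PoolParty Extractor documentation for your server version.

```python
from dataclasses import dataclass
from urllib.parse import urlencode, urlunsplit


@dataclass
class ExtractorConfig:
    """Hypothetical container mirroring a subset of the table above."""
    host: str
    port: int = 80
    service_path: str = "/extractor/api/annotate"
    project_id: str = ""
    language: str = "en"
    number_of_terms: int = 0
    number_of_concepts: int = 50
    filter_nested_concepts: bool = True
    use_https: bool = False


def build_service_url(cfg: ExtractorConfig) -> str:
    """Combine Use HTTPS, Host, Port and Extraction service path into a URL."""
    scheme = "https" if cfg.use_https else "http"
    default_port = 443 if cfg.use_https else 80
    # Omit the port when it is the default for the chosen scheme.
    netloc = cfg.host if cfg.port == default_port else f"{cfg.host}:{cfg.port}"
    return urlunsplit((scheme, netloc, cfg.service_path, "", ""))


def build_query(cfg: ExtractorConfig, text: str) -> str:
    """Encode extraction options as a query string.

    The parameter names below are assumptions, not confirmed by this document.
    """
    params = {
        "projectId": cfg.project_id,
        "language": cfg.language,
        "numberOfTerms": cfg.number_of_terms,
        "numberOfConcepts": cfg.number_of_concepts,
        "filterNestedConcepts": str(cfg.filter_nested_concepts).lower(),
        "text": text,
    }
    return urlencode(params)


if __name__ == "__main__":
    cfg = ExtractorConfig(
        host="poolparty.example.org",  # placeholder host
        project_id="12345678-1234-1234-1234-ABCDEF123456",
    )
    print(build_service_url(cfg))
    print(build_query(cfg, "Some text to annotate"))
```

Username and Password are deliberately left out of the sketch. How credentials are sent (for example, HTTP basic authentication) depends on the server setup and is not specified here.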