Write PoolParty Extractor Results Into a Graph Database

When a graph database is configured as a remote repository, this service can be used to annotate documents and write the results directly into the graph database.

  • Method: annotateAndStore

  • URL: /extractor/api/annotate/store

This API call accepts plain text, a web page referenced by a URL, or an uploaded file as input.

Plain text input

Supported Methods

GET

POST

Specific HTTP Parameters

| Parameter | Type | Required | Value range | Comment |
|---|---|---|---|---|
| text | String | true | | The text to be used for the extraction request. |
| title | String | false | | The title of the document. |
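A plain text request could be assembled as in the following minimal sketch. It only builds the request and does not send it. The host name is hypothetical, the assumption that POST parameters are accepted as a form-encoded body is not confirmed by this page, and authentication is left out because it is not described here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical server address; replace with your own PoolParty host.
BASE_URL = "https://poolparty.example.com"
ENDPOINT = "/extractor/api/annotate/store"


def build_text_request(project_id, language, document_uri, text, title=None):
    """Build (but do not send) a POST request for plain text input."""
    params = {
        "projectId": project_id,
        "language": language,
        "documentUri": document_uri,
        "text": text,
    }
    if title is not None:
        params["title"] = title  # optional parameter
    # Assumption: the service accepts form-encoded POST parameters.
    body = urlencode(params).encode("utf-8")
    return Request(
        BASE_URL + ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_text_request(
    project_id="d06bd0f8-03e4-45e0-8683-fed428fca242",
    language="en",
    document_uri="http://example.com/documents/1",
    text="Graph databases store data as nodes and edges.",
    title="Sample document",
)
```

Passing `request` to `urllib.request.urlopen` would send it, once whatever authentication your installation requires has been added.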

Web pages as input

Supported Methods

GET

POST

Specific HTTP Parameters

| Parameter | Type | Required | Value range | Comment |
|---|---|---|---|---|
| url | String | true | | The URL of the document to be used for the extraction request. |
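For web page input, a GET request carries all parameters in the query string. The sketch below builds such a URL; the host name and the example page address are hypothetical, and the values are percent-encoded so that the nested URL does not break the query.

```python
from urllib.parse import urlencode

# Hypothetical server address; replace with your own PoolParty host.
BASE_URL = "https://poolparty.example.com"
ENDPOINT = "/extractor/api/annotate/store"


def build_url_request(project_id, language, document_uri, page_url):
    """Return the full GET URL for annotating a web page."""
    query = urlencode({
        "projectId": project_id,
        "language": language,
        "documentUri": document_uri,
        "url": page_url,  # the web page to fetch and annotate
    })
    return f"{BASE_URL}{ENDPOINT}?{query}"


get_url = build_url_request(
    project_id="d06bd0f8-03e4-45e0-8683-fed428fca242",
    language="en",
    document_uri="http://example.com/documents/graph-databases",
    page_url="https://example.com/articles/graph-databases.html",
)
```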

File as input

Supported Methods

POST

The MIME type of the request must be 'multipart/form-data'.

Specific HTTP Parameters

| Parameter | Type | Required | Value range | Comment |
|---|---|---|---|---|
| file | MultipartFile | true | | The file to be uploaded for the extraction request. Supported input formats are Word, Excel, PowerPoint, PDF, and Open Document Format. |
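A file upload needs a 'multipart/form-data' body. The following sketch encodes the form fields and one file using only the Python standard library; the helper name `encode_multipart`, the file name, and the placeholder file content are illustrative assumptions, and nothing is sent to a server.

```python
import uuid


def encode_multipart(fields, file_field, file_name, file_bytes,
                     file_type="application/octet-stream"):
    """Encode form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex  # random separator between parts
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    lines += [
        f"--{boundary}".encode(),
        (f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{file_name}"').encode(),
        f"Content-Type: {file_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ]
    body = b"\r\n".join(lines)
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type


body, content_type = encode_multipart(
    fields={
        "projectId": "d06bd0f8-03e4-45e0-8683-fed428fca242",
        "language": "en",
        "documentUri": "http://example.com/documents/report.pdf",
    },
    file_field="file",
    file_name="report.pdf",
    file_bytes=b"%PDF-1.4 placeholder bytes",
    file_type="application/pdf",
)
```

The returned `content_type` goes into the Content-Type header of the POST request and `body` becomes its payload.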

Common HTTP Parameters

| Parameter | Type | Required | Value range | Comment |
|---|---|---|---|---|
| projectId | String | true | | The unique identifier of the PoolParty project to use for the extraction (the UUID of the project, e.g. "d06bd0f8-03e4-45e0-8683-fed428fca242"). |
| text | String | true | | The text to be used for the extraction request. |
| documentUri | String | true | | A URI to identify the document. |
| graphName | String | false | | The URI of the graph in the graph database to which the results should be written. If not specified, a new graph with the name of the document is created. |
| language | String | true | | The language of the text (e.g. "en", "de", "es", "fr", ...). Note: A stop word list is only available for the following languages: en (English), de (German), fr (French). Other languages can be added on demand. CJK languages are not supported. |
| transitiveBroaderConcepts | boolean | false | true, false | Retrieve transitive broader concepts. true: the URIs of transitive broader concepts are returned along with the extracted concepts. false: no transitive broaders are returned (default). Depending on the depth of the thesaurus hierarchy, this option can return a large number of transitive broaders per concept. Only set this parameter to true if you really need the information. |
| transitiveBroaderTopConcepts | boolean | false | true, false | Retrieve transitive broader top concepts. true: the URIs of transitive broader concepts that are top concepts are returned. false: no transitive broader top concepts are returned (default). |
| relatedConcepts | boolean | false | true, false | Retrieve related concepts. true: the URIs of the related concepts are returned. false: no related concepts are returned (default). |
| numberOfConcepts | Integer | false | | The number of concepts to be retrieved. |
| numberOfTerms | Integer | false | | The number of terms to be retrieved. |

This service generates an RDF graph for the results in the same way as the annotate service and writes it into the installed graph database. A document URI must be specified for each document; it identifies the document in the store. If a graph name is provided, the results are written to that graph (useful when processing document sets). If no graph name is provided, the results for each document are written into a separate graph based on the document URI.
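The two graph naming modes can be illustrated by how the request parameters differ for a set of documents. In this sketch, `store_params` is a hypothetical helper and the graph and document URIs are made-up examples; it only prepares parameter dictionaries and does not call the service.

```python
def store_params(project_id, language, documents, graph_name=None):
    """Yield one parameter dict per (document_uri, text) pair.

    With graph_name set, every document targets the same graph.
    Without it, graphName is omitted and the service is expected to
    write each document into its own graph based on the document URI.
    """
    for document_uri, text in documents:
        params = {
            "projectId": project_id,
            "language": language,
            "documentUri": document_uri,
            "text": text,
        }
        if graph_name is not None:
            params["graphName"] = graph_name
        yield params


documents = [
    ("http://example.com/documents/1", "First report text."),
    ("http://example.com/documents/2", "Second report text."),
]
project = "d06bd0f8-03e4-45e0-8683-fed428fca242"

# Document set collected in one shared graph.
shared = list(store_params(project, "en", documents,
                           graph_name="http://example.com/graphs/reports"))

# One graph per document: graphName is simply left out.
separate = list(store_params(project, "en", documents))
```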