Entity Extractor Architecture

This page gives an overview of the PoolParty Extractor architecture.

The following diagram shows a high-level overview of PoolParty Extractor's modular architecture:

[Figure: ppx-architecture]

Extraction engines are organized as units with specific functions. Each unit builds on the output of the previous units and adds its own results to the output stream. The main functionality is implemented in the Term Extractor and the Term Matcher. The Term Extractor detects pieces of the text that qualify as potential term candidates. The Term Matcher then matches these candidates against the thesaurus model and resolves conflicting matches. The Pre and Post Processors prepare the text for processing and clean up the results before the final output is generated.
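The flow of units described above can be illustrated with a minimal Python sketch. This is not PoolParty code or its API: every name here (Document, pre_process, extract_terms, match_terms, post_process, run_pipeline) is hypothetical, the thesaurus is reduced to a plain dictionary of labels, and the conflict resolution shown (a longer match suppresses a shorter one it contains) is only an assumed stand-in for whatever strategy the real Term Matcher applies.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """Shared state handed from unit to unit (hypothetical structure)."""
    text: str
    candidates: list = field(default_factory=list)  # added by the term extractor
    matches: dict = field(default_factory=dict)     # added by the term matcher


def pre_process(doc):
    # Prepare the text: collapse runs of whitespace into single spaces.
    doc.text = " ".join(doc.text.split())
    return doc


def extract_terms(doc):
    # Toy candidate detection: every word and every pair of adjacent words.
    words = re.findall(r"[A-Za-z]+", doc.text.lower())
    doc.candidates = words + [" ".join(pair) for pair in zip(words, words[1:])]
    return doc


def match_terms(doc, thesaurus):
    # Keep only candidates that are labels in the thesaurus (label -> concept id).
    hits = [c for c in doc.candidates if c in thesaurus]
    # Assumed conflict rule: a one-word hit is dropped when it is part of a
    # two-word hit, so "red wine" wins over "wine".
    longer = [h for h in hits if " " in h]
    kept = [h for h in hits
            if " " in h or not any(h in l.split() for l in longer)]
    doc.matches = {h: thesaurus[h] for h in kept}
    return doc


def post_process(doc):
    # Clean up the results: return a sorted list of (label, concept id) pairs.
    return sorted(doc.matches.items())


def run_pipeline(text, thesaurus):
    # Each unit builds on the output of the previous one.
    doc = Document(text)
    doc = pre_process(doc)
    doc = extract_terms(doc)
    doc = match_terms(doc, thesaurus)
    return post_process(doc)


if __name__ == "__main__":
    thesaurus = {"red wine": "concept/1", "wine": "concept/2", "cheese": "concept/3"}
    print(run_pipeline("Red   wine pairs with cheese.", thesaurus))
```

In this sketch each unit only reads what earlier units produced and adds its own field, which is one simple way to keep the units independent and replaceable.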