
PoolParty Extractor - Customize Configuration Files

04/11/2025

The PoolParty Extractor (PPX) uses wordform lists for stop word elimination, lemmatization of terms, and rule-based person name recognition. You can customize these lists.

PoolParty includes wordform files for English (en) and German (de). These files are used for English and German extraction, including locale variations such as en-US and de-AT. Stop word files are also provided for the following languages: de, en, es, fi, fr, it, nl, no, pt, and sv.
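The language coverage described above can be illustrated with a minimal Python sketch. It is not PoolParty code: the function names and the prefix-based mapping from a locale to its base language are assumptions made for illustration, and applying that mapping to stop word files as well is a further assumption. Only the language lists come from this page.

```python
# Illustrative sketch only; not part of PoolParty.
# Language lists are taken from the documentation text.
WORDFORM_LANGUAGES = {"en", "de"}
STOPWORD_LANGUAGES = {"de", "en", "es", "fi", "fr", "it", "nl", "no", "pt", "sv"}


def base_language(locale: str) -> str:
    """Reduce a locale tag such as 'en-US' or 'de_AT' to its language part.

    Assumption: locale variations are resolved by their language prefix.
    """
    return locale.replace("_", "-").split("-")[0].lower()


def available_resources(locale: str) -> dict:
    """Report which bundled word lists would apply to the given locale."""
    lang = base_language(locale)
    return {
        "wordforms": lang in WORDFORM_LANGUAGES,
        "stopwords": lang in STOPWORD_LANGUAGES,
    }
```

Under these assumptions, `available_resources("de-AT")` reports both wordforms and stop words as available, while `available_resources("fr")` reports stop words only.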

The word list for rule-based entity recognition contains first names.

Note

Default and custom configuration files are mutually exclusive. For instance, if a custom stop word file for English exists (stopwords/en.txt), the stop word filter uses only this file and ignores the default.

As soon as an incoming request causes a file to be loaded, the file is cached in memory. If you modify a file on disk afterwards, you must restart the PoolParty server for the changes to take effect.