Create a Train Classifier

This section provides details on how to create and train a Train Classifier in PoolParty's Semantic Classifier (SC).

A Train Classifier lets you set up a classifier and train it, so that you can later use it to classify any number of documents.

As mentioned earlier, these documents can be any text files that the PoolParty extractor supports. You classify them into the categories you created beforehand.

Note

PoolParty supports all common text-based file types, for example MS Office, OpenOffice, PDF and XML.

The following must be in place before you can use the classifier:

  • A PoolParty Enterprise Server or Semantic Integrator license with the Semantic Classifier add-on included.

  • An open PoolParty thesaurus project that you created.

After you have created a Training Box, you create a Train Classifier.

Note

Creating a Training Box is optional. You can also add documents directly to the classifier itself, but adding them to a Training Box first lets you reuse them in any classifier of your choice.

How to Set Up a Train Classifier

  1. Select the Train Classifiers node, then either right click or double click it to use the context menu, or click Create Classifier on the right.

  2. The New Document Classifier dialogue opens. Enter a name for the classifier and select its language from the drop down.

  3. Click OK to confirm your changes.

At the top of the Classifiers Details View you can search for specific classifiers. The following options are available:

  • Enter a name or search string of a classifier in the Search field.

  • In the Min. Amount of Classes field, you can restrict the search to classifiers that contain at least the given number of classes.

  • The Min. Performance (%) field lets you restrict the search results to classifiers whose performance value is at least the given percentage.

  • The Status drop down further restricts the results by the calculation status of a classifier. The available values are All, New, Calculated and Outdated.
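
The way these search options might combine can be illustrated with a minimal Python sketch. It is not PoolParty code or a PoolParty API: the `ClassifierInfo` record, the `filter_classifiers` function and the sample data are hypothetical, and the sketch assumes that the search string is matched as a case-insensitive substring of the name and that all options must hold at the same time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClassifierInfo:
    # Hypothetical record for illustration; not PoolParty's data model.
    name: str
    num_classes: int
    performance: Optional[float]  # percent; None if not calculated yet
    status: str  # "New", "Calculated" or "Outdated"


def filter_classifiers(
    classifiers: List[ClassifierInfo],
    search: str = "",
    min_classes: int = 0,
    min_performance: float = 0.0,
    status: str = "All",
) -> List[ClassifierInfo]:
    """Return the classifiers that satisfy every given search option."""
    results = []
    for c in classifiers:
        # Assumption: the search string is a case-insensitive substring match.
        if search and search.lower() not in c.name.lower():
            continue
        if c.num_classes < min_classes:
            continue
        # Assumption: a missing performance value is treated as 0 percent.
        perf = c.performance if c.performance is not None else 0.0
        if perf < min_performance:
            continue
        # "All" disables the status restriction.
        if status != "All" and c.status != status:
            continue
        results.append(c)
    return results


# Made-up sample data, for illustration only.
classifiers = [
    ClassifierInfo("News topics", 5, 82.5, "Calculated"),
    ClassifierInfo("Support tickets", 3, None, "New"),
    ClassifierInfo("News archive", 8, 64.0, "Outdated"),
]

matches = filter_classifiers(
    classifiers, search="news", min_classes=4, min_performance=70
)
print([c.name for c in matches])
```

In this sketch, leaving an option at its default value means that it does not restrict the results, which mirrors leaving the corresponding field empty or the Status drop down set to All.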