TF-IDF Corpus Creation Service

This service calculates TF-IDF (term frequency-inverse document frequency) statistics for concepts and free terms in a selected corpus. When these statistics are used in the Extraction Service, they affect scoring as follows:

  • Concepts and terms are weighted by their TF-IDF values, so the scores of document-specific terms and concepts are boosted.

  • The scores of common terms/concepts (in the given domain/corpus) are decreased.
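The effect described above can be illustrated with a minimal sketch of the textbook TF-IDF weighting, tf × log(N / df). This is an assumption for illustration only: the exact formula, smoothing, and normalization used by the service are not stated here, and the function name `tf_idf` and the sample corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Hypothetical helper using the textbook formula; the service's own
    weighting may differ.
    """
    n_docs = len(corpus)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for doc in corpus for term in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up sample corpus: each document is a list of extracted terms.
corpus = [
    ["bank", "loan", "interest", "loan"],
    ["bank", "river", "water"],
    ["bank", "account", "interest"],
]
scores = tf_idf(corpus)
```

In this sample, "bank" occurs in every document, so its weight drops to zero, while "loan" occurs only in the first document and receives the highest weight there.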

For each project, TF-IDF statistics can be calculated based on one corpus.

For details on TF-IDF corpus creation, see: Web Service Method: Create a TF-IDF Corpus