Web Service Method: Categorize a File

Description

[file] Categorizes a given file by first extracting concepts and then aggregating them by their associated top concepts.

The aggregated top concepts represent the categories. The more concepts from a given top concept are found in the file, the higher that top concept's score; the scores of all categories sum to 100%.
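The aggregation idea can be illustrated with a minimal sketch. It assumes every extracted concept counts equally towards its top concept; the service's actual `simple` and `ppxBoost` algorithms are not described on this page and may weight concepts differently. The function name and the input mapping are hypothetical.

```python
from collections import Counter


def category_scores(concept_to_top_concept: dict[str, str]) -> dict[str, float]:
    """Aggregate extracted concepts by top concept and normalise to 100.

    Assumption: each concept contributes one count to its top concept.
    This is an illustration only, not the service's scoring algorithm.
    """
    counts = Counter(concept_to_top_concept.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    # Each category's share of all extracted concepts, as a percentage.
    return {top: 100.0 * n / total for top, n in counts.items()}


# Hypothetical extraction result: concept label -> top concept label.
extracted = {
    "bond": "Finance",
    "equity": "Finance",
    "dividend": "Finance",
    "contract": "Law",
}
scores = category_scores(extracted)
```

With three concepts under one top concept and one under another, the shares come out as 75 and 25, which together make up 100%.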

URL: /extractor/api/categorization

Content-Type

Content-Type: application/x-www-form-urlencoded

Request

Supported Methods

POST

GET

HTTP Parameter

| Parameter | Type | Required | Description |
|---|---|---|---|
| conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
| customClassFilters | Array of String | false | Custom class URI filters |
| disambiguation | boolean | false | Use disambiguation. If not supplied, default = false |
| displayText | boolean | false | Include text extracted from file in response, default = false |
| file | MultipartFile | true | File to be categorized (word, excel, powerpoint, pdf, open document) |
| language | String | false | Language of text (en, de, es, fr, ...) |
| projectId | Array of String | true | Thesaurus projectIds |
| scoringAlgorithm | String | false | Scoring algorithm to use (simple or ppxBoost). If not supplied, defaults to simple |
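A request with these parameters might be assembled as in the following sketch. Several things here are assumptions rather than documented behaviour: array parameters are sent by repeating the field name, the file is sent as a multipart form (the `file` parameter is a MultipartFile, which does not fit the Content-Type listed above, so this is worth verifying against your server), and HTTP basic authentication is used. The helper names and `base_url` are hypothetical; `requests` is a third-party library.

```python
from typing import Optional, Sequence

ENDPOINT_PATH = "/extractor/api/categorization"  # URL given in this document


def build_form_fields(
    project_ids: Sequence[str],
    language: Optional[str] = None,
    disambiguation: bool = False,
    display_text: bool = False,
    scoring_algorithm: str = "simple",
    concept_scheme_filters: Sequence[str] = (),
    custom_class_filters: Sequence[str] = (),
) -> list[tuple[str, str]]:
    """Return the non-file form fields as (name, value) pairs.

    Assumption: array parameters are sent by repeating the field name.
    """
    if not project_ids:
        raise ValueError("projectId is required")
    if scoring_algorithm not in ("simple", "ppxBoost"):
        raise ValueError("scoringAlgorithm must be 'simple' or 'ppxBoost'")

    fields = [("projectId", pid) for pid in project_ids]
    fields += [("conceptSchemeFilters", uri) for uri in concept_scheme_filters]
    fields += [("customClassFilters", uri) for uri in custom_class_filters]
    if language is not None:
        fields.append(("language", language))
    fields.append(("disambiguation", str(disambiguation).lower()))
    fields.append(("displayText", str(display_text).lower()))
    fields.append(("scoringAlgorithm", scoring_algorithm))
    return fields


def categorize_file(base_url: str, file_path: str, auth: tuple[str, str], **options) -> dict:
    """POST the file and return the decoded JSON response.

    Assumptions: multipart upload and HTTP basic auth (user, password).
    """
    import requests  # imported lazily so build_form_fields has no dependencies

    with open(file_path, "rb") as handle:
        response = requests.post(
            base_url.rstrip("/") + ENDPOINT_PATH,
            data=build_form_fields(**options),
            files={"file": handle},
            auth=auth,
            timeout=60,
        )
    response.raise_for_status()
    return response.json()


# Field list for a request against two hypothetical project IDs.
fields = build_form_fields(project_ids=["project-1", "project-2"], language="en")
```

A call could then look like `categorize_file("https://example.org", "report.pdf", ("user", "secret"), project_ids=["project-1"])`, where the host, file name, credentials, and project ID are placeholders.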

Response

This method returns execution results in JSON format.

Categorization response

| Attribute | Type | Comment |
|---|---|---|
| categories | Array of Category | Categories found in text |
| text | String | Text as extracted from url or file |
| title | String | Title as extracted from url or file |

Categorization result

| Attribute | Type | Comment |
|---|---|---|
| categoryConceptResults | Array of ConceptCategory | Categorized concepts |
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |

Categorized concept

| Attribute | Type | Comment |
|---|---|---|
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |
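The response structure described above could be read into typed objects as in this sketch. The attribute names come from the tables in this document; the sample payload, its URIs, labels, and score values are invented for illustration, and the scale of `score` is not specified here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CategorizedConcept:
    pref_label: str
    score: float
    uri: str


@dataclass
class Category:
    pref_label: str
    score: float
    uri: str
    concepts: list[CategorizedConcept] = field(default_factory=list)


def parse_categories(payload: dict) -> list[Category]:
    """Convert the decoded JSON response into Category objects, highest score first."""
    categories = []
    for item in payload.get("categories", []):
        concepts = [
            CategorizedConcept(c["prefLabel"], float(c["score"]), c["uri"])
            for c in item.get("categoryConceptResults", [])
        ]
        categories.append(
            Category(item["prefLabel"], float(item["score"]), item["uri"], concepts)
        )
    return sorted(categories, key=lambda c: c.score, reverse=True)


# Invented sample payload that follows the documented attribute names.
SAMPLE = json.loads("""
{
  "title": "Sample report",
  "categories": [
    {
      "prefLabel": "Law",
      "score": 25.0,
      "uri": "http://example.org/law",
      "categoryConceptResults": [
        {"prefLabel": "contract", "score": 25.0, "uri": "http://example.org/contract"}
      ]
    },
    {
      "prefLabel": "Finance",
      "score": 75.0,
      "uri": "http://example.org/finance",
      "categoryConceptResults": [
        {"prefLabel": "bond", "score": 40.0, "uri": "http://example.org/bond"},
        {"prefLabel": "equity", "score": 35.0, "uri": "http://example.org/equity"}
      ]
    }
  ]
}
""")

parsed = parse_categories(SAMPLE)
```

The parser treats `text` and `title` as optional and ignores them, since `text` is presumably only present when `displayText` is set to true.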