Categorization Service

The categorization service produces a list of categories based on the concepts detected in the input using a thesaurus. Matched concepts are mapped to the top concepts they are related to (via skos:broader relationships), and the scores of all concepts that map to the same top concept are combined into a compound score.

This API call accepts plain text, a web page referenced by a URL, or an uploaded file as input.

Web Service Method: Categorize a Text

Description

[text] Categorizes a given text by first extracting concepts and then aggregating them by their associated top concepts.

The aggregated top concepts represent the categories. The more concepts belonging to a specific top concept are found in the text, the higher that top concept's score will be, with the scores summing up to 100%.
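The aggregation described above can be sketched as follows. This is a minimal illustration only, not the service's actual scoring algorithm: the function name `aggregate_by_top_concept`, the shape of the input tuples, and the simple sum-then-normalize rule are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_by_top_concept(matches):
    """Sum concept scores per top concept and normalize so they total 1.0.

    `matches` is a list of (concept_label, score, top_concept_labels) tuples.
    Both this shape and the normalization rule are assumptions for illustration.
    """
    totals = defaultdict(float)
    for _label, score, top_concepts in matches:
        # A concept may roll up to more than one top concept via skos:broader.
        for top in top_concepts:
            totals[top] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {top: total / grand_total for top, total in totals.items()}


# Made-up scores, chosen only to keep the arithmetic easy to follow.
matches = [
    ("Gin", 100.0, ["Alcoholic beverage"]),
    ("Lime", 25.0, ["Fruit"]),
    ("Bitters", 75.0, ["Alcoholic beverage"]),
]
categories = aggregate_by_top_concept(matches)
```

With these inputs, "Alcoholic beverage" collects 175 of 200 score points and "Fruit" the remaining 25, so the normalized category scores add up to 1.0 (100%).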

URL: /extractor/api/categorization

Request

Supported Methods

POST

GET

Content-Type

application/x-www-form-urlencoded

HTTP Parameter

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
| customClassFilters | Array of String | false | Custom class URI filters |
| disambiguation | boolean | false | Use disambiguation. If not supplied, defaults to false |
| displayText | boolean | false | Include text extracted from file in response. If not supplied, defaults to false |
| extraConceptLanguages | Array of PPLocale | false | Additional languages used for concept extraction (en, de, es, fr, ...) |
| language | PPLocale | false | Language of text (en, de, es, fr, ...) |
| projectId | Array of String | true | Thesaurus projectIds |
| scoringAlgorithm | String | false | Scoring algorithm to use (simple or ppxBoost). If not supplied, defaults to simple |
| text | String | true | Text of the document |
| title | String | false | Title of the document |
| useCustomAttributes | boolean | false | Retrieve custom attributes. If not supplied, defaults to false |
| useCustomRelations | boolean | false | Retrieve custom relations. If not supplied, defaults to false |

A PPLocale Object - Attributes

| Attribute | Type | Required |
| --- | --- | --- |
| ALL_LANGUAGES | PPLocale | false |
| DUTCH | PPLocale | false |
| ENGLISH | PPLocale | false |
| FRENCH | PPLocale | false |
| GERMAN | PPLocale | false |
| RUSSIAN | PPLocale | false |
| SPANISH | PPLocale | false |
| VALID | PPLocale | false |
| country | String | false |
| language | String | false |
| languageTag | String | false |

Response

Content-Type

application/json

The Categorization Response Object

| Attribute | Type | Required | Comment |
| --- | --- | --- | --- |
| categories | Array of Category | false | Categories found in text |
| text | String | false | Text as extracted from URL or file |
| title | String | false | Title as extracted from URL or file |

Example Response
{
  "categories" : [ {
    "score" : 0.009903094654109657,
    "prefLabel" : "some prefLabel",
    "categoryConceptResults" : [ {
      "score" : 0.9287647967299457,
      "prefLabel" : "some prefLabel",
      "uri" : "https://semantic-web.com/api/uri#21636"
    }, {
      "score" : 0.5401813598657076,
      "prefLabel" : "some prefLabel",
      "uri" : "https://semantic-web.com/api/uri#12030"
    } ],
    "uri" : "https://semantic-web.com/api/uri#5366"
  } ],
  "text" : "some text",
  "title" : "All about Chuck Norris"
}
Examples

Input Text

A gin and tonic is a highball cocktail made with gin and tonic water poured over ice. It is usually garnished with a slice or wedge of lime. The amount of gin varies according to taste. Suggested ratios of gin to tonic are between 1:1 and 1:3.

In some countries (e.g. UK), gin and tonic is also marketed pre-mixed in single-serving cans.

Self-made gin and tonic from Bombay Sapphire London Dry Gin and Schweppes Indian Tonic, garnished with slices of lime.

The drink is a particular phenomenon as its taste is quite different from the taste of its constituent liquids which are rather bitter. The chemical structures of both ingredients are of a similar molecular shape and attract each other, shielding the bitter taste.

API Call for this Text
https://nextrelease.poolparty.biz/extractor/api/categorization?language=en&projectId=1E034541-5BC3-0001-C454-2ED019578460&text=A%20gin%20and%20tonic%20is%20a%20highball%20cocktail%20made%20with%20gin%20and%20tonic%20water%20poured%20over%20ice.%20It%20is%20usually%20garnished%20with%20a%20slice%20or%20wedge%20of%20lime.%20The%20amount%20of%20gin%20varies%20according%20to%20taste.%20Suggested%20ratios%20of%20gin%20to%20tonic%20are%20between%201:1%20and%201:3.%20%20In%20some%20countries%20(e.g.%20UK),%20gin%20and%20tonic%20is%20also%20marketed%20pre-mixed%20in%20single-serving%20cans.%20%20%20Self-made%20gin%20and%20tonic%20from%20Bombay%20Sapphire%20London%20Dry%20Gin%20and%20Schweppes%20Indian%20Tonic,%20garnished%20with%20slices%20of%20lime.%20The%20drink%20is%20a%20particular%20phenomenon%20as%20its%20taste%20is%20quite%20different%20from%20the%20taste%20of%20its%20constituent%20liquids%20which%20are%20rather%20bitter.%20The%20chemical%20structures%20of%20both%20ingredients%20are%20of%20a%20similar%20molecular%20shape%20and%20attract%20each%20other,%20shielding%20the%20bitter%20taste.
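The call above can be assembled programmatically rather than by hand-encoding the text. The following is a minimal sketch using only the Python standard library; the helper name `build_categorization_url` is an assumption, and the host and project ID are simply the ones from the example call. No request is sent.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE_URL = "https://nextrelease.poolparty.biz/extractor/api/categorization"


def build_categorization_url(project_id, text, language="en"):
    """Return a GET URL for the categorization endpoint with encoded parameters."""
    params = {"language": language, "projectId": project_id, "text": text}
    # quote_via=quote encodes spaces as %20, matching the example call above
    # (the default, quote_plus, would encode them as "+").
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_categorization_url(
    "1E034541-5BC3-0001-C454-2ED019578460",
    "A gin and tonic is a highball cocktail made with gin and tonic water poured over ice.",
)
```

For long documents, sending the same parameters in the body of a POST request is usually the safer choice, since URLs have practical length limits.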

Example Response

{
    "categories": [
        {
            "prefLabel": "Alcoholic beverage",
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/f3000285-36b0-4ffe-af90-740c2dd8fff5",
            "score": 0.4,
            "categoryConceptResults": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/d40e2215-bc1b-4bae-8cf8-d10b1ad36d44",
                    "prefLabel": "Bitters",
                    "score": 5
                },
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/5d96c50c-ee48-4c40-bb4c-4b9a42d11de6",
                    "prefLabel": "Gin",
                    "score": 100
                }
            ]
        },
        {
            "prefLabel": "Distilled beverage",
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/403d1249-f37f-4f43-bebf-8dde9677d886",
            "score": 0.4,
            "categoryConceptResults": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/d40e2215-bc1b-4bae-8cf8-d10b1ad36d44",
                    "prefLabel": "Bitters",
                    "score": 5
                },
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/5d96c50c-ee48-4c40-bb4c-4b9a42d11de6",
                    "prefLabel": "Gin",
                    "score": 100
                }
            ]
        },
        {
            "prefLabel": "Fruit",
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/4f20d6bb-710d-4870-bde4-b6e835d7d13f",
            "score": 0.2,
            "categoryConceptResults": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/021c85c0-a60a-4414-8fd8-bd3383b794f1",
                    "prefLabel": "Lime",
                    "score": 23
                }
            ]
        },
        {
            "prefLabel": "Beverages",
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/c56ced28-5ecd-4436-b152-25edf326c07c",
            "score": 0.2,
            "categoryConceptResults": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/75b586dd-6bd9-4894-a258-90007061c029",
                    "prefLabel": "Drink",
                    "score": 6
                }
            ]
        },
        {
            "prefLabel": "Non-alcoholic beverage",
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/e86c1671-4a67-494b-ae5d-bcb750865acc",
            "score": 0.2,
            "categoryConceptResults": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/84c72e0e-7830-46ef-ac35-44cad56d27ec",
                    "prefLabel": "Water",
                    "score": 20
                }
            ]
        }
    ]
}
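A response like the one above can be reduced to a ranked summary with a few lines of code. This is a sketch under stated assumptions: `summarize_categories` is a hypothetical helper name, and the embedded JSON is an abbreviated copy of the example response (two categories, URIs omitted for brevity).

```python
import json

# Abbreviated copy of the example response above.
response_body = """
{
  "categories": [
    {"prefLabel": "Fruit", "score": 0.2,
     "categoryConceptResults": [{"prefLabel": "Lime", "score": 23}]},
    {"prefLabel": "Alcoholic beverage", "score": 0.4,
     "categoryConceptResults": [{"prefLabel": "Bitters", "score": 5},
                                {"prefLabel": "Gin", "score": 100}]}
  ]
}
"""


def summarize_categories(body):
    """Return (category label, score, concept labels) tuples, highest score first."""
    data = json.loads(body)
    # "categories" is not marked as required, so fall back to an empty list.
    categories = data.get("categories", [])
    ranked = sorted(categories, key=lambda c: c["score"], reverse=True)
    return [
        (
            c["prefLabel"],
            c["score"],
            [r["prefLabel"] for r in c.get("categoryConceptResults", [])],
        )
        for c in ranked
    ]


summary = summarize_categories(response_body)
```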

Web Service Method: Categorize a File

Description

[file] Categorizes a given file by first extracting concepts and then aggregating them by their associated top concepts.

The aggregated top concepts represent the categories. The more concepts belonging to a specific top concept are found in the file, the higher that top concept's score will be, with the scores summing up to 100%.

URL: /extractor/api/categorization

Content-Type

application/x-www-form-urlencoded

Request

Supported Methods

POST

GET

HTTP Parameter

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
| customClassFilters | Array of String | false | Custom class URI filters |
| disambiguation | boolean | false | Use disambiguation. If not supplied, defaults to false |
| displayText | boolean | false | Include text extracted from file in response. If not supplied, defaults to false |
| file | MultipartFile | true | File to be categorized (Word, Excel, PowerPoint, PDF, OpenDocument) |
| language | String | false | Language of text (en, de, es, fr, ...) |
| projectId | Array of String | true | Thesaurus projectIds |
| scoringAlgorithm | String | false | Scoring algorithm to use (simple or ppxBoost). If not supplied, defaults to simple |
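A request body carrying the `file` parameter can be sketched as shown below. Note the assumption: this page lists application/x-www-form-urlencoded as the content type, but a parameter typed MultipartFile is normally transmitted as multipart/form-data, so the sketch encodes the body that way. The helper name `encode_multipart`, the file name, and the placeholder bytes are hypothetical, and nothing is sent over the network.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes,
                     file_type="application/octet-stream"):
    """Encode form fields plus one file as a multipart/form-data body.

    Returns a (content_type_header, body_bytes) pair.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    # The closing delimiter ends the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = encode_multipart(
    {"projectId": "1E034541-5BC3-0001-C454-2ED019578460", "language": "en"},
    file_field="file",
    filename="cocktails.pdf",  # hypothetical file name
    file_bytes=b"%PDF-1.4 placeholder bytes",  # stand-in for real file content
    file_type="application/pdf",
)
```

The returned `content_type` value would go into the request's Content-Type header and `body` into a POST to /extractor/api/categorization.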

Response

This method returns execution results in JSON format.

Categorization response

| Attribute | Type | Comment |
| --- | --- | --- |
| categories | Array of Category | Categories found in text |
| text | String | Text as extracted from URL or file |
| title | String | Title as extracted from URL or file |

Categorization result

| Attribute | Type | Comment |
| --- | --- | --- |
| categoryConceptResults | Array of ConceptCategory | Categorized concepts |
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |

Categorized concept

| Attribute | Type | Comment |
| --- | --- | --- |
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |
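The three response objects described above (categorization response, categorization result, categorized concept) can be mirrored as typed structures on the client side. The following is a sketch, not an official client: the class names follow the type names used on this page, `parse_response` is a hypothetical helper, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptCategory:
    prefLabel: str
    score: float
    uri: str


@dataclass
class Category:
    prefLabel: str
    score: float
    uri: str
    categoryConceptResults: List[ConceptCategory] = field(default_factory=list)


@dataclass
class CategorizationResponse:
    categories: List[Category] = field(default_factory=list)
    text: Optional[str] = None
    title: Optional[str] = None


def parse_response(data):
    """Build a CategorizationResponse from an already-decoded JSON dict."""
    categories = [
        Category(
            prefLabel=c["prefLabel"],
            score=c["score"],
            uri=c["uri"],
            categoryConceptResults=[
                ConceptCategory(r["prefLabel"], r["score"], r["uri"])
                for r in c.get("categoryConceptResults", [])
            ],
        )
        for c in data.get("categories", [])
    ]
    return CategorizationResponse(categories, data.get("text"), data.get("title"))


# Invented sample data in the documented shape.
sample = {
    "categories": [{
        "prefLabel": "Fruit", "score": 0.2, "uri": "https://example.org/fruit",
        "categoryConceptResults": [
            {"prefLabel": "Lime", "score": 23, "uri": "https://example.org/lime"},
        ],
    }],
    "title": "Gin and tonic",
}
parsed = parse_response(sample)
```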

Web Service Method: Categorize a URL

Description

[url] Categorizes a given URL by first extracting concepts and then aggregating them by their associated top concepts.

The aggregated top concepts represent the categories. The more concepts belonging to a specific top concept are found in the document at the URL, the higher that top concept's score will be, with the scores summing up to 100%.

URL: /extractor/api/categorization

Request

Supported Methods

POST

GET

Content-Type

application/x-www-form-urlencoded

HTTP Parameter

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
| customClassFilters | Array of String | false | Custom class URI filters |
| disambiguation | boolean | false | Use disambiguation. If not supplied, defaults to false |
| displayText | boolean | false | Include text extracted from URL in response. If not supplied, defaults to false |
| language | String | false | Language of text (en, de, es, fr, ...) |
| projectId | Array of String | true | Thesaurus projectIds |
| scoringAlgorithm | String | false | Scoring algorithm to use (simple or ppxBoost). If not supplied, defaults to simple |
| url | String | true | URL of the document to be categorized |
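A POST request using these parameters can be prepared as sketched below. The helper name `build_url_categorization_request` is hypothetical, the host is taken from the example call elsewhere on this page, the target page URL is only sample input, and repeating `projectId` once per value is an assumption about how the array parameter is passed. The request is built but deliberately not sent, and any authentication the server requires is not shown.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://nextrelease.poolparty.biz/extractor/api/categorization"


def build_url_categorization_request(project_ids, page_url, language=None):
    """Prepare (but do not send) a form-encoded POST for URL categorization."""
    # Assumption: array parameters are sent by repeating the parameter name.
    params = [("projectId", pid) for pid in project_ids]
    params.append(("url", page_url))
    if language:
        params.append(("language", language))
    body = urlencode(params).encode("ascii")
    return Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_url_categorization_request(
    ["1E034541-5BC3-0001-C454-2ED019578460"],
    "https://en.wikipedia.org/wiki/Gin_and_tonic",  # sample input only
    language="en",
)
# urllib.request.urlopen(request) would perform the actual call.
```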

Response

This method returns execution results in JSON format.

Categorization response

| Attribute | Type | Comment |
| --- | --- | --- |
| categories | Array of Category | Categories found in text |
| text | String | Text as extracted from URL or file |
| title | String | Title as extracted from URL or file |

Categorization result

| Attribute | Type | Comment |
| --- | --- | --- |
| categoryConceptResults | Array of ConceptCategory | Categorized concepts |
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |

Categorized concept

| Attribute | Type | Comment |
| --- | --- | --- |
| prefLabel | String | Preferred label |
| score | double | Score |
| uri | String | URI |