The 'Extract' Call

This section describes how to run an 'extract' call, which illustrates one of the main functions of the PoolParty Extractor.

Note

Whenever you change your PoolParty Thesaurus project, you have to refresh the extraction model before running the extractor again.

Tip

You can also use tools such as curl or Postman to execute calls to the API.

The method used for this call is 'extract'. We will use the following parameters in our sample call:

  • 'text' to provide the text we wish to annotate.

  • 'projectId' to specify the ID of the PoolParty project we want to use for extraction.

  • 'language' to define the language used for extraction. If no language is specified, it is detected from the text and then used as the extraction language.

  • 'numberOfTerms' to define the number of free terms we want to get back (set here to '0' to focus on the concept annotations).

For our sample 'extract' call, we enter the following request in the address bar of a web browser:

Request

{{url}}/extractor/api/extract?text=A Spritz Veneziano also called just Spritz or just Veneziano, is an Italian wine-based cocktail, commonly served as an aperitif in northeast Italy. The drink originated in Venice while it was part of the Austrian Empire, and is based on the Austrian Spritzer, a combination of equal parts white wine and soda water.&projectId={{project}}&language=en&numberOfTerms=0
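The same request can also be assembled in a script rather than typed into the address bar. The following is a minimal sketch in Python that only builds the URL with properly encoded query parameters. The helper name build_extract_url, the server address https://poolparty.example.com and the project ID YOUR-PROJECT-ID are placeholders assumed for illustration; sending the request, including any authentication your server may require, is left out.

```python
from urllib.parse import urlencode


def build_extract_url(base_url: str, project_id: str, text: str,
                      language: str = "en", number_of_terms: int = 0) -> str:
    """Return the full URL for an 'extract' GET request (hypothetical helper)."""
    # urlencode escapes spaces and special characters in the text for us.
    query = urlencode({
        "text": text,
        "projectId": project_id,
        "language": language,
        "numberOfTerms": number_of_terms,
    })
    return f"{base_url.rstrip('/')}/extractor/api/extract?{query}"


# Placeholder values: substitute your own server URL and project ID.
url = build_extract_url(
    "https://poolparty.example.com",
    "YOUR-PROJECT-ID",
    "A Spritz Veneziano is an Italian wine-based cocktail.",
)
print(url)
```

Encoding the parameters in code avoids relying on the browser to escape the spaces and punctuation in the text.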

The following results will be returned:

Results

{
    "concepts": [
        {
            "id": "1E034541-9963-0001-EE48-B5D068201D43:https://nextrelease-cons.semantic-web.at/cocktails/2c682ed8-e2ba-473e-8cb7-979598080e18@en",
            "project": "1E034541-9963-0001-EE48-B5D068201D43",
            "score": 100,
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/2c682ed8-e2ba-473e-8cb7-979598080e18",
            "language": "en",
            "prefLabel": "Spritz Veneziano",
            "altLabels": [
                "Spritz"
            ],
            "conceptSchemes": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/8d052dfc-44bf-4985-8ce3-4564570a161b",
                    "title": "Cocktails"
                }
            ],
            "frequencyInDocument": 2
        },
        {
            "id": "1E034541-9963-0001-EE48-B5D068201D43:https://nextrelease-cons.semantic-web.at/cocktails/b523727e-7f49-4e1b-9c09-55002ee3a81e@en",
            "project": "1E034541-9963-0001-EE48-B5D068201D43",
            "score": 59,
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/b523727e-7f49-4e1b-9c09-55002ee3a81e",
            "language": "en",
            "prefLabel": "Apéritif and digestif",
            "altLabels": [
                "Digestivo",
                "Aperitif and digestif",
                "Aperitif",
                "Aperitivo",
                "Apéritif",
                "Digestif",
                "Apero",
                "Apertif",
                "Apéro",
                "After-dinner drink"
            ],
            "conceptSchemes": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/8d052dfc-44bf-4985-8ce3-4564570a161b",
                    "title": "Cocktails"
                }
            ],
            "frequencyInDocument": 2
        },
        {
            "id": "1E034541-9963-0001-EE48-B5D068201D43:https://nextrelease-cons.semantic-web.at/cocktails/75b586dd-6bd9-4894-a258-90007061c029@en",
            "project": "1E034541-9963-0001-EE48-B5D068201D43",
            "score": 24,
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/75b586dd-6bd9-4894-a258-90007061c029",
            "language": "en",
            "prefLabel": "Drink",
            "conceptSchemes": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/591cf89a-57af-49b8-9042-3fc77408c93e",
                    "title": "Beverages"
                }
            ],
            "frequencyInDocument": 1
        },
        {
            "id": "1E034541-9963-0001-EE48-B5D068201D43:https://nextrelease-cons.semantic-web.at/cocktails/df4ff163-a7c8-4c40-b556-f9390fd97972@en",
            "project": "1E034541-9963-0001-EE48-B5D068201D43",
            "score": 6,
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/df4ff163-a7c8-4c40-b556-f9390fd97972",
            "language": "en",
            "prefLabel": "White wine",
            "altLabels": [
                "White wines"
            ],
            "conceptSchemes": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/591cf89a-57af-49b8-9042-3fc77408c93e",
                    "title": "Beverages"
                }
            ],
            "frequencyInDocument": 1
        },
        {
            "id": "1E034541-9963-0001-EE48-B5D068201D43:https://nextrelease-cons.semantic-web.at/cocktails/7448dbed-7603-41f7-a316-99f754c9ae45@en",
            "project": "1E034541-9963-0001-EE48-B5D068201D43",
            "score": 5,
            "uri": "https://nextrelease-cons.semantic-web.at/cocktails/7448dbed-7603-41f7-a316-99f754c9ae45",
            "language": "en",
            "prefLabel": "Soda water",
            "altLabels": [
                "Sparkling water",
                "Two Cents Plain",
                "Club Soda",
                "Unflavored Soda",
                "Seltzer water",
                "Sparkling Water",
                "L'eau avec gaz",
                "Carbonated water",
                "Eau avec gaz",
                "Fizzy water",
                "Soda-water",
                "Carbonate water",
                "Club soda",
                "Carbonated waters",
                "Bubbly water"
            ],
            "conceptSchemes": [
                {
                    "uri": "https://nextrelease-cons.semantic-web.at/cocktails/591cf89a-57af-49b8-9042-3fc77408c93e",
                    "title": "Beverages"
                }
            ],
            "frequencyInDocument": 1
        }
    ]
}

The response lists the thesaurus concepts that were found in the annotated text.

In our call against the sample project, the detected concepts are 'Spritz Veneziano', 'Apéritif and digestif', 'Drink', 'White wine', and 'Soda water'.

The following details are shown for each concept:

  • the PoolParty project it originates from (relevant when more than one project is used for annotation at the same time),

  • the score for the annotation of that concept in the document or text fragment,

  • its URI,

  • the language used for the detection (since a concept can have labels in multiple languages),

  • all SKOS labels of the concept (preferred, alternative and hidden) in the detected language,

  • the concept schemes the concept belongs to,

  • the frequency of its occurrence in the annotated text.
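The details listed above can be read from the JSON response with a few lines of code. The following Python sketch parses a trimmed copy of the sample results (two of the five concepts, with fewer fields) and prints one summary line per concept. The summarize helper is hypothetical, and the sketch assumes that 'altLabels' may be absent, as it is for 'Drink' in the sample results.

```python
import json

# Trimmed copy of the sample results: two concepts, reduced to a few fields.
response_text = """
{
  "concepts": [
    {"score": 100, "language": "en", "prefLabel": "Spritz Veneziano",
     "altLabels": ["Spritz"],
     "conceptSchemes": [{"title": "Cocktails"}],
     "frequencyInDocument": 2},
    {"score": 24, "language": "en", "prefLabel": "Drink",
     "conceptSchemes": [{"title": "Beverages"}],
     "frequencyInDocument": 1}
  ]
}
"""


def summarize(concept: dict) -> str:
    """Build a one-line summary of a concept (hypothetical helper)."""
    # 'altLabels' is missing when a concept has no alternative labels.
    alt_labels = concept.get("altLabels", [])
    schemes = [scheme["title"] for scheme in concept.get("conceptSchemes", [])]
    return (f"{concept['prefLabel']} (score {concept['score']}, "
            f"{concept['frequencyInDocument']}x, schemes: {', '.join(schemes)}, "
            f"altLabels: {len(alt_labels)})")


concepts = json.loads(response_text)["concepts"]
summaries = [summarize(concept) for concept in concepts]
for line in summaries:
    print(line)
```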

The score ranges from 1 to 100, where a higher score means that the concept is more relevant to the processed text. Two factors influence the score: frequency, where more occurrences lead to a higher score, and position, where occurrences closer to the beginning of the text lead to a higher score. Concepts that appear more often and closer to the beginning of the annotated text therefore receive higher scores.
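Because a higher score indicates higher relevance, a client might keep only the concepts above some cutoff. The following Python sketch does this for the labels and scores from the sample results. The relevant_concepts helper and the threshold of 20 are assumptions chosen for illustration, not values recommended by PoolParty.

```python
def relevant_concepts(concepts, min_score=20):
    """Keep concepts scoring at least min_score, highest score first."""
    kept = [concept for concept in concepts if concept["score"] >= min_score]
    return sorted(kept, key=lambda concept: concept["score"], reverse=True)


# Labels and scores taken from the sample results, in arbitrary order.
sample = [
    {"prefLabel": "Soda water", "score": 5},
    {"prefLabel": "Spritz Veneziano", "score": 100},
    {"prefLabel": "White wine", "score": 6},
    {"prefLabel": "Apéritif and digestif", "score": 59},
    {"prefLabel": "Drink", "score": 24},
]

labels = [concept["prefLabel"] for concept in relevant_concepts(sample)]
print(labels)
```

With this cutoff, 'White wine' and 'Soda water' are dropped and the remaining three concepts are listed in descending order of score.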