Named Entity Recognition

The Extractor can also detect certain types of named entities on its own.

Currently, it recognises people, organisations and locations. The parameter that requests an entity type consists of two parts:

  • The base name is 'nerParameters'. It is followed by a 'method' part and a 'type' part.

  • The method here is 'MAXIMUM_ENTROPY' and the available types are 'person', 'organization' and 'location'.

To tell the system which parts belong together, each pair carries an index, for example 'nerParameters[0].method=MAXIMUM_ENTROPY' and 'nerParameters[0].type=person'. The indices are integers that start at 0 and increase consecutively. Because parameters in the method invocation have to be URL-encoded, the brackets of the index are written as '%5B' and '%5D'.
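The indexing and encoding rules described above can be sketched in a few lines of Python. This is only an illustration: the helper name build_ner_params is hypothetical and not part of the Extractor API. It relies on the standard library's urlencode to percent-encode the brackets.

```python
from urllib.parse import urlencode


def build_ner_params(entity_types, method="MAXIMUM_ENTROPY"):
    """Return (name, value) pairs with one consecutive index per entity type.

    Hypothetical helper for illustration; not part of the Extractor API.
    """
    pairs = []
    for index, entity_type in enumerate(entity_types):  # indices start at 0
        pairs.append((f"nerParameters[{index}].method", method))
        pairs.append((f"nerParameters[{index}].type", entity_type))
    return pairs


# urlencode percent-encodes "[" and "]" as %5B and %5D.
query = urlencode(build_ner_params(["person", "organization", "location"]))
print(query)
```

The resulting string can be appended to the other query parameters of the extract call, such as 'text', 'projectId' and 'language'.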

The following call shows how to specify all three types of entities:

Request

{{url}}/extractor/api/extract?text=Chris Stemman, the executive director of the British Coffee Association, says most of those techniques from decaffeination’s earliest days are still being used today. But the process isn’t as straightforward as you’d expect. “It isn’t done by the coffee companies themselves,” says Stemann. “There are specialist decaffeination companies that carry it out.” Many of these companies are based in Europe, Canada, the US and South America.&projectId={{project}}&language=en&numberOfTerms=0&nerParameters%5B0%5D.method=MAXIMUM_ENTROPY&nerParameters%5B0%5D.type=person&nerParameters%5B1%5D.method=MAXIMUM_ENTROPY&nerParameters%5B1%5D.type=organization&nerParameters%5B2%5D.method=MAXIMUM_ENTROPY&nerParameters%5B2%5D.type=location&numberOfConcepts=0

The result lists the detected persons, organisations and locations:

{
    "namedEntities": [
        {
            "textValue": "Chris Stemman",
            "type": "person",
            "frequency": 1,
            "score": 100,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 0,
                    "endIndex": 12
                }
            ]
        },
        {
            "textValue": "British Coffee Association",
            "type": "organization",
            "frequency": 1,
            "score": 85,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 45,
                    "endIndex": 70
                }
            ]
        },
        {
            "textValue": "Europe",
            "type": "location",
            "frequency": 1,
            "score": 12,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 395,
                    "endIndex": 400
                }
            ]
        },
        {
            "textValue": "Canada",
            "type": "location",
            "frequency": 1,
            "score": 11,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 403,
                    "endIndex": 408
                }
            ]
        },
        {
            "textValue": "US",
            "type": "organization",
            "frequency": 1,
            "score": 9,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 415,
                    "endIndex": 416
                }
            ]
        },
        {
            "textValue": "South America",
            "type": "location",
            "frequency": 1,
            "score": 8,
            "method": "MAXIMUM_ENTROPY",
            "positions": [
                {
                    "beginningIndex": 422,
                    "endIndex": 434
                }
            ]
        }
    ]
}
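
A response of this shape can be post-processed on the client side, for example to group the detected strings by entity type. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function name group_entities is hypothetical, the sample text is a shortened version of the request text, and 'endIndex' is treated as inclusive because that matches the positions in the sample result above (for instance 0 to 12 for "Chris Stemman").

```python
from collections import defaultdict


def group_entities(text, result):
    """Group the matched substrings of `text` by entity type.

    Assumes 'endIndex' is inclusive, as the sample result suggests.
    """
    grouped = defaultdict(list)
    for entity in result.get("namedEntities", []):
        for position in entity["positions"]:
            start = position["beginningIndex"]
            end = position["endIndex"]
            grouped[entity["type"]].append(text[start:end + 1])
    return dict(grouped)


# Shortened sample: the first clause of the request text and its two entities.
sample_text = "Chris Stemman, the executive director of the British Coffee Association"
sample_result = {
    "namedEntities": [
        {
            "textValue": "Chris Stemman",
            "type": "person",
            "positions": [{"beginningIndex": 0, "endIndex": 12}],
        },
        {
            "textValue": "British Coffee Association",
            "type": "organization",
            "positions": [{"beginningIndex": 45, "endIndex": 70}],
        },
    ]
}

print(group_entities(sample_text, sample_result))
```

Slicing by position rather than reading 'textValue' directly shows where each entity occurs in the original text, which is useful for highlighting.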