Web Service Method: Annotate from URL
Description |
---|
Returns the document at the given URL, annotated with extracted concepts and extracted terms, in RDF/XML representation. |
URL: /extractor/api/annotate
Supported Methods |
---|
POST |
GET |
application/x-www-form-urlencoded
Parameter | Type | Required | Description |
---|---|---|---|
categorizationWithPpxBoost | boolean | false | Use Extractor boosting, default = false |
categorize | boolean | false | Categorization extraction, default = false |
conceptMinimumScore | Double | false | Minimum required score of concepts, default = 0 |
conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
corpusScoring | Array of String | false | Corpus term scoring. Enabled if corpusIds (UUID) are provided. |
customAttributeFilters | Array of CustomProperty | false | Custom attribute (property uri and string value) filters |
customClassFilters | Array of String | false | Custom class URI filters |
disambiguate | boolean | false | Use thesaurus based disambiguation, default = false |
displayText | boolean | false | Include text extracted from url in response, default = false |
documentClassifierIds | Array of String | false | Enable document classification by giving the document classifier IDs as input |
documentId | String | false | Internal ID of the document, taken from documentUri |
documentUri | String | true | URI of annotated document, used as ID |
extractorVersion | String | false | Version of PPX Extractor used |
filterNestedConcepts | boolean | false | Remove concept matches that are contained within other matches, default = true |
findPersonNames | boolean | false | Deprecated (use nerParameters) - extracts person names from the given text |
language | String | false | Extraction language (en, de, es, fr, ...) |
lemmatization | boolean | false | Use lemmatization, default = true |
locationExtraction | boolean | false | Deprecated (use nerParameters) - extracts locations from the given text |
nerParameters | Array of NERConfig | false | Array of models that are used for Named Entity Recognition |
numberOfConcepts | Integer | false | Number of concepts to retrieve, default = 25 |
numberOfTerms | Integer | false | Number of terms to retrieve, default = 25 |
phraseLength | Integer | false | Phrase length, default = 4 |
projectId | Array of String | false | Thesaurus projectIds |
properties | Array of String | false | Array of custom class attributes and relations that will be fetched by providing their property URIs as input. Furthermore it supports http://www.w3.org/1999/02/22-rdf-syntax-ns#type. Set to all to fetch all properties. |
regexFilename | String | false | File name for regex patterns |
resultFilterSparql | String | false | Specify an optional SPARQL query for filtering the RDF result |
sentimentAnalysis | boolean | false | Sentiment analysis, default = false |
shadowConceptCorpusId | Array of String | false | Shadow concepts calculation. Enabled if corpusIds (UUID) are provided. |
showMatchingDetails | boolean | false | Shows which concept labels were found inside the text, default = false |
showMatchingPosition | boolean | false | Shows the position of the matched text. Only shown if showMatchingDetails = true, default = false |
tfidfScoring | boolean | false | Use TFIDF scoring, default = false |
title | String | false | Title of the document |
url | String | true | URL of a web document to be annotated |
useRelatedConcepts | boolean | false | Retrieve related concepts, default = false |
useTransitiveBroaderConcepts | boolean | false | Retrieve transitive broader concepts, default = false |
useTransitiveBroaderTopConcepts | boolean | false | Retrieve transitive broader top concepts, default = false |
useTypes | boolean | false | Retrieve custom types for concepts, default = false |
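
The parameters above are sent as form fields. The following is a minimal sketch of how such a request body could be assembled with the Python standard library. The document URL and project ID are hypothetical placeholders, and two details are assumptions that this page does not confirm: array parameters are written as repeated keys, and booleans as lowercase `true`/`false`.

```python
from urllib.parse import urlencode


def build_annotate_form(url, document_uri, **options):
    """Return a form-urlencoded body for /extractor/api/annotate.

    `url` and `documentUri` are the two parameters marked as required;
    everything else is passed through from `options`.
    """
    params = {"url": url, "documentUri": document_uri}
    for name, value in options.items():
        # Assumption: booleans are serialized as lowercase "true"/"false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[name] = value
    # Assumption: doseq=True repeats the key once per list element,
    # e.g. projectId=a&projectId=b for "Array of String" parameters.
    return urlencode(params, doseq=True)


body = build_annotate_form(
    "https://example.org/article.html",   # hypothetical web document
    "https://example.org/article.html",   # used as the document ID
    projectId=["my-project-id"],          # hypothetical thesaurus project ID
    language="en",
    numberOfConcepts=10,
    showMatchingDetails=True,
)
print(body)
```

The resulting string can be used as the body of a POST request or appended as the query string of a GET request.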
Attribute | Type | Required | Comment |
---|---|---|---|
property | String | false | Property |
value | String | false | Value |
{ "property" : "https://semantic-web.com/api/property#6376", "value" : "some value" }
Named Entity Recognition Configuration
Attribute | Type | Required | Comment |
---|---|---|---|
method | Method | false | Method used for Named Entity Extraction: RULE_BASED or MAXIMUM_ENTROPY (default: MAXIMUM_ENTROPY) |
type | String | false | Type of Named Entity Model. Pre-defined models for MAXIMUM_ENTROPY: person, organization, location |
{ "method" : "RULE_BASED", "type" : "https://semantic-web.com/api/type#20383" }
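
As a sketch of how a list of such configuration objects might be prepared, the snippet below builds one entry for each of the pre-defined MAXIMUM_ENTROPY models named in the table. The helper `ner_config` is hypothetical and only illustrates the object shape; how the array is encoded inside the form request is not specified on this page.

```python
import json

# Methods and pre-defined model types as listed in the table above.
ALLOWED_METHODS = ("RULE_BASED", "MAXIMUM_ENTROPY")
PREDEFINED_TYPES = ("person", "organization", "location")


def ner_config(model_type, method="MAXIMUM_ENTROPY"):
    """Build one NERConfig-shaped dict (hypothetical helper)."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown NER method: {method}")
    return {"method": method, "type": model_type}


# One configuration per pre-defined maximum-entropy model.
ner_parameters = [ner_config(t) for t in PREDEFINED_TYPES]
print(json.dumps(ner_parameters, indent=2))
```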
An ObjectStreamField object.
Attribute | Type | Required | Comment |
---|---|---|---|
field | Field | false | |
name | String | false | |
offset | int | false | |
signature | String | false | |
type | Class | false | |
unshared | boolean | false | |
{ "field" : { "genericInfo" : { "factory" : null, "tree" : null, "genericType" : null }, "declaredAnnotations" : { }, "overrideFieldAccessor" : { }, "signature" : "some signature", "annotations" : [ 23 ], "securityCheckCache" : { }, "slot" : 4350, "fieldAccessor" : { }, "modifiers" : 27639, "type" : { "annotationData" : null, "genericInfo" : null, "ENUM" : 2479, "enumConstantDirectory" : { }, "classRedefinedCount" : 17528, "initted" : false, "cachedConstructor" : null, "useCaches" : true, "SYNTHETIC" : 28542, "annotationType" : null, "newInstanceCallerCache" : null, "reflectionData" : null, "classValueMap" : { }, "serialPersistentFields" : [ null, null, null ], "serialVersionUID" : 19423, "ANNOTATION" : 2206, "enumConstants" : [ null, null ], "name" : "some name", "reflectionFactory" : null, "allPermDomain" : null }, "ACCESS_PERMISSION" : { "serialVersionUID" : 23155, "name" : "some name" }, "root" : { "genericInfo" : null, "declaredAnnotations" : { }, "overrideFieldAccessor" : null, "signature" : "some signature", "annotations" : [ 87, 51 ], "securityCheckCache" : null, "slot" : 18207, "fieldAccessor" : null, "modifiers" : 24703, "type" : null, "ACCESS_PERMISSION" : null, "root" : null, "name" : "some name", "override" : true, "reflectionFactory" : null, "clazz" : null }, "name" : "some name", "override" : true, "reflectionFactory" : { "inflationThreshold" : 28477, "initted" : false, "soleInstance" : null, "reflectionFactoryAccessPerm" : null, "langReflectAccess" : null, "noInflation" : false }, "clazz" : { "annotationData" : null, "genericInfo" : null, "ENUM" : 30581, "enumConstantDirectory" : { }, "classRedefinedCount" : 12111, "initted" : false, "cachedConstructor" : null, "useCaches" : true, "SYNTHETIC" : 27304, "annotationType" : null, "newInstanceCallerCache" : null, "reflectionData" : null, "classValueMap" : { }, "serialPersistentFields" : [ null, null ], "serialVersionUID" : 24089, "ANNOTATION" : 3326, "enumConstants" : [ null, null ], "name" : "some 
name", "reflectionFactory" : null, "allPermDomain" : null } }, "offset" : 11522, "signature" : "some signature", "unshared" : false, "name" : "some name", "type" : { "annotationData" : { "declaredAnnotations" : { }, "redefinedCount" : 11463, "annotations" : { } }, "genericInfo" : { "factory" : null, "superclass" : null, "tree" : null, "typeParams" : [ null, null, null ], "NONE" : null, "superInterfaces" : [ null, null ] }, "ENUM" : 2206, "enumConstantDirectory" : { }, "classRedefinedCount" : 8783, "initted" : false, "cachedConstructor" : { "genericInfo" : null, "declaredAnnotations" : { }, "hasRealParameterData" : false, "parameterTypes" : [ null, null ], "signature" : "some signature", "annotations" : [ 30 ], "securityCheckCache" : null, "constructorAccessor" : null, "slot" : 25006, "modifiers" : 3408, "ACCESS_PERMISSION" : null, "exceptionTypes" : [ null ], "root" : null, "override" : false, "parameterAnnotations" : [ 71, 121 ], "reflectionFactory" : null, "clazz" : null, "parameters" : [ null ] }, "useCaches" : true, "SYNTHETIC" : 7276, "annotationType" : { "inherited" : true, "members" : { }, "memberDefaults" : { }, "$assertionsDisabled" : false, "memberTypes" : { }, "retention" : "RUNTIME" }, "newInstanceCallerCache" : { "annotationData" : null, "genericInfo" : null, "ENUM" : 30429, "enumConstantDirectory" : { }, "classRedefinedCount" : 13473, "initted" : true, "cachedConstructor" : null, "useCaches" : true, "SYNTHETIC" : 5278, "annotationType" : null, "newInstanceCallerCache" : null, "reflectionData" : null, "classValueMap" : { }, "serialPersistentFields" : [ null, null, null ], "serialVersionUID" : 18766, "ANNOTATION" : 3482, "enumConstants" : [ null ], "name" : "some name", "reflectionFactory" : null, "allPermDomain" : null }, "reflectionData" : { "next" : null, "discovered" : null, "referent" : null, "pending" : null, "lock" : null, "clock" : 1663, "queue" : null, "timestamp" : 29342 }, "classValueMap" : { }, "serialPersistentFields" : [ { "field" : null, 
"offset" : 20136, "signature" : "some signature", "unshared" : true, "name" : "some name", "type" : null } ], "serialVersionUID" : 7837, "ANNOTATION" : 12014, "enumConstants" : [ { }, { } ], "name" : "some name", "reflectionFactory" : { "inflationThreshold" : 192, "initted" : false, "soleInstance" : null, "reflectionFactoryAccessPerm" : null, "langReflectAccess" : null, "noInflation" : false }, "allPermDomain" : { "staticPermissions" : false, "debug" : null, "hasAllPerm" : true, "codesource" : null, "permissions" : null, "classloader" : null, "principals" : [ null, null ], "key" : null } } }
text/plain
Status: 200 - OK
By default, this method returns its results in the format application/rdf+xml.
You can change the response to other RDF formats, as also defined here: http://docs.rdf4j.org/javadoc/2.3/org/eclipse/rdf4j/rio/RDFFormat.html
application/rdf+xml
application/n-triples
application/x-turtle
application/trix
application/trig
To select the response format, add an Accept header to your call and set it to the media type of the format you want returned.
Any HTTP REST client, such as Postman, can send this header.
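
The same call can be sketched outside of a GUI client. The example below uses Python's standard library to construct, without sending, a POST request that asks for Turtle output through the Accept header. The server address and document URL are hypothetical, and authentication is left out because this page does not cover it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://poolparty.example.com"  # hypothetical server address


def build_annotate_request(form_fields, accept="application/rdf+xml"):
    """Build a POST request for /extractor/api/annotate with an Accept header."""
    data = urlencode(form_fields, doseq=True).encode("utf-8")
    return Request(
        BASE_URL + "/extractor/api/annotate",
        data=data,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # The Accept header selects the RDF serialization of the response.
            "Accept": accept,
        },
    )


request = build_annotate_request(
    {
        "url": "https://example.org/article.html",          # hypothetical document
        "documentUri": "https://example.org/article.html",
    },
    accept="application/x-turtle",
)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing `request` to `urllib.request.urlopen` would send it; swapping the `accept` argument for any of the media types listed above changes the format requested.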