Web Service Method: Extract Metadata from Inside Zip File Asynchronously

Description

[file] Asynchronously extracts meaningful metadata, such as concepts and terms, from the documents packed inside an uploaded archive file (*.zip), and returns it as a list of documents.

URL: /extractor/api/extract/zip/async

Request

Supported Methods

POST

Content-Type

multipart/form-data

HTTP Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| categorizationWithPpxBoost | boolean | false | Use Extractor boosting, default = false |
| categorize | boolean | false | Categorization extraction, default = false |
| conceptSchemeFilters | Array of String | false | Concept scheme URI filters |
| corpusScoring | Array of String | false | Corpus term scoring. Enabled if corpusIds (UUID) are provided |
| customAttributeFilters | Array of CustomProperty | false | Custom attribute (property URI and string value) filters |
| customClassFilters | Array of String | false | Custom class URI filters |
| disambiguate | boolean | false | Use thesaurus-based disambiguation, default = false |
| displayText | boolean | false | Include text extracted from URL in response, default = false |
| documentClassifierIds | Array of String | false | Enable document classification by giving the document classifier IDs as input |
| documentId | String | false | Internal ID of the document |
| file | MultipartFile | true | File to be extracted (Word, Excel, PowerPoint, PDF, open documents). MIME type of file must be 'multipart/form-data' |
| filterNestedConcepts | boolean | false | Remove concept matches that are contained within other matches, default = false |
| findPersonNames | boolean | false | Person name extraction, default = false |
| language | String | false | Extraction language (en, de, es, fr, ...) |
| lemmatization | boolean | false | Use lemmatization, default = true |
| locationExtraction | boolean | false | Location extraction, default = false |
| metadata | String | false | Metadata of the document (concatenated fields with delimiter: '.') |
| numberOfConcepts | Integer | false | Number of concepts to retrieve, default = 25 |
| numberOfTerms | Integer | false | Number of terms to retrieve, default = 25 |
| projectId | Array of String | false | Thesaurus project IDs |
| properties | Array of String | false | Array of custom class attributes and relations that will be fetched by providing their property URIs as input. It also supports http://www.w3.org/1999/02/22-rdf-syntax-ns#type. Set to all to fetch all properties. |
| regexFilename | String | false | File name for regex patterns |
| sentimentAnalysis | boolean | false | Sentiment analysis, default = false |
| shadowConceptCorpusId | Array of String | false | Shadow concepts calculation. Enabled if corpusIds (UUID) are provided |
| showMatchingDetails | boolean | false | Shows which concept labels were found inside the text, default = false |
| showMatchingPosition | boolean | false | Shows the position of the matched text. Only shown if showMatchingDetails = true, default = false |
| tfidfScoring | boolean | false | Use TFIDF scoring |
| useRelatedConcepts | boolean | false | Retrieve related concepts, default = false |
| useTransitiveBroaderConcepts | boolean | false | Retrieve transitive broader concepts, default = false |
| useTransitiveBroaderTopConcepts | boolean | false | Retrieve transitive broader top concepts, default = false |
| useTypes | boolean | false | Retrieve custom types for concepts, default = false |
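As an illustration of how the parameters above combine with the required file upload, the following is a minimal sketch that builds (but does not send) a multipart/form-data POST request using only the Python standard library. The host name, the project ID, the archive file name and the helper function names are assumptions made for this example and are not part of the API description; any authentication your server requires is also left out.

```python
import io
import urllib.request
import uuid
import zipfile


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode form fields plus one file part as multipart/form-data.

    `fields` is a list of (name, value) pairs. Returns the encoded body
    and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields:
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    # The file part carries the zip archive under the documented name "file".
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: application/zip\r\n\r\n")
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def build_extract_request(base_url, zip_bytes, project_id, language="en"):
    """Build a POST request for /extractor/api/extract/zip/async."""
    fields = [
        ("projectId", project_id),
        ("language", language),
        ("numberOfConcepts", "10"),
        ("showMatchingDetails", "true"),
    ]
    body, content_type = build_multipart(fields, "file", "documents.zip", zip_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/extractor/api/extract/zip/async",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def make_sample_zip():
    """Create a tiny in-memory zip archive to stand in for a real upload."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("note.txt", "Sample text for extraction.")
    return buf.getvalue()


request = build_extract_request(
    "https://poolparty.example.com",         # hypothetical server
    make_sample_zip(),
    "00000000-0000-0000-0000-000000000000",  # placeholder project ID
)
```

Passing the resulting object to urllib.request.urlopen would send it. How array-typed parameters such as conceptSchemeFilters should be encoded is not stated here, so the sketch only uses single-valued fields.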

Response

This method returns execution results in JSON format.

Return Values

TaskSubmitResponse

Common base response defining the minimum result structure and semantics.

| Attribute | Type | Comment |
|---|---|---|
| message | String | Short descriptive message of the operation result, or an error description |
| result | Object | The actual response content body, defined by the resultType |
| resultType | String | MIME type of the result if successful, or Exception type if an error occurred |
| status | int | HTTP status code of the requested operation |
| success | boolean | true if the operation was successful (i.e. returning a status of 2xx) |
| taskId | String | |
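The response attributes listed above can be read into a small typed structure on the client side. The sketch below assumes the JSON body uses exactly these attribute names; the dataclass, the parser function and the sample payload are hypothetical and only illustrate the shape, not actual server output.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TaskSubmitResponse:
    """Client-side mirror of the documented response attributes."""
    message: Optional[str]
    result: Any
    result_type: Optional[str]
    status: Optional[int]
    success: bool
    task_id: Optional[str]


def parse_task_submit_response(raw: str) -> TaskSubmitResponse:
    """Parse a JSON response body, tolerating missing attributes."""
    data = json.loads(raw)
    return TaskSubmitResponse(
        message=data.get("message"),
        result=data.get("result"),
        result_type=data.get("resultType"),
        status=data.get("status"),
        success=bool(data.get("success", False)),
        task_id=data.get("taskId"),
    )


# Hypothetical payload, invented for this example only.
sample = json.dumps({
    "message": "Task submitted",
    "result": None,
    "resultType": "application/json",
    "status": 200,
    "success": True,
    "taskId": "example-task-id",
})

response = parse_task_submit_response(sample)
if response.success:
    print("Submitted task:", response.task_id)
else:
    print("Submission failed:", response.message)
```

Checking success before using taskId keeps error responses, where resultType carries an Exception type, from being treated as submitted tasks.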