Similarity Services
The PoolParty Similarity Service provides an API for querying the engine for documents that are similar to a given document, identified by its ID. The response can be returned in different formats.
Similarity Request
Request Parameters
Parameter  | Definition  | 
|---|---|
id *  | The unique identifier of the document to retrieve similar documents for.  | 
fields *  | A list of fields used for calculating the document similarity.  | 
locale  | The locale of the client, so labels can be resolved to the correct language. If not set, the service will try to retrieve the locale from the   | 
start  | The index (offset) of the search results returned. (Default: 0)  | 
count  | The number of results to return, starting at the offset given by the start parameter.  | 
format  | We recommend to use  Other acceptable values are: 
  | 
view  | Name of a JSP view that is available in the  This allows you to format the result in any way you want. Overrides the format of standard serialization!  | 
customAttributes  | A list of field names that are defined in the fieldConfig. By setting this parameter, you receive the content of fields that are not defined in the docFieldConfig.  | 
* mandatory parameters
GET
When performing a search query via HTTP GET, the parameters have to be included in the URL and must be URL-encoded!
Note
Since there can be multiple fields in a similarity search request, they have to be marked as an array by appending square brackets and the array index!
A simple search request could look like this:
http://<host>:<port>/search/api/similar?count=10&id=100&fields[0]=title&fields[1]=description&format=json
This results in a query with count=10, document id=100, and the fields title and description used for the similarity calculation. The result is rendered in JSON format.
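The indexed field notation and the URL encoding can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function name `build_similarity_url` and the host `http://localhost:8080` are placeholders chosen for the example, and only the parameters shown in the request above are used.

```python
from urllib.parse import urlencode


def build_similarity_url(base_url, doc_id, fields, count=10, fmt="json"):
    """Build a URL-encoded GET URL for a similarity request (hypothetical helper)."""
    params = [("count", count), ("id", doc_id)]
    # Each field is sent as fields[0], fields[1], ... to mark it as an array entry.
    params += [(f"fields[{i}]", name) for i, name in enumerate(fields)]
    params.append(("format", fmt))
    # urlencode percent-encodes keys and values, including the square brackets.
    return f"{base_url}/search/api/similar?{urlencode(params)}"


# Placeholder host and port; replace with your own server.
url = build_similarity_url("http://localhost:8080", "100", ["title", "description"])
print(url)
```

Note that the square brackets appear as `%5B` and `%5D` in the generated URL, which is the URL-encoded form of `fields[0]` and `fields[1]`.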
POST
Another way of performing a search query is via HTTP POST. With POST, the request has to be serialized in JSON format and included in the body of the request.
Also, the Content-Type header of the request has to be set to application/json so that the server can process the request correctly.
Note
XML is currently not supported as request format, but will follow in a future version.
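A POST request along these lines could be prepared in Python as shown below. This is a sketch under assumptions: the page does not show the POST endpoint or the exact JSON body, so the example assumes the same `/search/api/similar` path as the GET example and a body whose keys mirror the request parameters. The function name `build_similarity_post` and the host are placeholders.

```python
import json
from urllib.request import Request


def build_similarity_post(base_url, doc_id, fields, count=10, fmt="json"):
    """Prepare (but do not send) a JSON POST request (hypothetical helper)."""
    # Assumption: the JSON keys match the documented request parameter names.
    body = {"id": doc_id, "fields": fields, "count": count, "format": fmt}
    return Request(
        f"{base_url}/search/api/similar",  # assumed to match the GET endpoint
        data=json.dumps(body).encode("utf-8"),
        # The Content-Type header tells the server that the body is JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and port; replace with your own server.
request = build_similarity_post("http://localhost:8080", "100", ["title", "description"])
```

The prepared request can then be sent with `urllib.request.urlopen(request)`. In JSON, the fields are a plain array, so the indexed `fields[0]` notation used for GET is not needed here.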
Similarity Response
Response Elements
Element  | Description  | 
|---|---|
request  | The entire similarity request responsible for this response is echoed for convenience.  | 
results  | A list of documents that have been found. Each document has the standard fields: 
  | 
success  | 
  | 
message  | If  If the request was successful, usually the query time is included in the message.  | 
total  | The total number of documents found in the index. Note that this is different from the number of documents actually returned in the results list.  | 
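Because total can be larger than the number of documents returned, a client that wants all results has to page through them with the start and count request parameters. The following sketch computes the start offsets for such a sequence of requests. The helper name `page_offsets` is hypothetical, and the sketch assumes that total stays stable between requests.

```python
def page_offsets(total, count):
    """Return the start offsets needed to page through `total` results."""
    if count <= 0:
        raise ValueError("count must be positive")
    # One offset per page: 0, count, 2 * count, ... below total.
    return range(0, total, count)


# Example: 25 matching documents, fetched 10 at a time.
offsets = list(page_offsets(total=25, count=10))
print(offsets)
```

Each offset is passed as the start parameter of one request, with count kept constant.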
Response Formats
JSON
{
   "request":
   {
      "sort":null,
      "fields":
         [
            "title",
            "description"
         ],
      "id":"100",
      "start":0,
      "callback":null,
      "encoding":null,
      "locale":"en",
      "format":"json",
      "count":10,
      "view":null
   },
   "results": [
      {
         "id":"101",
         "title":"A Similar Document",
         "description":"A Similar Document",
         "link":"relative/path/to/similarfile.pdf",
         "customAttributes": {
            "FileSize":162188,
            "score":1.8334198,
            "date":1208893949000
         }
      },
      ...
   ],
   "success":true,
   "message":null,
   "total":17306
}
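A client could read such a response as sketched below. The sample payload is a shortened copy of the example above with a single result. The function name `read_similar_documents` is hypothetical, and treating a false success value as a failure described by message is an assumption, since the page does not spell out the failure case.

```python
import json

# Shortened copy of the documented example response (one result only).
SAMPLE = """
{
   "request": {"fields": ["title", "description"], "id": "100", "start": 0, "count": 10},
   "results": [
      {
         "id": "101",
         "title": "A Similar Document",
         "description": "A Similar Document",
         "link": "relative/path/to/similarfile.pdf",
         "customAttributes": {"FileSize": 162188, "score": 1.8334198, "date": 1208893949000}
      }
   ],
   "success": true,
   "message": null,
   "total": 17306
}
"""


def read_similar_documents(payload):
    """Return (total, [(id, title, score), ...]) from a JSON response string."""
    response = json.loads(payload)
    # Assumption: success=false marks a failed request and message explains it.
    if not response.get("success"):
        raise RuntimeError(response.get("message") or "similarity request failed")
    documents = [
        (doc["id"], doc.get("title"), doc.get("customAttributes", {}).get("score"))
        for doc in response.get("results", [])
    ]
    return response.get("total", 0), documents


total, documents = read_similar_documents(SAMPLE)
print(total, documents)
```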