Query Example

IDs are sent in the body of an HTTP POST request. HTTP itself places no limit on the length of a POST body, so large ID sets can be transmitted this way.

The UniversalConnectorHandler detects content-streams based on the Content-Type that is defined in the header field of the HTTP POST request. The Content-Type defines the MIME type of the body of the POST request. Only streams with Content-Type: text/csv will be parsed by UniversalConnectorHandler. Requests with a different Content-Type will be ignored and the POST body will not be processed.

The ID set is delivered as comma-separated values (CSV). UniversalConnectorHandler parses the complete POST body, so the body must not contain any additional data; anything else would be interpreted as CSV IDs. Unknown and duplicate IDs are ignored.

The example below shows how to send an HTTP POST request with the command-line tool curl. The Content-Type is set to text/csv. The POST body consists only of the comma-separated ID set 1,2,3,4,5,6, (including the trailing comma). These IDs are unique key IDs that exist in the index. Sending the IDs as CSV in the POST body restricts the Solr results to the given IDs:

curl -v -H "Content-Type:text/csv" --data "1,2,3,4,5,6," "http://localhost:8080/solr/collection1/subset?q=*:*&rows=10&wt=json&indent=true&fl=id"
> POST /solr/collection1/subset?q=*:*&rows=10&wt=json&indent=true&fl=id HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> Content-Type:text/csv
> Content-Length: 12
>

Trailing comma in CSV set

Note that the example ID set (1,2,3,4,5,6,) in the request above ends with a trailing comma. The last ID is only parsed if the set ends with a newline or a comma. For simplicity, we recommend always appending a trailing comma to your CSV set.
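The trailing-comma rule can be applied automatically when the POST body is assembled in a script. The following is a minimal sketch, not part of pixolution flow: the helper name build_id_body is hypothetical, and it simply writes every ID followed by a comma so that the last ID is always terminated.

```shell
# Hypothetical helper: join all arguments into a CSV body in which
# every ID, including the last one, is followed by a comma.
build_id_body() {
  # Print nothing if no IDs were given.
  [ "$#" -gt 0 ] || return 0
  # printf reuses the format string for each argument.
  printf '%s,' "$@"
}

body=$(build_id_body 1 2 3 4 5 6)
printf '%s\n' "$body"
```

For the six example IDs this yields the same 12-character body that the curl request above sends (compare the Content-Length header).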

< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Transfer-Encoding: chunked
< Server: Jetty(8.1.2.v20120308)
<
{
  "responseHeader":{
    "status":0,
    "QTime":11,
    "params":{
      "fl":"id",
      "indent":"true",
      "q":"*:*",
      "wt":"json",
      "rows":"10"}},
  "response":{"numFound":6,"start":0,"docs":[
      {
        "id":"1"},
      {
        "id":"2"},
      {
        "id":"3"},
      {
        "id":"4"},
      {
        "id":"5"},
      {
        "id":"6"}]
  }}

The example sends the data to the request handler /subset. This handler is implemented by UniversalConnectorHandler, which parses the incoming POST body (see Configuration for setup instructions).

On its own, the parameter q=*:* matches all documents stored in the Solr index. Because the search results are restricted to the given set of IDs, q=*:* here matches only the documents listed in the CSV ID set. Therefore only those six documents are part of the response ("numFound":6).

All standard Solr query parameters (like q=*:*) are part of the URL as usual. You can use the complete range of pixolution flow functionality (described in Visual Search) by adding the required parameters to the URL.

Note: if you run a similarity search with rank.by=id:8 and want that document to be part of the response, you have to add its ID to the set of CSV IDs in the POST body.
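As an illustration of the note above, the sketch below assembles a similarity query for ID 8 and adds that ID to the CSV body. The variable names are hypothetical, and the URL reuses the example host and collection from this page with rank.by appended; adjust both to your installation. The snippet only prints the resulting curl command instead of executing it.

```shell
# ID of the document used as the similarity reference (rank.by=id:8).
query_id=8

# Existing ID subset, already terminated by a trailing comma.
ids="1,2,3,4,5,6,"

# Append the query ID so the document itself can appear in the results.
# Duplicate IDs are ignored by the handler, so no membership check is needed.
body="${ids}${query_id},"

# Example endpoint from this page, extended by the rank.by parameter.
url="http://localhost:8080/solr/collection1/subset?q=*:*&rows=10&wt=json&fl=id&rank.by=id:${query_id}"

# Print the command rather than running it.
printf 'curl -H "Content-Type: text/csv" --data "%s" "%s"\n' "$body" "$url"
```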

How many IDs should I send?

There is no limit on how many IDs you can send to the UniversalConnectorHandler. However, there is a trade-off between performance and search quality. Sending more image IDs delivers better results at the cost of higher query times (due to increased parsing effort and internal docID lookups). How many IDs are feasible depends on your specific search system and your cache configuration (see section Configuration).
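Large ID sets are easier to handle when they are written to a file and posted from there. The sketch below is one possible approach under stated assumptions: the ID range 1 to 1000 is made up for illustration, the wrapper name post_ids is hypothetical, and the wrapper is only defined here, not called. It relies on curl's --data-binary option, which sends the file content unchanged, and sets the text/csv Content-Type explicitly.

```shell
# Write an illustrative set of 1000 IDs to a temporary file.
# tr turns each newline from seq into a comma, so the set ends with a comma.
ids_file=$(mktemp)
seq 1 1000 | tr '\n' ',' > "$ids_file"

# Hypothetical wrapper: POST the file content as the request body.
#   $1 = path to the CSV ID file, $2 = request URL
post_ids() {
  curl -s -H "Content-Type: text/csv" --data-binary "@$1" "$2"
}

# Usage against the example endpoint from this page (not executed here):
# post_ids "$ids_file" "http://localhost:8080/solr/collection1/subset?q=*:*&rows=10&wt=json&fl=id"
```

Timing such requests with different set sizes is a simple way to find out how many IDs your system handles at acceptable query times.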

Set the correct Content-Type

Make sure to set the proper Content-Type. Requests with a default Content-Type of application/* (for example application/x-www-form-urlencoded, which curl sends by default with --data) are handled by Solr itself and result in an error, since no content-streams are allowed in queries. Only requests with Content-Type: text/csv are parsed and processed by pixolution flow.