Add images

All information in Solr is stored in documents. Documents consist of fields which contain the data. pixolution flow stores the visual information extracted from images in automatically created fields.

For general information on how to add documents to Solr refer to the official Solr Reference Guide.

To add images to Solr, insert documents that reference the image in the pxl_imagedescriptor field. You can also add an image to an existing document by adding the pxl_imagedescriptor field with the referenced image to that document. The field value is either a URL referencing the image to be indexed or a precalculated image descriptor (see standalone image descriptor generation). Images referenced via URL must be accessible via the HTTP, HTTPS or FILE protocol. Other protocols, such as FTP, result in an error.
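If you want to catch unsupported protocols before sending documents to Solr, a small client-side check can help. The following is a minimal sketch, not part of pixolution flow: the helper name is_supported_image_url is hypothetical, and the check only inspects the URL scheme. A precalculated descriptor is not a URL, so the helper returns False for it and you would have to handle that case separately.

```python
from urllib.parse import urlparse

# Schemes named in the text above as supported for image references.
SUPPORTED_SCHEMES = {"http", "https", "file"}


def is_supported_image_url(value: str) -> bool:
    """Return True if value is a URL with an http, https or file scheme.

    Hypothetical helper for illustration; it does not check whether the
    image actually exists or can be decoded.
    """
    scheme = urlparse(value).scheme.lower()
    return scheme in SUPPORTED_SCHEMES


for candidate in (
    "http://localhost/image1.jpg",
    "file:///home/myuser/image2.jpg",
    "ftp://localhost/image3.jpg",
):
    print(candidate, is_supported_image_url(candidate))
```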

The following example shows how to add three images in bulk, assuming the URLs point to valid image resources. Each image is referenced in a different manner: via an HTTP URL, via a FILE URL and via a precalculated descriptor.

Put the following JSON into a file named add.json in the current working directory:

[
  {
    "id": "1",
    "keywords":["internet","image"],
    "pxl_imagedescriptor":"http://localhost/image1.jpg"
  },
  {
    "id": "2",
    "keywords":["local","image"],
    "pxl_imagedescriptor":"file:///home/myuser/image2.jpg"
  },
  {
    "id": "3",
    "keywords":["precalculated","descriptor"],
    "pxl_imagedescriptor":"gCCB5vbIKTQd0dwINBkGDOUyAB7/CScf+PMH/fwfHPzv4vIWFwUJFg70AO39CP3qCAcEAvD8/P0A8gbqBPj8BYAA2v2i37aHg6KUzZSTjYKBprOogBDuBw0B6PMo4wIa6fr3+vaVo9XE85Owmp3JlpyujJuwioAxAECAQBAmaTIGRQJQAER3WVxjDHM="
  }
]

You can then send this JSON payload to the /update request handler of a Solr collection reachable at localhost/solr/collection with this curl command:

curl "http://localhost/solr/collection/update/json?commit=true" -d @add.json
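If you prefer to send the update from a script, the same request can be built with the Python standard library. This is a sketch under the same assumptions as the curl command (Solr reachable at localhost/solr/collection); the function names build_update_request and send_update are hypothetical, and the explicit Content-Type header is an addition that the curl command does not set.

```python
import json
import urllib.request

# Same endpoint as in the curl command above.
UPDATE_URL = "http://localhost/solr/collection/update/json?commit=true"


def build_update_request(docs: list) -> urllib.request.Request:
    """Build a POST request that carries the documents as a JSON array."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_update(docs: list) -> str:
    """Send the documents to Solr and return the raw response body."""
    with urllib.request.urlopen(build_update_request(docs)) as response:
        return response.read().decode("utf-8")


# Usage (needs a running Solr and add.json in the working directory):
#   with open("add.json", encoding="utf-8") as handle:
#       print(send_update(json.load(handle)))
```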

pixolution flow loads the referenced image, analyzes it and stores the calculated image descriptor in the Solr index. The URL you supplied is replaced by the string representation of the calculated image descriptor. If you need the image URL in later searches, store it in an additional field.
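One way to keep the image URL is to copy it into a second field before sending the document. The sketch below assumes a field named image_url, which is a hypothetical name and must exist in your schema (or be accepted by it); the helper with_image_url is likewise only an illustration.

```python
def with_image_url(doc: dict, url_field: str = "image_url") -> dict:
    """Return a copy of doc that also stores the image reference in url_field.

    The field name image_url is an assumption; use whatever field your
    schema defines for this purpose.
    """
    enriched = dict(doc)
    enriched[url_field] = doc["pxl_imagedescriptor"]
    return enriched


doc = {
    "id": "1",
    "keywords": ["internet", "image"],
    "pxl_imagedescriptor": "http://localhost/image1.jpg",
}
print(with_image_url(doc))
```

The original document is left unchanged, so the same helper can be applied to a whole list of documents before they are serialized to JSON.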

You can still add documents without specifying a pxl_imagedescriptor field. Those documents have an empty pxl_imagedescriptor field and cannot be found by visual search, duplicate detection and other image-based features. If you intend to leave many documents without an image descriptor, you might also be interested in Find documents with or without image descriptor and Reduce log output.

Fault tolerant update behaviour

pixolution flow depends on the accessibility of external resources when loading images. If a server is not reachable, or the image is not accessible or is corrupt, an error is thrown. Configure ImageUpdateFactory describes how to configure fault-tolerant update behaviour.
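Independently of the server-side configuration, a client can also limit the impact of a failing image by sending documents one at a time and recording which ones were rejected. The following is only a client-side sketch and not the ImageUpdateFactory configuration; the function index_individually and the send callable are hypothetical, and send is assumed to raise an exception when Solr reports an error.

```python
from typing import Callable, Iterable


def index_individually(
    docs: Iterable[dict],
    send: Callable[[list], object],
) -> list:
    """Send each document on its own and collect (id, error) for failures.

    send is any callable that posts a list of documents to Solr and
    raises an exception on failure (an assumption of this sketch).
    """
    failed = []
    for doc in docs:
        try:
            send([doc])
        except Exception as error:  # broad on purpose: record any failure
            failed.append((doc.get("id"), str(error)))
    return failed
```

Sending documents individually is slower than a bulk update, so this approach is mainly useful for re-trying a batch that was rejected.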