Command Line Interface

pixolution flow can be used as a standalone command line tool to generate image descriptors. The delivered pixolution_flow3.4.1_solrX.X.X.jar provides a command line interface (CLI) for generating image descriptors outside of Solr. You can either process one image at a time (good for testing) or process images in bulk (good for performance).

To see the complete documentation of the CLI, run the following command:

java -cp pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface

Processing a single image

You can calculate an image descriptor as shown in this example:

# process a single image and print its image descriptor to standard out
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface image=http://test.com/image.jpg

The CLI accepts a URL with the protocol HTTP or FILE as input. The given URL must be properly encoded (see URL Encoding).

On success, the CLI prints the input file path and the image descriptor as a Base64 string, separated by a comma and followed by a newline, to standard out and exits with status code 0.

If an error occurs, the CLI writes error messages to standard error and exits with status code 1. When parsing standard out, remove the trailing newline character before storing the image descriptor; otherwise pixolution flow will throw an error when you try to index a descriptor that contains a newline.
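
The output format described above can be split with plain shell parameter expansion. The following is a minimal sketch, not part of pixolution flow: the sample line and its Base64 value are made up, and the variable names line, input and descriptor are only illustrative.

```shell
# Sample output line in the documented "input,descriptor" form.
# Both values are made up for illustration.
line='http://test.com/image.jpg,QUJDREVGRw=='

# The Base64 alphabet contains no comma, so the last comma is the separator
# even if the input URL itself contains commas.
input="${line%,*}"        # everything before the last comma
descriptor="${line##*,}"  # everything after the last comma

printf 'input: %s\n' "$input"
printf 'descriptor: %s\n' "$descriptor"
```

If the line is captured with command substitution, for example line=$(java ... image=...), the shell already strips the trailing newline, so the descriptor can be stored as is.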

Bulk processing

Since starting a Java Virtual Machine instance is quite expensive when used for just one image, the CLI supports bulk processing and multithreading. You can calculate image descriptors in bulk by passing a file of image URLs/paths as the imagelist argument. This image list must contain one image URL per line. The CLI then processes each line, calculates the image descriptor, and prints it to standard out in the form image URL,image descriptor.
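
One possible way to create such an image list from a local directory is sketched below. It assumes that prefixing an absolute path with file:// yields a URL the CLI accepts and that the paths contain no spaces or other characters that need URL encoding. The directory and file names are hypothetical; a temporary directory with two empty files stands in for a real image folder.

```shell
# Hypothetical image directory; a temp dir with two empty files stands in for it.
image_dir="$(mktemp -d)"
touch "$image_dir/a.jpg" "$image_dir/b.jpg"

# Write one file URL per line. mktemp -d returns an absolute path, so
# "file://" + "/abs/path" gives "file:///abs/path".
# Assumption: paths need no URL encoding (no spaces or special characters).
find "$image_dir" -type f -name '*.jpg' | sort | sed 's|^|file://|' > "$image_dir/imagelist.txt"

cat "$image_dir/imagelist.txt"
```

The resulting file can then be passed to the CLI as imagelist=... in place of the example path used in this section.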

Processing the image list can be parallelized by setting the threads parameter. If an image cannot be processed, the error message is printed to standard error and the CLI aborts processing. To continue processing despite errors, set the parameter continueonerror=true.

Generate image descriptors in bulk using different arguments:

# process an image list and print each image descriptor to standard out
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt
# same as above but utilizing 8 threads to speed up processing
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt threads=8
# same as above but do not abort processing if one image could not be processed
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt continueonerror=true threads=8

Performance hint

For best performance, we strongly recommend also adding the math library jblas-1.2.4.jar to the classpath. This implementation can be up to 10 times faster than the pure Java implementation that ships with pixolution flow. You can download jBlas here. For more information, refer to Native math backends for faster indexing.

To get more control over the output, you can redirect the console output to different destinations. The 1> operator redirects the standard out channel and the 2> operator redirects the standard error channel.

The following example writes all errors to the file errors.txt and all successfully calculated image descriptors to another file, descriptors.csv:

java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=imagelist.txt 1> descriptors.csv 2> errors.txt
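
After a run with continueonerror=true it can be useful to know which images produced no descriptor. The sketch below compares the image list with the descriptor file. It assumes that the CLI echoes each input exactly as it appears in the image list, and it does not rely on output order. The file contents are made-up sample data written to a temporary directory.

```shell
# Made-up sample data standing in for a real image list and CLI output.
workdir="$(mktemp -d)"
printf '%s\n' 'http://test.com/a.jpg' 'http://test.com/b.jpg' 'http://test.com/c.jpg' > "$workdir/imagelist.txt"
printf '%s\n' 'http://test.com/c.jpg,QUJD' 'http://test.com/a.jpg,REVG' > "$workdir/descriptors.csv"

# Drop the descriptor (last comma and everything after it) to get the processed inputs.
sed 's/,[^,]*$//' "$workdir/descriptors.csv" | sort > "$workdir/done.txt"
sort "$workdir/imagelist.txt" > "$workdir/all.txt"

# Keep the lines that appear only in the image list: images without a descriptor.
comm -23 "$workdir/all.txt" "$workdir/done.txt" > "$workdir/missing.txt"

cat "$workdir/missing.txt"
```

The file missing.txt can be used as a new image list to retry the failed images once the cause reported in errors.txt has been fixed.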