Command Line Interface
pixolution flow can be used as a standalone command line tool to generate image descriptors.
pixolution_flow3.4.1_solrX.X.X.jar provides a command line interface (CLI) for generating image descriptors outside of Solr.
You can either process one image at a time (good for testing) or process images in bulk (good for performance).
To see the complete documentation of the CLI, run the following command:
java -cp pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface
Processing a single image
You can calculate an image descriptor as shown in this example:
# process a single image and print its image descriptor to standard out
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface image=http://test.com/image.jpg
The supported CLI input is a URL with protocol.
The given URL must be properly encoded (see URL Encoding).
On success, the CLI prints the input file path, a comma, and the image descriptor as a Base64 string to standard out, followed by a newline, and exits with status code 0.
If an error occurs, the CLI writes error messages to standard error and exits with status code 1. Remove the newline character when parsing standard out and storing the image descriptor; otherwise pixolution flow will throw an error when you try to index an image descriptor that contains a newline character.
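The parsing step described above can be sketched in shell. This is a minimal sketch and not part of pixolution flow: the helper name split_cli_line and the sample line are hypothetical, and it assumes the Base64 descriptor contains no comma, so the line is split at the last comma.

```shell
#!/bin/sh
# Hypothetical helper: split one CLI output line into input path and descriptor.
# Assumption: the Base64 descriptor contains no comma, so the LAST comma
# on the line is the separator.
split_cli_line() {
    line=$1
    input_path=${line%,*}      # everything before the last comma
    descriptor=${line##*,}     # everything after the last comma
}

# Sample line standing in for real CLI output (made-up descriptor value).
# In real use, replace the printf with the java command shown above.
# Command substitution $(...) strips the trailing newline automatically.
output=$(printf '%s\n' 'http://test.com/image.jpg,QUJDRA==')

split_cli_line "$output"
echo "input:      $input_path"
echo "descriptor: $descriptor"
```

Because command substitution already drops the trailing newline, the descriptor stored in the variable can be passed on for indexing without further trimming.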
Since starting a Java Virtual Machine instance is quite expensive when used for just one image, the CLI supports bulk processing and multithreading.
You can calculate image descriptors in bulk by passing a file of image URLs/paths as the imagelist parameter.
This image list must contain one image URL per line.
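Before starting a long bulk run, it can help to check the image list for malformed entries. The following is a rough sketch under assumptions: it treats every non-matching line as suspicious on the assumption that each entry should be a URL with a protocol, and the list contents are placeholders. It is not a check performed by the CLI itself.

```shell
#!/bin/sh
# Hypothetical image list; the URLs are placeholders.
list=$(mktemp)
printf '%s\n' \
    'http://test.com/image1.jpg' \
    'test.com/image2.jpg' \
    'http://test.com/image3.jpg' > "$list"

bad=0
lineno=0
# Read the list line by line and flag entries without a protocol prefix.
while IFS= read -r line; do
    lineno=$((lineno + 1))
    case $line in
        *://*) ;;   # looks like a URL with protocol
        *) echo "line $lineno: no protocol: $line" >&2
           bad=$((bad + 1)) ;;
    esac
done < "$list"

echo "$bad of $lineno lines lack a protocol"
rm -f "$list"
```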
The CLI then processes each line, calculates the image descriptor, and prints it to standard out in the form
image URL,image descriptor.
Processing this image list can be parallelized by setting the threads parameter.
If an image cannot be processed, the error message is printed to standard error and the CLI aborts processing.
If you wish to continue processing even when errors occur, you can set the parameter continueonerror=true.
Generate image descriptors in bulk using different arguments:
# process an image list and print each image descriptor to standard out
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt

# same as above but utilizing 8 threads to speed up processing
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt threads=8

# same as above but do not abort processing if one image could not be processed
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=/home/user/imagelist.txt continueonerror=true threads=8
For best performance we strongly recommend also adding the math library jblas-1.2.4.jar to the classpath. This implementation can be up to 10 times faster than the pure Java implementation that ships with pixolution flow. You can download jBlas here. For more information refer to Native math backends for faster indexing.
In order to have more control over the output you can redirect the console output to different destinations.
With the 1> operator you can redirect the standard out channel, and with the 2> operator you can redirect the standard error channel.
The following example writes all errors to the file errors.txt and all successfully calculated image descriptors to the file descriptors.csv:
java -cp jblas-1.2.4.jar:pixolution_flow3.4.1_solrX.X.X.jar de.pixolution.utils.CommandLineInterface imagelist=imagelist.txt 1> descriptors.csv 2> errors.txt
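Once the descriptors have been written to a file, each record can be read back line by line. The sketch below is illustrative only: it creates a temporary stand-in for descriptors.csv with made-up descriptor values, and it assumes each line has the form image URL,image descriptor with no comma inside the descriptor.

```shell
#!/bin/sh
# Hypothetical stand-in for descriptors.csv (descriptor values are made up).
csv=$(mktemp)
printf '%s\n' \
    'http://test.com/a.jpg,QUJDRA==' \
    'http://test.com/b.jpg,RUZHSA==' > "$csv"

count=0
# Read one "image URL,image descriptor" record per line.
# "read" drops the trailing newline, so the descriptor is stored without it.
while IFS= read -r line; do
    [ -n "$line" ] || continue     # skip empty lines
    url=${line%,*}                 # text before the last comma
    descriptor=${line##*,}         # text after the last comma
    count=$((count + 1))
    echo "$url -> ${#descriptor} Base64 characters"
done < "$csv"

echo "parsed $count descriptors"
rm -f "$csv"
```

Feeding the loop with a redirect rather than a pipe keeps it in the current shell, so the count variable is still available after the loop ends.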