URL parsing more robust and uniform
The processing of URLs has been improved, both when adding documents and when running rank.by URL queries.
File and folder names of valid URLs may consist of reserved, unreserved and other characters (for further information, see Percent-encoding and RFC 3986).
Older versions could fail if the file or folder names of a URL contained reserved or other characters (only unreserved characters were safe). We have now implemented a much more robust and uniform parsing module that can cope with various kinds of unusual URLs. For example, a URL can look like this:
http://localhost:1234/a folder/=ä@>d0?! (pk.
as long as the characters are properly percent-encoded. Note that the / is not encoded, since it indicates a folder hierarchy and therefore has special meaning.
When percent-encoding characters of URLs, use UTF-8 since pixolution flow decodes the URL with UTF-8 internally. Please refer to the API documentation for further information on URL handling.
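As a rough illustration of the encoding rule described above, the following Python sketch percent-encodes the path of the example URL as UTF-8 while leaving the / separators untouched. The helper name encode_url_path is hypothetical and not part of pixolution flow; the example path is taken exactly as it appears above.

```python
from urllib.parse import quote


def encode_url_path(base: str, path: str) -> str:
    """Percent-encode a URL path as UTF-8, keeping '/' as the folder separator.

    Hypothetical client-side helper, not part of pixolution flow.
    """
    # safe="/" leaves the folder separator unencoded; every other character
    # that is not unreserved is replaced by its percent-encoded UTF-8 bytes.
    return base + quote(path, safe="/", encoding="utf-8")


url = encode_url_path("http://localhost:1234/", "a folder/=ä@>d0?! (pk.")
print(url)
# http://localhost:1234/a%20folder/%3D%C3%A4%40%3Ed0%3F%21%20%28pk.
```

The non-ASCII character ä becomes %C3%A4, which is its two-byte UTF-8 representation; encoding it with a different charset would produce a URL that decodes to a different name.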
Exceptions more meaningful
If the ImageUpdateFactory configuration references fields that don’t exist, pixolution flow now returns more meaningful error messages.
Earlier versions responded to feature.fieldname configuration errors of ImageUpdateFactory with misleading messages like:
Error adding field 'feature'='http://example.org/img/13495ebb6906a02336a4cx.jpg' msg=String length must be a multiple of four.
The new version responds with a proper error message like:
Field with name "feature " does not exists. Please check if the property "processor.feature.fieldname" of "ImageUpdateFactory" in your updateRequestProcessorChain is properly set in your solrconfig.xml.
We also improved the error messages for images that could not be loaded; they now include the HTTP status code. Additionally, exceptions are more meaningful when processing invalid URLs.
Changed URL handling
URLs given in rank.by queries or when indexing new documents must contain a valid protocol.
URLs that reference a resource via the file protocol must use absolute paths.
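A client could check these two requirements before sending a URL to Solr. The sketch below is only an illustration: the helper names has_protocol and file_url are hypothetical, it assumes POSIX-style paths, and it does not reproduce the checks pixolution flow performs internally.

```python
from pathlib import PurePosixPath
from urllib.parse import quote, urlsplit


def has_protocol(url: str) -> bool:
    """Return True if the URL starts with a scheme such as http or file."""
    return urlsplit(url).scheme != ""


def file_url(path: str) -> str:
    """Build a file URL from an absolute POSIX path; reject relative paths."""
    if not PurePosixPath(path).is_absolute():
        raise ValueError(f"file URLs need an absolute path, got {path!r}")
    # Percent-encode as UTF-8 and keep '/' as the folder separator.
    return "file://" + quote(path, safe="/", encoding="utf-8")


print(has_protocol("example.org/img/a.jpg"))   # False: protocol is missing
print(file_url("/data/images/cat 1.jpg"))      # file:///data/images/cat%201.jpg
```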
Note that there is a potential security risk due to information exposure through an error message.
A user may test whether local files exist via rank.by queries with FILE as protocol by analyzing the returned error message.
This may reveal sensitive information that could be used in a later attack, or private information stored on the server.
Make sure Solr has restricted access to the filesystem or cannot be queried with user-defined requests from the Internet.
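If user-supplied URLs have to reach Solr through an application layer, one possible additional safeguard is to reject anything other than an explicit list of protocols before the query is forwarded. The following sketch assumes that only http and https should be accepted from untrusted users; the allow-list and the function name are assumptions for illustration, not settings of pixolution flow, and the check does not replace restricting filesystem access.

```python
from urllib.parse import urlsplit

# Assumption: only these protocols are acceptable from untrusted users.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_rank_by_url(url: str) -> bool:
    """Return True if a user-supplied rank.by URL uses an allowed protocol."""
    # urlsplit lowercases the scheme, so "FILE:" and "file:" are treated alike.
    return urlsplit(url.strip()).scheme in ALLOWED_SCHEMES


print(is_allowed_rank_by_url("https://example.org/img/a.jpg"))  # True
print(is_allowed_rank_by_url("FILE:///etc/passwd"))             # False
```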