Production deployment

Using Flow with Docker orchestration requires a commercial Flow license.

To use Flow in production with Kubernetes or Docker Swarm you need a commercial license.

We offer a Docker base image called pixolution/flow-hub, which is designed to be used with our Flow software. The image is built on the latest official Solr image that we support. It comes preconfigured and includes the third-party libraries that Flow requires.

It's important to note that the pixolution/flow-hub image does not include any Flow libraries. Instead, it is meant to be used alongside the zip archive that we provide when you upgrade to the professional version.

Before using our Docker image together with the Flow binaries, make sure their versions match. The pixolution/flow-hub base image only works with Flow binaries built for the same Flow and Solr versions. For example, the image tag pixolution/flow-hub:5.0.2-9.4 requires the binaries from pixolution-flow-5.0.2-solr-9.4.zip.
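The version pairing can be checked mechanically. The following is a minimal sketch, assuming the archive always follows the pixolution-flow-<flow-version>-solr-<solr-version>.zip pattern of the example above; flow_image_for_zip is a hypothetical helper name and not part of Flow.

```shell
# Derive the matching flow-hub image tag from a Flow archive name.
# Assumption: archives are named pixolution-flow-<flow>-solr-<solr>.zip.
# flow_image_for_zip is a hypothetical helper, not shipped with Flow.
flow_image_for_zip() {
  zip_name=$(basename "$1")
  case "$zip_name" in
    pixolution-flow-*-solr-*.zip) ;;
    *) echo "unexpected archive name: $zip_name" >&2; return 1 ;;
  esac
  versions=${zip_name#pixolution-flow-}   # strip the fixed prefix
  versions=${versions%.zip}               # strip the extension
  flow_version=${versions%%-solr-*}       # part before "-solr-"
  solr_version=${versions#*-solr-}        # part after "-solr-"
  echo "pixolution/flow-hub:${flow_version}-${solr_version}"
}

# Usage (not executed here):
# docker pull "$(flow_image_for_zip pixolution-flow-5.0.2-solr-9.4.zip)"
```

The helper only compares names. It does not verify that the image tag actually exists in the registry.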

Unzip the pixolution-flow-5.0.2-solr-9.4.zip file:

mkdir flow5
cd flow5
unzip pixolution-flow-5.0.2-solr-9.4.zip

We are only interested in the Flow binaries that can be found in the flow-jars folder.

All needed JAR files.
.
[...]
├── flow-jars
│   ├── pixolution-flow-5.0.2-solr-9.4.jar
│   ├── pixolution-module-color-5.0.2.jar
│   ├── pixolution-module-content-5.0.2.jar
│   ├── pixolution-module-copy-space-filter-5.0.2.jar
│   └── pixolution-module-duplicate-5.0.2.jar
[...]

All of the deployment options described below require the Flow binaries to be included in the Docker base image.
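Because every option below depends on the flow-jars folder, it can help to verify the folder before starting a container. This is a sketch under the assumption that the folder layout matches the listing above; check_flow_jars is a hypothetical helper name.

```shell
# Fail early when the flow-jars folder is missing or holds no Flow core JAR.
# check_flow_jars is a hypothetical helper; the folder name defaults to
# flow-jars as in the unzipped archive shown above.
check_flow_jars() {
  jar_dir=${1:-flow-jars}
  if [ ! -d "$jar_dir" ]; then
    echo "missing folder: $jar_dir" >&2
    return 1
  fi
  # Expand the glob into the positional parameters; if nothing matches,
  # $1 stays the literal pattern and the -e test below fails.
  set -- "$jar_dir"/pixolution-flow-*.jar
  if [ ! -e "$1" ]; then
    echo "no pixolution-flow-*.jar found in $jar_dir" >&2
    return 1
  fi
  echo "found $# Flow JAR(s) in $jar_dir"
}

# Usage (not executed here), from the directory that contains flow-jars:
# check_flow_jars flow-jars
```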

Standalone Docker

Install a recent version of docker or a drop-in alternative like podman.

Create a named volume for the index data

docker volume create flow-index

Start the Flow image in the background and mount the flow-jars folder with the required binaries.

docker run --rm -p 8983:8983 -v flow-index:/var/solr -v "$(pwd)/flow-jars:/pixolution" --name pixolution-flow -d pixolution/flow-hub:5.0.2-9.4

Check that the container is running

docker ps

Inspect Flow instance logs

docker logs -f pixolution-flow

Initialize Flow once. The following request checks that Flow can be loaded and configures its fields:

curl "http://localhost:8983/api/cores/my-collection/pixolution"
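Right after the container starts, Solr may not accept requests yet, so the initialization request can fail on the first try. A minimal sketch of a retry wrapper follows; retry is a hypothetical helper name, and the attempt count and delay in the usage line are arbitrary assumptions.

```shell
# Run a command up to <attempts> times, sleeping <delay> seconds between
# failed attempts. retry is a hypothetical helper, not part of Flow or Docker.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (not executed here): poll the initialization endpoint from this guide.
# -f makes curl return a non-zero status on HTTP errors, so retry keeps polling.
# retry 30 2 curl -fsS "http://localhost:8983/api/cores/my-collection/pixolution"
```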

Stop the running container (shutdown). Because the container was started with --rm, it is removed as well; the named volume is preserved.

docker stop pixolution-flow

Delete named volume with index data (wipe persistent data)

docker volume ls
docker volume rm flow-index

Docker Swarm

Install a recent version of docker.

Create a docker-compose.yml configuration with the following content:

version: "3"

services:
  solr:
    image: pixolution/flow-hub:5.0.2-9.4
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          cpus: '4.0'
          memory: 4G
        reservations:
          cpus: '0.25'
          memory: 512M
    volumes:
      - flow-index:/var/solr
      - ./flow-jars/:/pixolution
    ports:
      - "8983:8983"
    logging:
      driver: "json-file"
      options:
        mode: "non-blocking"
        tag: "{{.Name}}"
        max-size: "10M"
        max-file: "10"

volumes:
  flow-index:

Initialize a local Docker Swarm by running:

docker swarm init

Deploy the stack to the swarm

docker stack deploy -c docker-compose.yml flow-stack

Initialize Flow once. The following request checks that Flow can be loaded and configures its fields:

curl "http://localhost:8983/api/cores/my-collection/pixolution"

Inspect the stack details

docker stack ls
docker stack ps flow-stack

Inspect Flow instance logs

docker service ls
docker service logs -f flow-stack_solr

Delete stack (shutdown, volume is preserved)

docker stack rm flow-stack

Delete named volume with index data (wipe persistent data)

docker volume ls
docker volume rm flow-stack_flow-index

Docker Compose

Install docker or any other container environment that supports docker-compose definitions (e.g. podman and docker-compose). Also install the docker-compose scripts.

Create a docker-compose.yml configuration with the following content:

version: "3"

services:
  solr:
    image: pixolution/flow-hub:5.0.2-9.4
    volumes:
      - flow-index:/var/solr
      - ./flow-jars/:/pixolution
    ports:
      - "8983:8983"
    logging:
      driver: "json-file"
      options:
        mode: "non-blocking"
        tag: "{{.Name}}"
        max-size: "10M"
        max-file: "10"

volumes:
  flow-index:

Start the ensemble in the background

docker-compose up -d

Initialize Flow once. The following request checks that Flow can be loaded and configures its fields:

curl "http://localhost:8983/api/cores/my-collection/pixolution"

Inspect ensemble

docker-compose images

Inspect Flow instance logs

docker-compose logs -f

Remove ensemble (shutdown)

docker-compose down

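Delete named volume with index data (wipe persistent data). The volume name is prefixed with the Compose project name, which defaults to the name of the directory that contains docker-compose.yml (flow-docker-examples in this example).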

docker volume ls
docker volume rm flow-docker-examples_flow-index

Kubernetes

A deployment into Kubernetes is more complex because bind-mounting folders contradicts the philosophy of Kubernetes. The easiest way is to build a new Docker image that includes all needed JARs. This is also the most flexible and robust way. To provide the customized image to your Kubernetes cluster, you need a private Docker registry that Kubernetes is able to access.

The following guide uses the Solr Operator Helm Chart to start a Solr cloud collection with Flow.

Make sure that the tools used in the steps below are installed: docker, kubectl, and helm.

Create a file named Dockerfile with the following content:

ARG image

FROM ${image}

USER root

# Copy the module JARs and the pixolution-flow JAR from the build context
# (the flow-jars folder next to this Dockerfile) into Solr's webapp library folder.
COPY flow-jars/*.jar /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/

USER solr

The first step is to build a new image that includes all needed JARs (note the . at the end):

docker build --build-arg image="pixolution/flow-hub:5.0.2-9.4" -t registry.your-domain.com/customized-flow-docker:5.0.2-9.4 .

Allow docker access to your private registry:

docker login registry.your-domain.com

Push the newly built image registry.your-domain.com/customized-flow-docker:5.0.2-9.4 to your private registry

docker push registry.your-domain.com/customized-flow-docker:5.0.2-9.4
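The build and push commands above, and the SolrCloud definition further below, all need to use the same image reference. As a sketch, the reference can be derived once from the base image tag; custom_image_ref is a hypothetical helper name, and the repository name customized-flow-docker is simply the one used in this guide.

```shell
# Build the custom image reference from a registry host and the base image.
# custom_image_ref is a hypothetical helper. It assumes the base image is
# given with an explicit tag, e.g. pixolution/flow-hub:5.0.2-9.4.
custom_image_ref() {
  registry=$1
  base_image=$2
  case "$base_image" in
    *:*) ;;
    *) echo "base image has no tag: $base_image" >&2; return 1 ;;
  esac
  tag=${base_image##*:}   # text after the last colon
  echo "${registry}/customized-flow-docker:${tag}"
}

# Usage (not executed here):
# base="pixolution/flow-hub:5.0.2-9.4"
# target=$(custom_image_ref registry.your-domain.com "$base")
# docker build --build-arg image="$base" -t "$target" . && docker push "$target"
```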

To deploy a SolrCloud to Kubernetes we use the solr-operator helm chart.

helm repo add apache-solr https://solr.apache.org/charts
kubectl create -f https://solr.apache.org/operator/downloads/crds/v0.7.0/all-with-dependencies.yaml
helm install solr-operator apache-solr/solr-operator --version 0.7.0

Grant Kubernetes access to your private registry:

kubectl create secret docker-registry regcred-flow --docker-server=registry.your-domain.com --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>

Create a SolrCloud configuration and save it as flow-cloud-definition.yaml. Below is a minimal example. See the official documentation for the Helm chart and the solr-cloud-crd documentation for all available options.

apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: flow-cloud
spec:
  dataStorage:
    persistent:
      pvcTemplate:
        spec:
          resources:
            requests:
              storage: 10Gi
      reclaimPolicy: Delete
  replicas: 3
  solrImage:
    repository: registry.your-domain.com/customized-flow-docker
    tag: 5.0.2-9.4
    imagePullSecret: regcred-flow
  solrJavaMem: -Xms500M -Xmx5000M
  updateStrategy:
    method: StatefulSet
  zookeeperRef:
    provided:
      image:
        pullPolicy: IfNotPresent
        repository: pravega/zookeeper
        tag: 0.2.13
      persistence:
        reclaimPolicy: Delete
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      replicas: 1

Deploy the SolrCloud definition to your Kubernetes cluster

kubectl apply -f flow-cloud-definition.yaml