Dockerising NiFi with custom processors and preloaded workflow

I recently had to spin up a NiFi container in Docker and wanted it to start in a default state, with a template and workflow already present.

This post talks you through how to set up a NiFi Docker container and automate the inclusion of a custom processor `.nar` file, an existing `template.xml` and a `workflow` in a running state.

The process is not too well documented by Apache so I hope someone finds this useful!

Containerisation

A recent project required our team to collaborate on a proof of concept solution which included a web interface, a NiFi workflow and an Elastic backend.

To avoid each developer having to install and build the tools from scratch we opted to `docker-compose` the whole solution. Docker compose allows us to create a series of Docker containers which can easily talk to each other.

Docker & NiFi

Apache has an official Docker image for NiFi; you can build on this image with a `Dockerfile` like this.

FROM apache/nifi:latest
EXPOSE 8080

Once you build and run this image with port 8080 published, you can access the NiFi web interface on http://localhost:8080.

This is great except docker containers are designed to be ephemeral and we don’t want to have to manually add our template each time we run the container. So we need a way to add our templates and state to the container.

Adding a custom processor

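To add a processor to NiFi you simply need to copy the `.nar` file into the correct location inside the container; NiFi loads it when it starts.

Processors in NiFi live in `/opt/nifi/nifi-1.6.0/lib/` so you can use the Docker `COPY` instruction to copy your processor in. Note that this path contains the NiFi version (1.6.0 at the time of writing), so it only matches while `latest` resolves to that release; pinning the image tag to the version you are targeting avoids surprises.

The `Dockerfile` below assumes that the processor is in the same folder as your `Dockerfile`.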

FROM apache/nifi:latest
COPY our_processor.nar /opt/nifi/nifi-1.6.0/lib/
EXPOSE 8080
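
Since `COPY` fails the build when its source file is missing from the build context, a small pre-flight check can be handy. This is only a sketch and not part of the original setup: `missing_from_context` is a hypothetical helper name, and the file names are the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print each expected file that is absent from the
# build context directory, and return non-zero if any are missing.
missing_from_context() {
    context_dir=$1
    shift
    status=0
    for name in "$@"; do
        if [ ! -f "$context_dir/$name" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Example: only build when everything the Dockerfile copies is present.
# missing_from_context . Dockerfile our_processor.nar && docker build -t nifitest .
```

The helper prints nothing and returns success when every file is found, so it can sit in front of `docker build` with `&&`.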

Adding a template

Adding a template is the same as adding a processor but with different file types and locations. The template controls the processors in your workflow and joins everything together.

FROM apache/nifi:latest
COPY our_processor.nar /opt/nifi/nifi-1.6.0/lib/
COPY our_workflow.xml /opt/nifi/nifi-1.6.0/conf/templates/
EXPOSE 8080

If you now run your container you should be able to use the NiFi web interface to add an existing template manually. Because your template has been copied into the container, it should be available to choose from the import dropdown.

We can use our processor as part of our template but we still have to use the web interface to get our container into a state where the NiFi workflow is running. We should automate that!

Automating the workflow

We can automate the process of importing a template through the web interface and starting it.

First start our container and manually get it to the state we want it to be when we start it later on.

docker build -t nifitest .
docker run -p 8080:8080 -d nifitest

You should now be able to load the web interface and import the template we added earlier; we can then click `play` to get our workflow running.

Now that it's in a running state we need to copy that state out somehow, and this is the bit which isn't documented too well! The state of the NiFi workflow is stored in `flow.xml.gz`, which lives in `/opt/nifi/nifi-1.6.0/conf/`.
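
NiFi can take a little while to start, so the web interface may not respond straight after `docker run`. As a rough sketch (my own addition, not from the NiFi documentation), the hypothetical helper `retry_until_ok` below retries any probe command a fixed number of times. The `curl` probe in the usage comment assumes `curl` is installed and that the UI is served under `/nifi/`.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command up to <attempts> times,
# sleeping <delay> seconds between tries; succeed as soon as the probe does.
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Only sleep when another attempt is still to come.
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}

# Example (assumed URL path): wait up to 30 x 5 seconds for the UI.
# retry_until_ok 30 5 curl -sf http://localhost:8080/nifi/ && echo "NiFi is up"
```

Keeping the probe as an argument means the same loop can wrap `curl`, `docker exec`, or anything else that returns a zero exit code on success.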

To copy that file out of the container we first need the `container_id`, which `docker ps -a` lists. If you want to check that the file is there, you can also start an interactive `bash` shell inside the container:

docker ps -a
docker exec -it <container_id> /bin/bash

The copy itself is run from the host, not from inside the container, so exit the shell first and then run the following command to copy out the `flow.xml.gz`:

docker cp <container_id>:/opt/nifi/nifi-1.6.0/conf/flow.xml.gz flow.xml.gz
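
Before baking the copied file into an image, it can be worth checking that it really is an intact gzip archive containing a flow. The sketch below is an assumption-laden sanity check rather than anything official: `flow_looks_valid` is a hypothetical helper, and it assumes the decompressed XML contains a `<flowController` element, which is what I would expect from a NiFi 1.x `flow.xml`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file is a valid gzip archive
# whose decompressed content contains a <flowController element.
flow_looks_valid() {
    flow_file=$1
    # gzip -t tests archive integrity without writing any output.
    gzip -t "$flow_file" 2>/dev/null || return 1
    gzip -dc "$flow_file" | grep -q '<flowController'
}

# Example:
# flow_looks_valid flow.xml.gz && echo "flow.xml.gz looks usable"
```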

We now have a copy of our NiFi state. Next we need to use that state when we run our container. To do this we can use `COPY` again to copy the file into the image at build time, so every container started from that image begins in this state.

NOTE: By default `COPY` creates files owned by root, so the user the container runs as (nifi:nifi) may not be able to write to the file once we copy it in. We can pass `--chown` to `COPY` to give ownership to the `nifi` user at the same time.

FROM apache/nifi:latest
COPY our_processor.nar /opt/nifi/nifi-1.6.0/lib/
COPY our_workflow.xml /opt/nifi/nifi-1.6.0/conf/templates/
COPY --chown=nifi:nifi flow.xml.gz /opt/nifi/nifi-1.6.0/conf/
EXPOSE 8080

Summary

We now have a container with our custom processor installed, our template available, the template loaded into the flow and the workflow already started.

This makes it as simple as `docker run <image>` when another team member wants to run the NiFi instance.

No manual intervention is required, so we can package NiFi as a service in `docker-compose` and run the suite of containers simply and easily between users.
