Using Apache Hop with CrateDB

Apache Hop is an open-source data integration platform which started as a fork of Kettle (Pentaho Data Integration).

We can get started quickly by deploying it with Docker and connecting it to CrateDB:

mkdir jdbcdrivers
wget https://jdbc.postgresql.org/download/postgresql-42.7.1.jar
mv ./postgresql-42.7.1.jar jdbcdrivers/.
sudo docker run -d --network=host --name apache_hop -v "$(pwd)/jdbcdrivers:/files/jdbc" --env HOP_SHARED_JDBC_FOLDERS=/files/jdbc apache/hop-web:latest
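
If the connection later fails with a missing-driver error, one thing worth checking is whether the JAR actually landed in the mounted folder. The following is a minimal sketch; `check_driver` is a hypothetical helper name introduced here for illustration, not something shipped with Hop or Docker.

```shell
# check_driver is a hypothetical helper, not part of Hop or Docker.
# It succeeds only if the given path ends in .jar and is a non-empty file.
check_driver() {
  case "$1" in
    *.jar) [ -s "$1" ] ;;
    *) return 1 ;;
  esac
}

# Path matches the folder and file name used in the commands above.
if check_driver jdbcdrivers/postgresql-42.7.1.jar; then
  echo "driver found"
else
  echo "driver missing or empty" >&2
fi
```

An empty or missing file usually means the download failed, in which case re-running the `wget` step is the first thing to try.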

Then run this on your CrateDB instance:

CREATE TABLE pg_user (usesysid TEXT, usename TEXT);

Now browse to http://localhost:8080/ui and create a new “Relational Database Connection”. Set “Connection type” to “PostgreSQL”, enter the coordinates of your CrateDB server, and use “doc” as the “Database name”.

Once you have filled in the details of your CrateDB database, click “Test” to verify the connection and “Explore” to browse the database.
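
As a rough illustration of what the connection details amount to, the sketch below assembles them into the URL form that the PostgreSQL JDBC driver uses. The values are assumptions for a local CrateDB with default settings (PostgreSQL wire protocol on port 5432, default user `crate`); the variable names are made up for this example, so substitute your own host and credentials.

```shell
# Assumed values for a local CrateDB with default settings; adjust to your setup.
CRATEDB_HOST="localhost"
CRATEDB_PORT="5432"   # CrateDB's default PostgreSQL wire protocol port
CRATEDB_DB="doc"      # the "Database name" used in the connection dialog
CRATEDB_USER="crate"  # CrateDB's default superuser

# The PostgreSQL JDBC driver addresses a server with a URL of this shape.
JDBC_URL="jdbc:postgresql://${CRATEDB_HOST}:${CRATEDB_PORT}/${CRATEDB_DB}"
echo "Connect as ${CRATEDB_USER} to ${JDBC_URL}"
```

Because the container was started with `--network=host`, `localhost` inside Hop should resolve to the Docker host on Linux, so a CrateDB instance running on the same machine can be reached this way.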

You can then use the “Table input” and “Table output” transforms in your pipelines to read from and write to CrateDB.
