CrateDB connection issue while using JDBC Driver

CRATE connection issue when using JDBC

import jaydebeapi

print("LOADING.....")
conn = jaydebeapi.connect(
    "io.crate.client.jdbc.CrateDriver",
    "jdbc:crate://127.0.0.1:2000/",
    ["user", "password"],
    "/home/test/test/test/spark_demo/crate-jdbc-standalone-2.7.0.jar")

print("CONNECTED.....")
curs = conn.cursor()
curs.execute("SELECT * FROM doc.test LIMIT 5")
print("FETCHING.....")
result = curs.fetchall()
print(result)
curs.close()
conn.close()

I am trying to connect CrateDB using jdbc as part of implementing spark, but it is not establishing the connection, it is stuck after printing loading, there is no error as well, here I am unable to fetch or update something to DB.

Crate version 5.2.8

DB is running and accessible on loacl host on the given port, there is no network restrictions, when I tried on colleague’s laptop, getting the same

I would be grateful if anyone can help

1 Like

I’ve just tried jaydebeapi with the latest CrateDB 5.7.1 and 5.2.8 and it works as expected.
Also when using a wrong port or wrong credentials an error should be presented. Same when using a wrong path for the JAR.
Do you use the default config for CrateDB or any customization? If so please share that.

Hello, Thanks for the reply, I use the default configuration for CrateDB, port here I using is port forwarded to the Kubernetes cluster where crateDB hosts, running script and the .jar file is present in the same working directory, kindly let me know if anything else is required, this is the same when we tried from a different PC.

Hi, just an idea, maybe port 2000 is forwarded to 4200 instead of 5432?
Also when you say the port is forwarded “to the Kubernetes cluster”, what is typically used in this context is a resource of type LoadBalancer, is that the case in your setup? (the IP address would not be 127.0.0.1)

Hello Hernnc, Thanks for the response, below is how I connect CrateDB from localhost,

kubectl port-forward svc/ 2000:4200 -n crate

after running this command, Crate console is accessible on localhost on port 2000

I am just providing the code that I try to do with spark

from pyspark.sql import SparkSession

driver = "io.crate.client.jdbc.CrateDriver"
table = "doc.test"  
url = "jdbc:crate://127.0.0.1:2000/?user=crate&sslmode=require"
print("TABLE NAME: ", table)

spark = SparkSession.builder \
    .appName("CrateDBIntegration") \
    .config("spark.jars", "/path/to/driver/crate-jdbc-standalone-2.7.0.jar") \
    .config("spark.executor.extraJavaOptions", "-Djava.security.debug=all -Djavax.net.debug=all") \
    .config("spark.driver.extraJavaOptions", "-Djava.security.debug=all -Djavax.net.debug=all") \
    .getOrCreate()


spark.sparkContext.setLogLevel('DEBUG')

try:
    # Read data from CrateDB table
    print("****** FETCHING DATA ******\n")
    df = spark.read.format("jdbc").option("url", url).option("driver", driver) \
        .option("dbtable", table).load()

    # Show the DataFrame
    print("****** DF******\n")
    print(df.show(5))

except Exception as e:
    print("An error occurred:", e)
finally:
    spark.stop()

I run the file using the command

spark-submit   --jars ./crate-jdbc-standalone-2.7.0.jar   --driver-class-path ./crate-jdbc-standalone-2.7.0.jar   spark_crate.py

below is what I am getting. It prints the string Fetching data I provided, then stuck here.

24/05/15 11:51:09 DEBUG BasicConnectionProvider: JDBC connection initiated with URL: jdbc:crate://127.0.0.1:2000/?user=crate&sslmode=require and properties: {}

@Anandub

For JDBC one must use the PG protocol which CrateDB listen to by default at port 5432.
The Crate console crash is using the HTTP API to access CrateDB at port 4200 by default.

When using the wrong port 4200 (HTTP) using jaydebeapi I can reproduce your described behaviour, it will just hang. Not sure if there is much CrateDB can do on it’s side.
So changing your connection (and port forwarding) to use port 5432 should solve your issue.

Hi SMU, I already have an instance of Postgres running on 5432, hence I am not sure if it will work for me or not, Could you please provide me a sample connection object to work this with Postgres protocol?

Hi,
As you have mapped port 2000 to 4200 to access the CrateDB Admin UI, and port 5432 is already in use for your PostgreSQL, you could do something like

kubectl port-forward svc/ 2001:5432 -n crate

To map port 2001 to 5432 in CrateDB and then tell your application to connect to port 2001 instead of 2000.

1 Like

Hello @hernanc , Thanks a lot for your response, this worked for me, I am just briefing the steps I did, forwarded the port 5432 on crate service to 2001 on my local using

**kubectl port-forward svc/service-url 2001:5432 -n crate**

Then edited the connection string as follows

driver = "io.crate.client.jdbc.CrateDriver"
table = "doc.test"  
url = "jdbc:crate://127.0.0.1:2001/?user=username"
print("TABLE NAME: ", table)

#Initialize SparkSession
spark = SparkSession.builder \
    .appName("CrateDBIntegration") \
    .config("spark.jars", "/path/to/jar/file/crate-jdbc-standalone-2.7.0.jar") \
    .getOrCreate()
    
    
df = spark.read.format("jdbc").option("url", url).option("driver", driver) \
.option("dbtable", table).load()


#when using jaydebeapi, connection string used as follows

conn = jaydebeapi.connect(
    "io.crate.client.jdbc.CrateDriver",
    "jdbc:crate://127.0.0.1:2001/",
    ["user", "password"],
    "/path/to/jar file/crate-jdbc-standalone-2.7.0.jar")

once again Thanks for the support @hernanc @smu

2 Likes