Hi guys
I am having some issues ingesting data with Python and the asyncpg package. I have a DataFrame with the following data types:
int32, datetime64[ns, UTC], object, float64, int32
…and want to ingest it with:
await connection.copy_records_to_table(my_table_name, records=df.itertuples(index=False), columns=df.columns.to_list(), timeout=10)
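For context, the DataFrame looks roughly like this (the column names here are placeholders, only the dtypes match my real data):

import pandas as pd

# build a small DataFrame with the same dtypes as my real data
df = pd.DataFrame({
    "id": pd.Series([1, 2], dtype="int32"),
    "ts": pd.to_datetime(["2024-01-01T00:00:00Z", "2024-01-01T01:00:00Z"], utc=True),
    "name": ["a", "b"],
    "value": [1.5, 2.5],
    "flag": pd.Series([0, 1], dtype="int32"),
})
print(df.dtypes)  # int32, datetime64[ns, UTC], object, float64, int32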
I receive the following error:
---------------------------------------------------------------------------
InternalServerError Traceback (most recent call last)
Cell In[229], line 5
1 connection = await pool.acquire(timeout=60)
3 table = "dataset"
----> 5 s = await connection.copy_records_to_table(table, records=df.itertuples(index=False), columns=df.columns.to_list(), timeout=10)
File c:\Users\SCPA\Azure_DevOps\ALD_Project\venv_app\Lib\site-packages\asyncpg\connection.py:1081, in Connection.copy_records_to_table(self, table_name, records, columns, schema_name, timeout, where)
1076 opts = '(FORMAT binary)'
1078 copy_stmt = 'COPY {tab}{cols} FROM STDIN {opts} {cond}'.format(
1079 tab=tabname, cols=cols, opts=opts, cond=cond)
-> 1081 return await self._protocol.copy_in(
1082 copy_stmt, None, None, records, intro_ps._state, timeout)
File c:\Users\SCPA\Azure_DevOps\ALD_Project\venv_app\Lib\site-packages\asyncpg\protocol\protocol.pyx:565, in copy_in()
InternalServerError: line 1:114: extraneous input 'binary' expecting {',', ')'}
I use the following versions:
asyncpg 0.29.0
sqlalchemy-cratedb 0.37.0
BTW: the ingestion with your crate package works fine, but I would like to switch to asyncpg because it is async, so I can use parallel connections and a connection pool (see the sketch below). What I have also seen with the crate package is that when you define multiple hosts in the connection and one of them gets disconnected, it is removed from the host list and not automatically added back once the node becomes available again.
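For reference, this is roughly how I set up the pool used in the traceback above (host, credentials and pool sizes are placeholders; the table is the same "dataset" table as above):

import asyncio
import asyncpg

async def main():
    # placeholder connection details; CrateDB speaks the PostgreSQL wire protocol
    pool = await asyncpg.create_pool(
        host="localhost", port=5432,
        user="crate", database="doc",
        min_size=1, max_size=4,   # lets me run parallel connections
    )
    connection = await pool.acquire(timeout=60)
    try:
        await connection.copy_records_to_table(
            "dataset",
            records=df.itertuples(index=False),
            columns=df.columns.to_list(),
            timeout=10,
        )
    finally:
        await pool.release(connection)
        await pool.close()

asyncio.run(main())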
Thanks in advance for your help.
Regards, Schabi