Unable to trigger duplicate key error with multi-record insert

I’m using a CrateDB Cloud free tier cluster for testing.
I’m not sure whether the bug is in CrateDB or asyncpg (or whether it’s a bug at all). Here’s the code:

CREATE TABLE foo (
    id integer,
    type integer,
    res integer,
    ts timestamp without time zone,
    value real,
    CONSTRAINT c PRIMARY KEY (id, type, res, ts)
);

Python/asyncpg code:

import asyncio
import datetime

import asyncpg

async def run():
    conn = await asyncpg.connect(
        host="xxxx.xxxx.eu-west-1.aws.cratedb.net",
        database="doc",
        port=5432,
        user=USERNAME,
        password=PASSWORD,
        ssl=True,
    )

    try:
        records = [
            (1, 2, 3, datetime.datetime(2024, 1, 1, 0, 1, 0), 50.5),
            (1, 2, 4, datetime.datetime(2024, 1, 1, 0, 1, 0), 50.5),
        ]

        while True:
            await conn.executemany(
                "INSERT INTO foo (id, type, res, ts, value) VALUES ($1, $2, $3, $4, $5);",
                records,
            )
            await asyncio.sleep(1)  # don't block the event loop with time.sleep()

    except Exception as e:
        print(f"Exception: {e}")

    finally:
        await conn.close()

asyncio.run(run())

This code runs forever without errors, whereas I would expect it to raise a duplicate key exception as early as the second iteration.
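(Aside: since the batched insert won't surface the conflict, one client-side workaround is to track the primary keys already sent and separate duplicates out before inserting. This is only a sketch under my own assumptions — `pk_of`, `split_new_and_duplicate`, and the in-memory `seen` set are illustrative names, and this only catches duplicates within the process, not rows already stored in the table.)

```python
import datetime

def pk_of(record):
    # Illustrative helper: the primary key here is (id, type, res, ts),
    # i.e. the first four fields of each record tuple.
    return record[:4]

def split_new_and_duplicate(records, seen):
    """Partition records into unseen ones and ones whose PK was already sent.

    `seen` is an in-memory set of primary keys, so this only detects
    duplicates within the lifetime of this process.
    """
    fresh, dupes = [], []
    for rec in records:
        key = pk_of(rec)
        if key in seen:
            dupes.append(rec)
        else:
            seen.add(key)
            fresh.append(rec)
    return fresh, dupes

ts = datetime.datetime(2024, 1, 1, 0, 1, 0)
records = [(1, 2, 3, ts, 50.5), (1, 2, 4, ts, 50.5)]

seen = set()
first_fresh, first_dupes = split_new_and_duplicate(records, seen)
second_fresh, second_dupes = split_new_and_duplicate(records, seen)
```

On the first pass both records are fresh; on the second pass both are flagged as duplicates, which is exactly the condition the server-side insert silently skips.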

Indeed, if after the above has run I execute this from the CrateDB console (single-record insert):

insert into foo (id, type, res, ts, value) values (1, 2, 3, '2024-01-01 00:01:00', 50.5);

I do get the duplicate key error.

However, if I run the following from the console (multiple record insert):

insert into foo (id, type, res, ts, value) values (1, 2, 3, '2024-01-01 00:01:00', 50.5), (1, 2, 4, '2024-01-01 00:01:00', 50.5);

I don’t get any error: INSERT OK, 0 records affected (0.006 seconds).

And if I do the following (multiple record insert with ON CONFLICT clause):

insert into foo (id, type, res, ts, value) values (1, 2, 3, '2024-01-01 00:01:00', 500.5), (1, 2, 4, '2024-01-01 00:01:00', 500.5) ON CONFLICT (id, type, res, ts) DO UPDATE SET value = excluded.value;

the update is performed: INSERT OK, 2 records affected (0.007 seconds).

So is this some kind of expected behavior, or the result of some configuration/optimization/tweaking of the cloud instance?

This is expected: if the bulk insert contains only a single row, CrateDB returns a DuplicateKeyException. When there are multiple rows, CrateDB ignores per-row errors — there are no transactions, so there is nothing to roll back.

If one row of a bulk insert fails, raising an error for the whole operation would be unhelpful, since other rows may already have been inserted.
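This per-row behavior is visible on CrateDB's HTTP endpoint: a bulk operation (sent with `bulk_args`) returns one rowcount per parameter set, with -2 conventionally marking a failed row such as a duplicate key. A minimal sketch of checking such a response — note the payload below is an illustrative example of the response shape, not real server output:

```python
# Illustrative shape of a CrateDB HTTP bulk response; values are made up.
bulk_response = {
    "cols": [],
    "results": [
        {"rowcount": 1},   # this parameter set was inserted
        {"rowcount": -2},  # this one failed (e.g. duplicate key)
    ],
}

def failed_rows(response):
    """Return the indices of parameter sets whose rowcount signals failure."""
    return [i for i, r in enumerate(response["results"]) if r["rowcount"] == -2]

print(failed_rows(bulk_response))
```

So rather than an exception, the client gets a per-row status it can inspect after the fact.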

Maybe we should add a note to the Insert Methods page of the CrateDB Guide, since the only related reference is in the UNNEST documentation.


Well, PostgreSQL throws an error even for bulk inserts, so that’s perhaps why I was expecting the same behavior. But no problem, as long as users are aware of CrateDB’s behavior. Thanks!