I’m running a CrateDB cluster (version 6.0.3) with 4 nodes, hosting several tables, some of which are 25–200 GB in size. The core CREATE TABLE statement for one of these tables is as follows:
CREATE TABLE IF NOT EXISTS "doc"."foobar" (
"ts" TIMESTAMP WITHOUT TIME ZONE DEFAULT current_timestamp(3) NOT NULL,
"measurement" TEXT NOT NULL,
"tags" OBJECT(DYNAMIC),
"fields" OBJECT(DYNAMIC),
"partition_field" TIMESTAMP WITHOUT TIME ZONE GENERATED ALWAYS AS date_trunc('month', "ts")
)
CLUSTERED INTO 4 SHARDS
PARTITIONED BY ("partition_field")
Over time, new fields are added to the tags and fields objects via the HTTP API, primarily using bulk requests. However, I’ve encountered three unusual and non-deterministic issues:
1. Random Key Generation in OBJECT Columns
Occasionally, new keys are spontaneously created in the tags or fields objects. For example, if keys like "host" and "port" exist, a new key like "hort" might appear. The value for these keys is either derived from an existing key or empty. This behavior is inconsistent and cannot be reproduced intentionally.
2. Cross-Contamination Between tags and fields Objects
Sometimes, values intended for one object (e.g., tags) end up in the other (fields), or vice versa. This can also result in the key-generation issue described above.
3. Systematic Value Corruption for Specific Keys
The most puzzling issue involves the tags['host'] key, which typically contains an IP address (e.g., 10.2.251.112). After some operations, the value is altered in a specific pattern:
- Example: 10.2.251.112 becomes 10.51.11.
- The digit "2" is consistently removed from the value. If a "2" appears alone between two dots (e.g., the second segment of 10.2.11), the entire segment is dropped, including the dot.
This behavior is always tied to the digit "2" and occurs only in large tables (>20 GB). Smaller tables do not exhibit these issues.
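To make the pattern in issue 3 concrete, here is a minimal Python sketch that reproduces the observed transformation. It is only a model of what I see in the stored data, not a claim about what CrateDB does internally; the function name simulate_corruption is made up for illustration.

```python
def simulate_corruption(ip: str) -> str:
    """Model the observed corruption: strip every "2" digit from each
    dot-separated segment and drop segments that end up empty.

    Hypothetical helper for illustration only.
    """
    # Remove the digit "2" from every segment of the address.
    stripped = [segment.replace("2", "") for segment in ip.split(".")]
    # A segment that consisted only of "2" is now empty; dropping it
    # also removes its dot when the remaining segments are re-joined.
    return ".".join(segment for segment in stripped if segment)


if __name__ == "__main__":
    print(simulate_corruption("10.2.251.112"))  # prints 10.51.11
```

Running known-good host values through this helper and comparing the result with what is stored could help flag rows that were affected.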
Insert Query
The bulk-insert query is straightforward:
INSERT INTO "doc"."foobar" ("ts", "measurement", "tags", "fields") VALUES (?, ?, ?, ?)
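For reference, the request body sent to the HTTP endpoint (/_sql) has roughly the following shape, with one parameter list per row in bulk_args. The production client is written in PHP; this Python sketch is only an illustration, and the helper name build_bulk_payload and the sample row values are assumptions, not my real data.

```python
import json

STMT = (
    'INSERT INTO "doc"."foobar" ("ts", "measurement", "tags", "fields") '
    "VALUES (?, ?, ?, ?)"
)


def build_bulk_payload(rows):
    """Build the JSON body for a bulk request to the /_sql endpoint.

    Hypothetical helper: each row is a dict with the keys
    ts, measurement, tags and fields.
    """
    # One positional argument list per row, in the column order of STMT.
    bulk_args = [
        [row["ts"], row["measurement"], row["tags"], row["fields"]]
        for row in rows
    ]
    return json.dumps({"stmt": STMT, "bulk_args": bulk_args})


if __name__ == "__main__":
    # Illustrative sample row (made-up values); ts is epoch milliseconds.
    sample = [
        {
            "ts": 1700000000000,
            "measurement": "cpu",
            "tags": {"host": "10.2.251.112", "port": "8080"},
            "fields": {"usage": 0.42},
        }
    ]
    print(build_bulk_payload(sample))
```

The tags and fields values are sent as plain JSON objects, so any new key in them becomes a new sub-column of the dynamic OBJECT.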
Observations and Questions
- The issues occur exclusively in large tables (>20 GB).
- A PHP API client is used for all operations.
- The insert logic is unlikely to be the root cause, as 99% of operations succeed without issues.
Has anyone encountered similar behavior? Are there known issues with dynamic OBJECT columns in CrateDB 6.0.3, particularly with large tables? Could this be related to partitioning, sharding, or bulk operations?
Any insights or suggestions would be greatly appreciated!






