SQL - cast to class java.lang.String error

Hello guys,

I encountered the following error while trying to execute a SELECT query with ORDER BY, LIMIT and OFFSET, and I’m trying to figure out whether I’m missing something in my setup. Does anyone have an idea of what may be causing this?

 #An exception occurred while executing 'SELECT _id , * FROM table_name ORDER BY field_name DESC LIMIT 10 OFFSET 290':  ClassCastException[class org.apache.lucene.util.BytesRef cannot be cast to class java.lang.String (org.apache.lucene.util.BytesRef is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')] Backtrace: #0

NotSerializableExceptionWrapper[class_cast_exception: class org.apache.lucene.util.BytesRef cannot be cast to class java.lang.String (org.apache.lucene.util.BytesRef is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')]

In my scenario, the error occurs only when ‘field_name’ is of type TEXT and the OFFSET is higher than 140, out of a total table row count of 297.

Environment details:
2-node cluster, running in Docker
image: crate:5.0.0

Any help is greatly appreciated.

Kind regards,

Thank you for reporting this issue.
I have not been able to reproduce it; there may be something specific about the table structure or the data that triggers this.
Would it be an option for you to share more details, like the output of the following?

SHOW CREATE TABLE table_name;

Could you perhaps also try different combinations of columns in the SELECT, leaving out field_name for instance?
Does querying all values of field_name with LIMIT 297 succeed if no OFFSET is specified?
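
For example, variations along these lines could help narrow it down (table and column names are placeholders for yours):

-- same paging, but ordering by a non-TEXT column
SELECT _id, * FROM table_name ORDER BY some_other_field DESC LIMIT 10 OFFSET 290;

-- all values of the TEXT column, without an OFFSET
SELECT field_name FROM table_name ORDER BY field_name DESC LIMIT 297;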

Hi Hernan,
thank you for the reply.

I receive the same error when trying to ORDER BY text_field_name without an OFFSET, but if I remove the ORDER BY it works. Another thing to mention is that text_field_name is not unique.

SELECT _id , * FROM table_name ORDER BY text_field_name DESC LIMIT 297 (NOT OK)
SELECT text_field_name FROM table_name LIMIT 10 OFFSET 287 (OK)
SELECT _id , * FROM table_name ORDER BY integer_field_name DESC LIMIT 297 (OK)
SELECT _id , * FROM table_name ORDER BY integer_field_name DESC LIMIT 10 OFFSET 287 (OK)
SELECT _id , * FROM table_name LIMIT 297 (OK)

I’m wondering if this might be related to some aggregation across several indices, ending up with two fields that have the same name but different types in the same index.
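
One way to double-check the column types CrateDB reports, assuming access to the information_schema metadata tables, would be something like:

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'doc' AND table_name = 'table_name'
ORDER BY column_name;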

SHOW CREATE TABLE table_name

CREATE TABLE IF NOT EXISTS "doc"."table_name" (
   "timestamp" BIGINT,
   "text_field_name" TEXT,
   "bigint_field_name" BIGINT,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER,
   "integer_field_name" INTEGER,
   "bigint_field_name" BIGINT,
   "integer_field_name" INTEGER,
   "integer_field_name" INTEGER,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "text_field_name" TEXT,
   "integer_field_name" INTEGER DEFAULT 0,
   "month" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', "timestamp")
)
CLUSTERED INTO 4 SHARDS
PARTITIONED BY ("month")
WITH (
   "allocation.max_retries" = 5,
   "blocks.metadata" = false,
   "blocks.read" = false,
   "blocks.read_only" = false,
   "blocks.read_only_allow_delete" = false,
   "blocks.write" = false,
   codec = 'default',
   column_policy = 'strict',
   "mapping.total_fields.limit" = 1000,
   max_ngram_diff = 1,
   max_shingle_diff = 3,
   number_of_replicas = '0-1',
   "routing.allocation.enable" = 'all',
   "routing.allocation.total_shards_per_node" = -1,
   "store.type" = 'fs',
   "translog.durability" = 'REQUEST',
   "translog.flush_threshold_size" = 536870912,
   "translog.sync_interval" = 5000,
   "unassigned.node_left.delayed_timeout" = 60000,
   "write.wait_for_active_shards" = '1'
)

Thank you for providing these additional details.
I did further tests, including with long strings containing multi-byte characters and bidirectional text, and I still cannot reproduce the issue.
Would you be able to try increasing the logging level to see if we can get a full stack trace?
Logging — CrateDB: Reference
Also, would you be able to test with version 5.0.1?
Thanks again.

Thank you for the follow up.

I can still reproduce the issue with version 5.0.1. If there is a way to dump the database (something like mysqldump), I’m keen to share the schema and some mocked data with you.

I will also try to enable the debug logs. I’ll reply here, hopefully with some more info, tomorrow or the day after tomorrow.

Thanks again for your help

Hi,
sorry to hear this is still an issue. You can back up the data together with the schema with CREATE SNAPSHOT, you can get just the schema with SHOW CREATE TABLE table_name like you did last week, or you can extract just the data with COPY TO.
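
A rough sketch of those options, with hypothetical repository and path names (the repository location must be listed in path.repo):

-- export the table data as JSON files on each node
COPY table_name TO DIRECTORY '/tmp/table_name_export/';

-- or snapshot data plus schema into a filesystem repository
CREATE REPOSITORY backup_repo TYPE fs WITH (location = '/tmp/crate_backups');
CREATE SNAPSHOT backup_repo.snapshot1 TABLE table_name WITH (wait_for_completion = true);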

Hi,

I’ll add some more info here regarding the error trace and the table structure. I exported the table data with the COPY TO command (a really helpful command), but I cannot upload the JSON files because that’s not allowed for new users.

Here’s an example:

{"alert_flag":18,"client_id":5,"client_name":"User 5","client_time":"1666017322","content_policy":"nGXqa","department_id":4,"destination":"iGMxz","destination_type":"NuRyL","dr_discover_id":4,"dr_discover_scan_id":33,"dr_policy_id":3,"epp_server_id":"epp-console-test-client","event_id":50,"event_time_local":"2021-12-26 19:42:05","file_name":"xMUPh","file_size":20,"hashlog_id":"tnKFW","item_details":"ViTxv","item_type":"iFlRv","last_error":"SGsrS","legacy_log_type":4,"loc_log_id":"wdhwr","log_type":"SDR","machine_id":3,"machine_name":"Computer 3","matched_item":"iIMSB","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":73,"object_id":60,"policy_name":"Policy 3","timestamp":1635442108000,"status":0,"month":1633046400000}
{"alert_flag":81,"client_id":2,"client_name":"User 2","client_time":"1666017322","content_policy":"tOIgO","department_id":3,"destination":"alrbz","destination_type":"tZLfx","dr_discover_id":5,"dr_discover_scan_id":31,"dr_policy_id":4,"epp_server_id":"epp-console-test-client","event_id":41,"event_time_local":"2022-10-08 03:21:25","file_name":"aSqSb","file_size":71,"hashlog_id":"KKaBK","item_details":"RlCTF","item_type":"roHjp","last_error":"TSIdv","legacy_log_type":4,"loc_log_id":"ZOfnK","log_type":"SDR","machine_id":4,"machine_name":"Computer 4","matched_item":"pOutJ","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":88,"object_id":39,"policy_name":"Policy 4","timestamp":1635427759000,"status":0,"month":1633046400000}
{"alert_flag":48,"client_id":4,"client_name":"User 4","client_time":"1666017322","content_policy":"udLuB","department_id":2,"destination":"WqmAJ","destination_type":"DbOat","dr_discover_id":4,"dr_discover_scan_id":79,"dr_policy_id":3,"epp_server_id":"epp-console-test-client","event_id":43,"event_time_local":"2021-11-22 18:58:10","file_name":"FHovS","file_size":84,"hashlog_id":"mLKSX","item_details":"WjdkI","item_type":"kMvlc","last_error":"kzkjw","legacy_log_type":4,"loc_log_id":"ogjth","log_type":"SDR","machine_id":2,"machine_name":"Computer 2","matched_item":"valCh","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":4,"object_id":16,"policy_name":"Policy 3","timestamp":1634923955000,"status":0,"month":1633046400000}

Hope it helps.

Query causing the issue

SELECT _id , * FROM sdr_logs ORDER BY item_details ASC LIMIT 500

Error trace

java.lang.ClassCastException: class org.apache.lucene.util.BytesRef cannot be cast to class java.lang.String (org.apache.lucene.util.BytesRef is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
	at io.crate.types.StringType$1.rangeQuery(StringType.java:68)
	at io.crate.execution.engine.collect.collectors.OptimizeQueryForSearchAfter.apply(OptimizeQueryForSearchAfter.java:101)
	at io.crate.execution.engine.collect.collectors.OptimizeQueryForSearchAfter.apply(OptimizeQueryForSearchAfter.java:42)
	at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.query(LuceneOrderedDocCollector.java:215)
	at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.searchMore(LuceneOrderedDocCollector.java:187)
	at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.collect(LuceneOrderedDocCollector.java:137)
	at io.crate.execution.engine.collect.collectors.OrderedDocCollector.get(OrderedDocCollector.java:82)
	at io.crate.execution.engine.collect.collectors.OrderedLuceneBatchIteratorFactory$Factory.loadFrom(OrderedLuceneBatchIteratorFactory.java:127)
	at io.crate.execution.engine.collect.collectors.OrderedLuceneBatchIteratorFactory$Factory.tryFetchMore(OrderedLuceneBatchIteratorFactory.java:120)
	at io.crate.execution.engine.distribution.merge.BatchPagingIterator.loadNextBatch(BatchPagingIterator.java:116)
	at io.crate.data.MappedForwardingBatchIterator.loadNextBatch(MappedForwardingBatchIterator.java:51)
	at io.crate.data.LimitingBatchIterator.loadNextBatch(LimitingBatchIterator.java:69)
	at io.crate.data.MappedForwardingBatchIterator.loadNextBatch(MappedForwardingBatchIterator.java:51)
	at io.crate.data.AsyncFlatMapBatchIterator.loadNextBatch(AsyncFlatMapBatchIterator.java:111)
	at io.crate.action.sql.RowConsumerToResultReceiver.consumeIt(RowConsumerToResultReceiver.java:84)
	at io.crate.action.sql.RowConsumerToResultReceiver.lambda$consumeIt$1(RowConsumerToResultReceiver.java:86)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:887)
	at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2325)
	at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:144)
	at io.crate.concurrent.KillableCompletionStage.whenComplete(KillableCompletionStage.java:232)
	at io.crate.concurrent.KillableCompletionStage.whenComplete(KillableCompletionStage.java:35)
	at io.crate.action.sql.RowConsumerToResultReceiver.consumeIt(RowConsumerToResultReceiver.java:84)
	at io.crate.action.sql.RowConsumerToResultReceiver.lambda$consumeIt$1(RowConsumerToResultReceiver.java:86)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

Table structure

CREATE TABLE IF NOT EXISTS "doc"."sdr_logs" (
   "timestamp" BIGINT,
   "epp_server_id" TEXT,
   "object_id" BIGINT,
   "action_id" INTEGER,
   "loc_log_id" TEXT,
   "event_id" INTEGER,
   "machine_id" INTEGER,
   "machine_name" TEXT,
   "client_id" INTEGER,
   "client_name" TEXT,
   "destination_type" TEXT,
   "destination" TEXT,
   "file_name" TEXT,
   "content_policy" TEXT,
   "item_type" TEXT,
   "matched_item" TEXT,
   "item_details" TEXT,
   "event_time_local" TEXT,
   "alert_flag" INTEGER,
   "nr_reports" INTEGER,
   "file_size" BIGINT,
   "legacy_log_type" INTEGER,
   "object_state" INTEGER,
   "dr_discover_scan_id" INTEGER,
   "modified_at" TEXT,
   "dr_policy_id" INTEGER,
   "dr_discover_id" INTEGER,
   "policy_name" TEXT,
   "hashlog_id" TEXT,
   "last_error" TEXT,
   "department_id" INTEGER,
   "client_time" TEXT,
   "log_type" TEXT,
   "sent_at" TEXT,
   "status" INTEGER DEFAULT 0,
   "month" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', "timestamp")
)
CLUSTERED INTO 4 SHARDS
PARTITIONED BY ("month")
WITH (
   "allocation.max_retries" = 5,
   "blocks.metadata" = false,
   "blocks.read" = false,
   "blocks.read_only" = false,
   "blocks.read_only_allow_delete" = false,
   "blocks.write" = false,
   codec = 'default',
   column_policy = 'strict',
   "mapping.total_fields.limit" = 1000,
   max_ngram_diff = 1,
   max_shingle_diff = 3,
   number_of_replicas = '0-1',
   "routing.allocation.enable" = 'all',
   "routing.allocation.total_shards_per_node" = -1,
   "store.type" = 'fs',
   "translog.durability" = 'REQUEST',
   "translog.flush_threshold_size" = 536870912,
   "translog.sync_interval" = 5000,
   "unassigned.node_left.delayed_timeout" = 60000,
   "write.wait_for_active_shards" = '1'
)

This could also be related to the script I’m using to insert mock data into CrateDB. The script generates random numbers and strings used in the insert query, and it seems there might be some sort of conflict between the data.

I decreased the limit of the insert sequence to only a few rows per request, and I could reproduce the error only once the table row count exceeded 300.
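
A quick way to watch how those rows spread over the shards and partitions while inserting, assuming access to the sys.shards table, might be:

-- document count per shard and partition
SELECT partition_ident, id, num_docs
FROM sys.shards
WHERE schema_name = 'doc' AND table_name = 'sdr_logs'
ORDER BY partition_ident, id;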

I imported the 3 records that you shared to one of my test environments, but I was unable to reproduce the issue with that data.

I also tried creating fake data with the cr8 tools
pip install --user cr8
cr8 insert-fake-data --hosts localhost:4200 --table doc.sdr_logs --num-records 301
SELECT _id , * FROM "doc"."sdr_logs" ORDER BY item_details DESC LIMIT 297;
SELECT _id , * FROM sdr_logs ORDER BY item_details ASC LIMIT 500

and the SELECTs worked without issues.

I could take a look at the script that you are using to generate the mock data if you want to share it, or maybe you could take a snapshot, make it available at a URL somewhere, and message me the link? We also have Enterprise Support available if there are sensitive details you prefer not to share publicly here in the Community.


Hi,
thanks again for taking the time to check this.

I found what is causing the cast error. As you may have noticed, the table is partitioned based on the timestamp field; I want to create a partition every month in order to be able to delete large amounts of data as quickly as possible.

I was not able to reproduce the issue with one partition. See the first screenshot below.

But when I create several partitions, I get the cast error. See the second screenshot below.

This is the function I use to generate the timestamps for the mocked data

function getTimestamp(): int
{
    // pick a random Unix timestamp (in seconds) between 12 months ago and now
    $start = new DateTime();
    $end = new DateTime();
    $start->modify("-12 month");

    return mt_rand($start->getTimestamp(), $end->getTimestamp());
}
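
Since the month partition column is generated from these timestamps, a quick check of how many monthly partitions the mocked data actually lands in could look like this (just a sketch against the schema above):

-- rows per monthly partition
SELECT "month", count(*) AS row_count
FROM doc.sdr_logs
GROUP BY "month"
ORDER BY "month";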

I’m not sure if this can occur in a production environment, but I think it is worth checking in order to get to the bottom of what is really happening in this case.

Below are the two examples I mentioned above:

query one partition: Screenshot by Lightshot (OK)
query several partitions: Screenshot by Lightshot (NOT OK)

Dear Alin,

I’ve just bumped your trust level; uploading attachments should work now.

With kind regards,
Andreas.

Hello,
Thank you Andreas

Thank you Alin for the additional details.
I just wanted to let you know that I managed to reproduce the issue and I have raised class_cast_exception in query with ORDER BY against partitioned table with replicas · Issue #13166 · crate/crate · GitHub
While we wait for that to be reviewed by the developers, you could try to work around the issue by either adding more memory and heap space to your environment or changing CLUSTERED INTO 4 SHARDS to CLUSTERED INTO 2 SHARDS.
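
A minimal sketch of the second workaround, using a reduced copy of the table with only a couple of its columns and a hypothetical table name, just to verify whether 2 shards avoids the error:

CREATE TABLE IF NOT EXISTS doc.sdr_logs_2shards (
   "timestamp" BIGINT,
   "item_details" TEXT,
   "month" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', "timestamp")
)
CLUSTERED INTO 2 SHARDS
PARTITIONED BY ("month");

-- copy the relevant columns over; "month" is computed automatically
INSERT INTO doc.sdr_logs_2shards ("timestamp", "item_details")
SELECT "timestamp", "item_details" FROM doc.sdr_logs;

-- re-run the failing ORDER BY query against the 2-shard copy
SELECT _id, * FROM doc.sdr_logs_2shards ORDER BY item_details ASC LIMIT 500;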


Thank you Hernan for addressing the issue.

I did try increasing CRATE_HEAP_SIZE to 1G and clustering the table into 2 shards; unfortunately, the error still persists. It works with one shard, but in that case the data is strangely duplicated, and we also lose the benefits sharding provides.

I’m looking forward to the developers reviewing the ticket, and to further testing CrateDB.

My bad about the last remark regarding the duplicate data: I had two workers reading from the same data source, making concurrent calls to CrateDB.
