Hi,
I’ll add some more information here regarding the error trace and the table structure. I exported the table data with the COPY TO command (a really helpful command, by the way), but I can’t upload the JSON files because attachments aren’t allowed for new users.
Here are a few example rows:
{"alert_flag":18,"client_id":5,"client_name":"User 5","client_time":"1666017322","content_policy":"nGXqa","department_id":4,"destination":"iGMxz","destination_type":"NuRyL","dr_discover_id":4,"dr_discover_scan_id":33,"dr_policy_id":3,"epp_server_id":"epp-console-test-client","event_id":50,"event_time_local":"2021-12-26 19:42:05","file_name":"xMUPh","file_size":20,"hashlog_id":"tnKFW","item_details":"ViTxv","item_type":"iFlRv","last_error":"SGsrS","legacy_log_type":4,"loc_log_id":"wdhwr","log_type":"SDR","machine_id":3,"machine_name":"Computer 3","matched_item":"iIMSB","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":73,"object_id":60,"policy_name":"Policy 3","timestamp":1635442108000,"status":0,"month":1633046400000}
{"alert_flag":81,"client_id":2,"client_name":"User 2","client_time":"1666017322","content_policy":"tOIgO","department_id":3,"destination":"alrbz","destination_type":"tZLfx","dr_discover_id":5,"dr_discover_scan_id":31,"dr_policy_id":4,"epp_server_id":"epp-console-test-client","event_id":41,"event_time_local":"2022-10-08 03:21:25","file_name":"aSqSb","file_size":71,"hashlog_id":"KKaBK","item_details":"RlCTF","item_type":"roHjp","last_error":"TSIdv","legacy_log_type":4,"loc_log_id":"ZOfnK","log_type":"SDR","machine_id":4,"machine_name":"Computer 4","matched_item":"pOutJ","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":88,"object_id":39,"policy_name":"Policy 4","timestamp":1635427759000,"status":0,"month":1633046400000}
{"alert_flag":48,"client_id":4,"client_name":"User 4","client_time":"1666017322","content_policy":"udLuB","department_id":2,"destination":"WqmAJ","destination_type":"DbOat","dr_discover_id":4,"dr_discover_scan_id":79,"dr_policy_id":3,"epp_server_id":"epp-console-test-client","event_id":43,"event_time_local":"2021-11-22 18:58:10","file_name":"FHovS","file_size":84,"hashlog_id":"mLKSX","item_details":"WjdkI","item_type":"kMvlc","last_error":"kzkjw","legacy_log_type":4,"loc_log_id":"ogjth","log_type":"SDR","machine_id":2,"machine_name":"Computer 2","matched_item":"valCh","modified_at":"2021-01-01T14:48:00.000Z","nr_reports":4,"object_id":16,"policy_name":"Policy 3","timestamp":1634923955000,"status":0,"month":1633046400000}
Hope it helps.
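For reference, the export was done with COPY TO roughly like this (the target directory below is just a placeholder, not the path I actually used; by default the output is one JSON object per line, which is what the rows above are):
COPY doc.sdr_logs TO DIRECTORY '/tmp/sdr_logs_export';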
Query causing the issue
SELECT _id, * FROM sdr_logs ORDER BY item_details ASC LIMIT 500
Error trace
java.lang.ClassCastException: class org.apache.lucene.util.BytesRef cannot be cast to class java.lang.String (org.apache.lucene.util.BytesRef is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
at io.crate.types.StringType$1.rangeQuery(StringType.java:68)
at io.crate.execution.engine.collect.collectors.OptimizeQueryForSearchAfter.apply(OptimizeQueryForSearchAfter.java:101)
at io.crate.execution.engine.collect.collectors.OptimizeQueryForSearchAfter.apply(OptimizeQueryForSearchAfter.java:42)
at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.query(LuceneOrderedDocCollector.java:215)
at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.searchMore(LuceneOrderedDocCollector.java:187)
at io.crate.execution.engine.collect.collectors.LuceneOrderedDocCollector.collect(LuceneOrderedDocCollector.java:137)
at io.crate.execution.engine.collect.collectors.OrderedDocCollector.get(OrderedDocCollector.java:82)
at io.crate.execution.engine.collect.collectors.OrderedLuceneBatchIteratorFactory$Factory.loadFrom(OrderedLuceneBatchIteratorFactory.java:127)
at io.crate.execution.engine.collect.collectors.OrderedLuceneBatchIteratorFactory$Factory.tryFetchMore(OrderedLuceneBatchIteratorFactory.java:120)
at io.crate.execution.engine.distribution.merge.BatchPagingIterator.loadNextBatch(BatchPagingIterator.java:116)
at io.crate.data.MappedForwardingBatchIterator.loadNextBatch(MappedForwardingBatchIterator.java:51)
at io.crate.data.LimitingBatchIterator.loadNextBatch(LimitingBatchIterator.java:69)
at io.crate.data.MappedForwardingBatchIterator.loadNextBatch(MappedForwardingBatchIterator.java:51)
at io.crate.data.AsyncFlatMapBatchIterator.loadNextBatch(AsyncFlatMapBatchIterator.java:111)
at io.crate.action.sql.RowConsumerToResultReceiver.consumeIt(RowConsumerToResultReceiver.java:84)
at io.crate.action.sql.RowConsumerToResultReceiver.lambda$consumeIt$1(RowConsumerToResultReceiver.java:86)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:887)
at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2325)
at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:144)
at io.crate.concurrent.KillableCompletionStage.whenComplete(KillableCompletionStage.java:232)
at io.crate.concurrent.KillableCompletionStage.whenComplete(KillableCompletionStage.java:35)
at io.crate.action.sql.RowConsumerToResultReceiver.consumeIt(RowConsumerToResultReceiver.java:84)
at io.crate.action.sql.RowConsumerToResultReceiver.lambda$consumeIt$1(RowConsumerToResultReceiver.java:86)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Table structure
CREATE TABLE IF NOT EXISTS "doc"."sdr_logs" (
"timestamp" BIGINT,
"epp_server_id" TEXT,
"object_id" BIGINT,
"action_id" INTEGER,
"loc_log_id" TEXT,
"event_id" INTEGER,
"machine_id" INTEGER,
"machine_name" TEXT,
"client_id" INTEGER,
"client_name" TEXT,
"destination_type" TEXT,
"destination" TEXT,
"file_name" TEXT,
"content_policy" TEXT,
"item_type" TEXT,
"matched_item" TEXT,
"item_details" TEXT,
"event_time_local" TEXT,
"alert_flag" INTEGER,
"nr_reports" INTEGER,
"file_size" BIGINT,
"legacy_log_type" INTEGER,
"object_state" INTEGER,
"dr_discover_scan_id" INTEGER,
"modified_at" TEXT,
"dr_policy_id" INTEGER,
"dr_discover_id" INTEGER,
"policy_name" TEXT,
"hashlog_id" TEXT,
"last_error" TEXT,
"department_id" INTEGER,
"client_time" TEXT,
"log_type" TEXT,
"sent_at" TEXT,
"status" INTEGER DEFAULT 0,
"month" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', "timestamp")
)
CLUSTERED INTO 4 SHARDS
PARTITIONED BY ("month")
WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = false,
"blocks.read" = false,
"blocks.read_only" = false,
"blocks.read_only_allow_delete" = false,
"blocks.write" = false,
codec = 'default',
column_policy = 'strict',
"mapping.total_fields.limit" = 1000,
max_ngram_diff = 1,
max_shingle_diff = 3,
number_of_replicas = '0-1',
"routing.allocation.enable" = 'all',
"routing.allocation.total_shards_per_node" = -1,
"store.type" = 'fs',
"translog.durability" = 'REQUEST',
"translog.flush_threshold_size" = 536870912,
"translog.sync_interval" = 5000,
"unassigned.node_left.delayed_timeout" = 60000,
"write.wait_for_active_shards" = '1'
)