Hello,
I am currently running a 3-node Crate cluster on Kubernetes, version 4.6.6. I have started monitoring CPU usage with Prometheus, and every day I can see a CPU consumption spike between 16:30 and 16:40 on one of the pods. That pod is the one marked as master in the Crate management web interface. Looking at the logs of that pod, I can see the following exception:
[2022-10-18T16:40:14,239][ERROR][i.c.s.TableStatsService ] [crate-bemp-local-1] Error running periodic ANALYZE
java.util.concurrent.CompletionException: org.elasticsearch.transport.RemoteTransportException: [crate-bemp-local-1][10.2.4.193:4300][internal:crate:sql/analyze/fetch_samples]
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?]
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2158) ~[?:?]
at io.crate.action.FutureActionListener.onFailure(FutureActionListener.java:52) ~[crate-server.jar:?]
at io.crate.execution.support.MultiActionListener.countdown(MultiActionListener.java:85) ~[crate-server.jar:?]
at io.crate.execution.support.MultiActionListener.onResponse(MultiActionListener.java:70) ~[crate-server.jar:?]
at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:57) ~[crate-server.jar:?]
at org.elasticsearch.transport.TransportService$TimeoutResponseHandler.handleResponse(TransportService.java:1040) ~[crate-server.jar:?]
at org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:224) ~[crate-server.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[crate-server.jar:?]
at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutor.execute(EsExecutors.java:161) ~[crate-server.jar:?]
at org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:216) ~[crate-server.jar:?]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:135) ~[crate-server.jar:?]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:104) ~[crate-server.jar:?]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:704) ~[crate-server.jar:?]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:53) ~[crate-server.jar:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) ~[netty-codec-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) ~[netty-codec-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[netty-handler-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.65.Final.jar:4.1.65.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) ~[netty-transport-native-epoll-4.1.65.Final-linux-x86_64.jar:4.1.65.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) ~[netty-transport-native-epoll-4.1.65.Final-linux-x86_64.jar:4.1.65.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[netty-transport-native-epoll-4.1.65.Final-linux-x86_64.jar:4.1.65.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.65.Final.jar:4.1.65.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.65.Final.jar:4.1.65.Final]
at java.lang.Thread.run(Thread.java:831) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [crate-bemp-local-1][10.2.4.193:4300][internal:crate:sql/analyze/fetch_samples]
Caused by: java.lang.IllegalStateException: unexpected docvalues type NONE for field 'exception' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type.
at org.apache.lucene.index.DocValues.checkField(DocValues.java:317) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
at org.apache.lucene.index.DocValues.getSortedSet(DocValues.java:410) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
at io.crate.expression.reference.doc.lucene.BytesRefColumnReference.setNextReader(BytesRefColumnReference.java:69) ~[crate-server.jar:?]
at io.crate.statistics.ReservoirSampler$DocIdToRow.apply(ReservoirSampler.java:255) ~[crate-server.jar:?]
at io.crate.statistics.ReservoirSampler.getSamples(ReservoirSampler.java:215) ~[crate-server.jar:?]
at io.crate.statistics.ReservoirSampler.getSamples(ReservoirSampler.java:126) ~[crate-server.jar:?]
at io.crate.statistics.TransportAnalyzeAction.lambda$new$1(TransportAnalyzeAction.java:114) ~[crate-server.jar:?]
at io.crate.execution.support.NodeActionRequestHandler.messageReceived(NodeActionRequestHandler.java:49) [crate-server.jar:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [crate-server.jar:?]
at org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:698) [crate-server.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [crate-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
... 1 more
I think the CPU consumption goes back to normal just after this message. Here is the CPU consumption chart (the unit is millicores):
Should I be worried about this error? Is it normal? As mentioned, it is happening every day.
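In case it is relevant: I have not changed any statistics-related settings, so if I understand the documentation correctly the daily timing should come from the default stats.service.interval of 24h. This is roughly what I used to check the settings and to try to reproduce the spike manually (the exact column paths are my assumption based on the docs):

    -- Check the statistics service settings on the cluster (defaults in my case):
    SELECT settings['stats']['enabled'] AS stats_enabled,
           settings['stats']['service']['interval'] AS analyze_interval
    FROM sys.cluster;

    -- Trigger statistics collection manually, to see whether the CPU spike
    -- (and the same exception) can be reproduced outside the scheduled run:
    ANALYZE;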
Best regards,