Couldn't complete updating a table with more than 200 million records

We are using crateDB 5.5.2, and there is a table pgdb_scm.pms_pms_short_of_material_log_l with more than 200 million records, below sql has been running for over a day without completing.

update  pgdb_scm.pms_pms_short_of_material_log_l
        set new_qty_delivery_limit_stock = qty_delivery_limit_stock
        where qty_delivery_limit_stock is not null;

Table definition.

cr> show create table pgdb_scm.pms_pms_short_of_material_log_l;
+---------------------------------------------------------------------------+
| SHOW CREATE TABLE pgdb_scm.pms_pms_short_of_material_log_l                |
+---------------------------------------------------------------------------+
| CREATE TABLE IF NOT EXISTS "pgdb_scm"."pms_pms_short_of_material_log_l" ( |
|    "id" VARCHAR(36) NOT NULL,                                             |
|    "version_id" VARCHAR(36),                                              |
|    "version_time" VARCHAR(100),                                           |
|    "allocate_seq" INTEGER,                                                |
|    "org_id" VARCHAR(100),                                                 |
|    "org_code" VARCHAR(100),                                               |
|    "org_name" VARCHAR(200),                                               |
|    "ebs_organization_id" VARCHAR(100),                                    |
|    "item_id" VARCHAR(36),                                                 |
|    "item_code" VARCHAR(50),                                               |
|    "factory_code" VARCHAR(50),                                            |
|    "ebs_factory_id" VARCHAR(36),                                          |
|    "bill_no" VARCHAR(200),                                                |
|    "bill_id" VARCHAR(200),                                                |
|    "bill_status" VARCHAR(200),                                            |
|    "lot_number" VARCHAR(200),                                             |
|    "product_id" VARCHAR(200),                                             |
|    "product_code" VARCHAR(50),                                            |
|    "product_model" VARCHAR(100),                                          |
|    "req_date" VARCHAR(48),                                                |
|    "req_qty" INTEGER,                                                     |
|    "qty_mo" INTEGER,                                                      |
|    "qty_issued" INTEGER,                                                  |
|    "qty_remain_issued" INTEGER,                                           |
|    "qty_onhand_remain" INTEGER,                                           |
|    "qty_short_onhand" INTEGER,                                            |
|    "om_user_name" VARCHAR(200),                                           |
|    "assistant_audit_time" VARCHAR(48),                                    |
|    "to_pur_time" VARCHAR(48),                                             |
|    "customer_name" VARCHAR(300),                                          |
|    "final_customer_name" VARCHAR(300),                                    |
|    "req_type_name" VARCHAR(200),                                          |
|    "qty_short_theory" INTEGER,                                            |
|    "qty_short_asn" VARCHAR(50),                                           |
|    "qty_short_plan" VARCHAR(50),                                          |
|    "client_org_id" VARCHAR(50),                                           |
|    "plan_ship_date" VARCHAR(48),                                          |
|    "sale_area" VARCHAR(200),                                              |
|    "trial_production" VARCHAR(200),                                       |
|    "bill_class_code" VARCHAR(200),                                        |
|    "promise_date" VARCHAR(48),                                            |
|    "online_time" VARCHAR(48),                                             |
|    "req_source_type" VARCHAR(200),                                        |
|    "can_replace_flag" VARCHAR(100),                                       |
|    "cus_model" VARCHAR(500),                                              |
|    "can_replace_items" VARCHAR(2000),                                     |
|    "plan_change_flag" VARCHAR(200),                                       |
|    "real_customer" VARCHAR(500),                                          |
|    "item_first_in" VARCHAR(30),                                           |
|    "item_lot_no" VARCHAR(100),                                            |
|    "req_remark" VARCHAR(2000),                                            |
|    "project_num" VARCHAR(200),                                            |
|    "source_req_no" TEXT,                                                  |
|    "is_promise" VARCHAR(200),                                             |
|    "client_factory_no" VARCHAR(200),                                      |
|    "mo_type" VARCHAR(200),                                                |
|    "source_req_type_name" VARCHAR(200),                                   |
|    "ebs_mo_id" VARCHAR(200),                                              |
|    "bu_name" VARCHAR(200),                                                |
|    "so_group_id" TEXT,                                                    |
|    "mo_mode" VARCHAR(50),                                                 |
|    "mo_creation_time" VARCHAR(48),                                        |
|    "transfer_doc_no" VARCHAR(200),                                        |
|    "crt_time" VARCHAR(48),                                                |
|    "crt_user" VARCHAR(50),                                                |
|    "crt_name" VARCHAR(50),                                                |
|    "crt_host" VARCHAR(100),                                               |
|    "upd_time" VARCHAR(48),                                                |
|    "upd_user" VARCHAR(100),                                               |
|    "upd_name" VARCHAR(50),                                                |
|    "upd_host" VARCHAR(100),                                               |
|    "version" BIGINT,                                                      |
|    "product_desc" VARCHAR(1000),                                          |
|    "refresh_batch_id" VARCHAR(36),                                        |
|    "is_need_delivery" VARCHAR(36),                                        |
|    "change_order_source_id" VARCHAR(200),                                 |
|    "item_desc" VARCHAR(1000),                                             |
|    "item_status" VARCHAR(200),                                            |
|    "item_class" VARCHAR(500),                                             |
|    "item_type_inv" VARCHAR(500),                                          |
|    "item_type_buyer" VARCHAR(500),                                        |
|    "buyer_group" VARCHAR(300),                                            |
|    "buyer_main_user" VARCHAR(200),                                        |
|    "replace_group" VARCHAR(200),                                          |
|    "vendor_name_po" VARCHAR(200),                                         |
|    "vendor_name_asl" VARCHAR(50),                                         |
|    "fixed_lot_multiplier" INTEGER,                                        |
|    "qty_last_30_item" INTEGER,                                            |
|    "qty_re_wip_stock" INTEGER,                                            |
|    "qty_re_so_stock" INTEGER,                                             |
|    "qty_re_so_stock_ava_day" INTEGER,                                     |
|    "qty_re_forcast_stock" INTEGER,                                        |
|    "qty_sample" INTEGER,                                                  |
|    "pur_cycle" INTEGER,                                                   |
|    "last_stockin_time" VARCHAR(48),                                       |
|    "apply_dept" VARCHAR(240),                                             |
|    "item_type" VARCHAR(300),                                              |
|    "qty_fore_re_so_stock" INTEGER,                                        |
|    "change_level" VARCHAR(200),                                           |
|    "risk_warning" VARCHAR(50),                                            |
|    "qty_onhand_item" INTEGER,                                             |
|    "qty_wait_check_item" INTEGER,                                         |
|    "n_days_reqs_ontime" INTEGER,                                          |
|    "planning_make_buy_code" VARCHAR(300),                                 |
|    "buyer_main_account" VARCHAR(300),                                     |
|    "sourcing_strategy" VARCHAR(300),                                      |
|    "qty_can_release_item" INTEGER,                                        |
|    "bom_full" VARCHAR(300),                                               |
|    "qty_delivery_limit_stock" INTEGER,                                    |
|    "qty_so_req" INTEGER,                                                  |
|    "qty_wip_req" INTEGER,                                                 |
|    "qty_pur_onload_item" INTEGER,                                         |
|    "ckd" VARCHAR(300),                                                    |
|    "is_pack_flag" VARCHAR(50),                                            |
|    "supply_org_code" VARCHAR(300),                                        |
|    "source_approver" VARCHAR(256),                                        |
|    "old_material_spec" VARCHAR(600),                                      |
|    "item_code_old" VARCHAR(300),                                          |
|    "supply_org_name" VARCHAR(300),                                        |
|    "qty_asn_wait_rec" INTEGER,                                            |
|    "qty_wait_check_factory" INTEGER,                                      |
|    "qty_rele_wait_asn" INTEGER,                                           |
|    "qty_transfer_onload" INTEGER,                                         |
|    "qty_can_release_factory" INTEGER,                                     |
|    "qty_allocate_in" INTEGER,                                             |
|    "qty_allocate_out" INTEGER,                                            |
|    "qty_rec_exception" INTEGER,                                           |
|    "delivery_date" VARCHAR(48),                                           |
|    "delivery_no" TEXT,                                                    |
|    "delivery_method" TEXT,                                                |
|    "tracking_number" TEXT,                                                |
|    "delivery_contact" TEXT,                                               |
|    "qty_onhand_special" INTEGER,                                          |
|    "qty_onhand_factory" INTEGER,                                          |
|    "shipped_date" VARCHAR(48),                                            |
|    "qty_last_30_factory" INTEGER,                                         |
|    "qty_safe_stock" INTEGER,                                              |
|    "is_agent_purchase" VARCHAR(100),                                      |
|    "jit_flag" VARCHAR(50),                                                |
|    "qty_allocate_wait_check" INTEGER,                                     |
|    "qty_onhand_bad" INTEGER,                                              |
|    "qty_exceed_pcb" INTEGER,                                              |
|    "qty_other_in_uncheck" INTEGER,                                        |
|    "qty_asn_check" INTEGER,                                               |
|    "qty_other_check" INTEGER,                                             |
|    "qty_not_approved" INTEGER,                                            |
|    "qty_approved" INTEGER,                                                |
|    "latest_address_node" TEXT,                                            |
|    "qty_pur_onload_factory" INTEGER,                                      |
|    "jit_vendor_name" VARCHAR(512),                                        |
|    "remark" VARCHAR(4000),                                                |
|    "inner_remark" VARCHAR(4000),                                          |
|    "planned_delivery_time" VARCHAR(48),                                   |
|    "auto_reply" VARCHAR(4000),                                            |
|    "is_locked" VARCHAR(50),                                               |
|    "promise_match_note" TEXT,                                             |
|    "shortage_resp" VARCHAR(4000),                                         |
|    "item_no_new" VARCHAR(200),                                            |
|    "item_id_new" VARCHAR(50),                                             |
|    "sample_bill_info" TEXT,                                               |
|    "pur_change_bill_info" TEXT,                                           |
|    "alarm_id" VARCHAR(36),                                                |
|    "alarm_no" VARCHAR(128),                                               |
|    "alarm_status" VARCHAR(128),                                           |
|    "alarm_reason_code" VARCHAR(200),                                      |
|    "is_keep_plan_ship_date" VARCHAR(10),                                  |
|    "last_ship_date" VARCHAR(48),                                          |
|    "last_req_date" VARCHAR(48),                                           |
|    "number_of_alarm" BIGINT,                                              |
|    "advance_alarm_days" INTEGER,                                          |
|    "warning" VARCHAR(4000),                                               |
|    "changed_flag" VARCHAR(36),                                            |
|    "last_schedule_date" VARCHAR(48),                                      |
|    "part_family" TEXT,                                                    |
|    "factory_id" TEXT,                                                     |
|    "first_version_release_time" TEXT,                                     |
|    "latest_version_release_time" TEXT,                                    |
|    "uom" TEXT,                                                            |
|    "qty_re_expect_stock_sg" INTEGER,                                      |
|    "replace_set_group" VARCHAR(1000),                                     |
|    "new_qty_delivery_limit_stock" BIGINT,                                 |
|    PRIMARY KEY ("id")                                                     |
| )                                                                         |
| CLUSTERED BY ("id") INTO 4 SHARDS                                         |
| WITH (                                                                    |
|    "allocation.max_retries" = 5,                                          |
|    "blocks.metadata" = false,                                             |
|    "blocks.read" = false,                                                 |
|    "blocks.read_only" = false,                                            |
|    "blocks.read_only_allow_delete" = false,                               |
|    "blocks.write" = false,                                                |
|    codec = 'default',                                                     |
|    column_policy = 'strict',                                              |
|    "mapping.total_fields.limit" = 1000,                                   |
|    max_ngram_diff = 1,                                                    |
|    max_shingle_diff = 3,                                                  |
|    number_of_replicas = '0-1',                                            |
|    "routing.allocation.enable" = 'all',                                   |
|    "routing.allocation.total_shards_per_node" = -1,                       |
|    "store.type" = 'fs',                                                   |
|    "translog.durability" = 'REQUEST',                                     |
|    "translog.flush_threshold_size" = 536870912,                           |
|    "translog.sync_interval" = 5000,                                       |
|    "unassigned.node_left.delayed_timeout" = 60000,                        |
|    "write.wait_for_active_shards" = '1'                                   |
| )                                                                         |
+---------------------------------------------------------------------------+```

How many records are in the table where the value is not null?

SELECT COUNT(*)
FROM pgdb_scm.pms_pms_short_of_material_log_l
WHERE qty_delivery_limit_stock is not null;

An update across 200 Mio records needs to rewrite / reindex all 200 Mio records. Depending on the cluster size this might take a while

Our requirement is modify the type of column qty_delivery_limit_stock, while I have to add a new column new_qty_delivery_limit_stock to replace qty_delivery_limit_stock since CrateDB doesn’t support modifing column types.

cr> SELECT COUNT(*)
    FROM pgdb_scm.pms_pms_short_of_material_log_l
    WHERE qty_delivery_limit_stock is not null;
+-----------+
|  count(*) |
+-----------+
| 298154844 |
+-----------+

On this case I would probably:

  • upgrade to one of the latest versions
  • review the overload protection settings
  • and break this down into batches of id values, for instance:
INSERT INTO pgdb_scm.pms_pms_short_of_material_log_l (id,new_qty_delivery_limit_stock)
SELECT id,qty_delivery_limit_stock
FROM pgdb_scm.pms_pms_short_of_material_log_l
WHERE pms_pms_short_of_material_log_l.id like '1%'
ON CONFLICT (ID) 
DO UPDATE SET new_qty_delivery_limit_stock=qty_delivery_limit_stock;