Is there any better way to search on the table, if you want to search on the basis of a column whose type is Array.
WITH tag_sums AS (
SELECT
value,
SUM(amount) AS total_amount
FROM (
SELECT
unnest(tags) AS value,
amount
FROM
ezetap_demo.txn
) subquery
WHERE
value IN ('10460', '10461', '5284', '5285','1','2','3','1111176')
GROUP BY
value
)
select * from tag_sums order by total_amount ;
Issues: Current query iterates over dataset who size is aroud 8k rows. If i increase the partition date range , the query is taking too much and CPU usage hits 100 %. How can I resolve this issue and please suggest a better way to use WHERE clause on a column whose type is ARRAY( without using unnest).
In the below code, if i want to group them by tags , it is taking huge time( 4-5sec) for 70K records. If I do not group them it is done in around 1 sec. Is there any way , to group them based on the value present in the array? Also, when I am grouping them , Let us say that array contains 3 elements: 102,103 ,104 and the array with which I am comparing contains [102,109,110,111,112,113]. As seen, only 102 is common. So once grouping , only 102 should come. However, when we are grouping 102,103,104 are coming in the results.But it should only show 102 . PFA query for your refernce.
WITH tag_sums AS (
SELECT
SUM(amount) AS total_amount,
value
FROM (
SELECT
unnest(tags) AS value,
amount
FROM(
select tags,amount from ezetap_demo.txn where createdts_generated > '2023-07-01' and createdts_generated < '2023-10-17' and org_code = 'ARCHANAMERCHANT' and partition_date >= '2307'
and partition_date <= '2310' and (1<>ARRAY_LENGTH(ARRAY_DIFFERENCE(['10629'],tags),1))
) AS tags_table
) subquery
GROUP BY
value
)
select * from tag_sums;
Please suggest a better alternative to group by tag’s element in array and also to resolve the bug in presenting the array element multiple times.
Hi,
You are getting these additional records because the query is missing one WHERE clause. As mentioned in my previous comment, you need one WHERE clause to lookup the records that have arrays that have one of the values you are interested in, and after unnesting you need another WHERE clause to only pick up the entries you are actually interested in, as the array could have other entries.
I also think the additional step of the tags_table subquery, which was not present in the original example is not really necessary and you could try
removing it.
One more thing you may want to consider is to restructure the table to have the tags in an array of objects:
CREATE TABLE ezetap_demo.txn (tags ARRAY(OBJECT(STRICT) AS (tag int)),amount int);
INSERT INTO ezetap_demo.txn VALUES
( ['{"tag":5}','{"tag":10460}','{"tag":6}'] ,1 )
,( ['{"tag":5}','{"tag":7}','{"tag":6}'] ,1 )
,( ['{"tag":8888}','{"tag":5284}'] ,1 );
WITH tag_sums AS (
SELECT
value,
SUM(amount) AS total_amount
FROM (
SELECT
unnest(tags)['tag'] AS value,
amount
FROM
ezetap_demo.txn
WHERE 10460=ANY(tags['tag']) OR 10461=ANY(tags['tag'])
OR 5284=ANY(tags['tag']) OR 5285=ANY(tags['tag'])
OR 1=ANY(tags['tag']) OR 2=ANY(tags['tag'])
OR 3=ANY(tags['tag']) OR 1111176=ANY(tags['tag'])
) subquery
WHERE
value IN (10460,10461,5284,5285,1,2,3,1111176)
GROUP BY
value
)
select * from tag_sums order by total_amount ;
cr> select count(*),substr(array_to_string(tags,''),0,14) as temp from ezetap.txn where id>'230801' and org_code='ATOS_PERFMERC_01' and ('AAAA-AAAD-AAAE'=ANY (tags) OR 'AAAA-AAAD-AAAF'=ANY(tags) OR 'AA
AA-AAAD-AAAG' =ANY(tags) OR 'AAAA-AAAD-AAAH'=ANY(tags) OR 'AAAA-AAAD-AAAI'=ANY(tags) OR 'AAAA-AAAD-AAAJ' =ANY(tags) OR 'AAAA-AAAD-AAAK'=ANY(tags) OR 'AAAA-AAAD-AAAL'=ANY(tags) OR 'AAAA-AAAD-AAAM'=AN
Y(tags) OR 'AAAA-AAAD-AAAN'=ANY(tags) OR 'AAAA-AAAD-AAAO' =ANY(tags) OR 'AAAA-AAAD-AAAP' =ANY(tags) OR 'AAAA-AAAD-AAAQ'=ANY(tags) OR 'AAAA-AAAD-AAAR' =ANY(tags) OR 'AAAA-AAAD-AAAS'=ANY(tags)) and pa
rtition_date >= '2308' and partition_date <= '2310' group by temp;
+----------+----------------+
As you can see , we are trying to optimize the above query. Is there any way we can turn Array into substring? ALso, is there any way where we can remove the OR ( many ORs are there) operator?
NOTE: the array elements size will be one.
Any suggestion if the above restrictions are there?
Please suggest
Do you mean each element in the array in this case is one character?
If that is the case I would recommend just using a TEXT field instead of an ARRAY.