How to aggregate Children to Parents

Hi, I’m trying to aggregate parent → children tables,

for example we could understand that rows from sys.Snapshot are children of sys.Repositories.

Ideally I’m trying to achieve something like:

repositories: [...[repo1, snapshots: [snapshot1, snapshot2]], ]

[
   [
     "myrepo1", {“location”:“/data/backup”}, "fs", [
        {"state": "Successful", "name": "snap1"}, 
        {"state": "Successful", "name": "snap2"}
       ] 
   ],
   [
     "myrepo2": {“location”:“/data/backup/repo2”}, "fs", []
   ]
]

Trying different stuff this is the closest thing I’d get:

Query ran at 2023-11-21T13:57:39.498Z on CrateDB 5.5.0
SELECT
  repo.*,
  (
    SELECT
      array_agg(snap.name) -- <-- Can only be given one parameter
    FROM
      sys.snapshots snap
    WHERE
      repo.name = snap.repository
  ) as snapshots
FROM
  sys.repositories repo
QUERY OK, 2 record(s) returned in 0.0016s
name settings type snapshots
myrepo2 {“location”:“/data/backup/repo2”} fs
myrepo1 {“location”:“/data/backup”} fs snap2,snap1

Any tips on how I could achieve this?

Not quite sure what exactly you are trying to do, but you could …

use a JOIN

WITH snap_repos AS (
  SELECT r.*, s.*
  FROM sys.repositories r
  LEFT JOIN sys.snapshots s ON r.name = s.repository
)
SELECT *
FROM snap_repos

and/or us OJBECTS in array_agg

-- ...
SELECT
      array_agg({name = snap.name, tables = snap.tables}) -- <-- Can only be given one parameter
-- ...
1 Like

Maybe something like this?

WITH snaps AS (
  SELECT
    "repository",
    ARRAY_AGG({"state" = "state", "name" = "name"}) snapshots
  FROM sys.snapshots
  GROUP BY "repository"
), repos AS (
  SELECT
    "name", 
    "type",
    "settings"
  FROM sys.repositories
)
SELECT
  ARRAY_AGG({
    "name" = "name",
    "type" = "type",
    "settings" = "settings",
    "snapshots" = "snapshots"
    }) as repos
FROM repos
LEFT JOIN snaps
ON repos."name" = snaps."repository";
1 Like

Thank you very much, with your examples I was able to get what I want, I didn’t know array_agg could be used like that, I couldn’t find many examples of it.

SELECT
  repo.*,
  (
    SELECT 
      ARRAY_AGG({
        "name" = snaps.name,
        "concrete_indices" = snaps.concrete_indices
         -- ... more fields
      })
    FROM
      sys.snapshots snaps
    WHERE
      repo.name = snaps.repository
  ) as snapshots
FROM
  sys.repositories repo

I wanted every row to be a repository, and inside every row a ‘snapshots’ array field with every snapshot object