with theTitle as (
from title.parquet
where tconst = 'tt3890160'
),
principals as (
select array_agg({id:principal.nconst,name:primaryName,category:category})
from principal.parquet, person.parquet
where principal.tconst = (from theTitle select tconst)
and person.nconst = principal.nconst
),
characters as (
select array_agg(c.character) as characters, p.u.name
from principal_character.parquet c
join (select unnest((from principals)) as u) p
on c.character is not null and u.id=c.nconst and c.tconst=(select tconst from theTitle)
group by p.u
)
select {
title: (select primaryTitle from theTitle),
director: list_transform(
list_filter((from principals), lambda elem: elem.category='director'),
lambda elem: elem.name),
writer: list_transform(
list_filter((from principals), lambda elem: elem.category='writer'),
lambda elem: elem.name),
genres: (select genres from theTitle),
characters: (select array_agg({name:name,characters:characters}) from characters),
} as result
And if you query typeof on the result, you'll get: STRUCT(
title VARCHAR,
director VARCHAR[],
writer VARCHAR[],
genres VARCHAR,
characters STRUCT(
"name" VARCHAR,
characters VARCHAR[]
)[]
)

Decoding sum types into Go interface values is obviously tricky stuff, but it gets even harder when you have recursive data structures as in an abstract syntax tree (AST). The article doesn't address this. Since there wasn't anything out there to do this, we built a little package called "unpack" as part of the SuperDB project.
The package is here...
https://github.com/brimdata/super/blob/main/pkg/unpack/refle...
and an example use in SuperDB is here...
https://github.com/brimdata/super/blob/main/compiler/ast/unp...
Sorry it's not very well documented, but once we got it working, we found the approach quite powerful and easy.
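To make the problem concrete: the standard trick is to probe a discriminator field first, then decode into the matching concrete type, deferring recursive children with json.RawMessage. This is a hand-rolled sketch of that general technique, not the unpack package's actual API; the names (Expr, Literal, BinaryExpr, unpackExpr, the "kind" field) are all illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Expr is the sum type: every AST node implements it.
type Expr interface{ isExpr() }

type Literal struct {
	Value int `json:"value"`
}

type BinaryExpr struct {
	Op  string          `json:"op"`
	LHS json.RawMessage `json:"lhs"` // deferred; decoded recursively below
	RHS json.RawMessage `json:"rhs"`

	Left, Right Expr `json:"-"` // filled in by unpackExpr
}

func (Literal) isExpr()     {}
func (*BinaryExpr) isExpr() {}

// unpackExpr peeks at the "kind" discriminator, decodes into the
// concrete type, then recurses into the raw children.
func unpackExpr(raw json.RawMessage) (Expr, error) {
	var probe struct {
		Kind string `json:"kind"`
	}
	if err := json.Unmarshal(raw, &probe); err != nil {
		return nil, err
	}
	switch probe.Kind {
	case "literal":
		var lit Literal
		if err := json.Unmarshal(raw, &lit); err != nil {
			return nil, err
		}
		return lit, nil
	case "binary":
		var bin BinaryExpr
		if err := json.Unmarshal(raw, &bin); err != nil {
			return nil, err
		}
		var err error
		if bin.Left, err = unpackExpr(bin.LHS); err != nil {
			return nil, err
		}
		if bin.Right, err = unpackExpr(bin.RHS); err != nil {
			return nil, err
		}
		return &bin, nil
	default:
		return nil, fmt.Errorf("unknown kind %q", probe.Kind)
	}
}

// eval walks the decoded tree, just to show the interface values work.
func eval(e Expr) int {
	switch n := e.(type) {
	case Literal:
		return n.Value
	case *BinaryExpr:
		l, r := eval(n.Left), eval(n.Right)
		if n.Op == "+" {
			return l + r
		}
		return l * r
	}
	return 0
}

func main() {
	src := `{"kind":"binary","op":"+",
	  "lhs":{"kind":"literal","value":1},
	  "rhs":{"kind":"binary","op":"*",
	    "lhs":{"kind":"literal","value":2},
	    "rhs":{"kind":"literal","value":3}}}`
	e, err := unpackExpr([]byte(src))
	if err != nil {
		panic(err)
	}
	fmt.Println(eval(e)) // 1 + 2*3 = 7
}
```

The recursion is what the reflection in unpack automates: instead of writing a switch per interface type, it derives the dispatch from registered node types.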
"We thought of that" vs "we built it and made it work".
It's not a good advertisement when the error message is inadequate even in the supplied example and you have to hack around it.
error({
stage: "transform",
err: "input error",
value: {
stage: "normalize",
err: "input error",
value: {
stage: "metrics",
err: "divide by zero",
value: {
sum: 123.5,
n: 0
}
}
}
})
... and you can quickly deduce that your "metrics" stage is dividing by "n" even when n is 0, so you can fix up your logic, and also fix the errors in place after fixing the bug in the ingest pipeline.

One question: the blog post basically covers debugging the data-ingestion part. My quite common issue with older data is that at some point you discover a problem with it (say it's slightly wrong, but not too badly), so you want to somehow let users know, or let them select only the data without the issue (while still telling them how much of it they're missing). Is this framework helpful in that situation?