bertails (u/bertails)

bertails commented on Most RESTful APIs aren't really RESTful florian-kraemer.net//soft... · Posted by u/BerislavLopac

bertails · 2 months ago

"REST" is our industry's most successful collective delusion: everyone knows it's wrong, everyone knows we're using it wrong, and somehow that works better than being right.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

b0a04gl · 2 months ago

got it, thanks. makes sense that it depends on the projection target. SHACL+SPARQL seems like the strongest runtime check layer then. for projections like graphql or java where enforcement is weaker, is there any way to inject runtime guards or contract tests as part of the generated code? or is the idea to keep enforcement external and just let uda define the schema canonically?

bertails · 2 months ago

We have a few runtime checks in the Java projections, but they are still tied to SHACL/SPARQL through the Jena library. We are exploring ways to keep using SHACL at scale for advanced data profiling, and are also trying to identify a subset of SHACL constraints that could compile down to SQL constraints for validating directly against the data warehouse.
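To make the "subset of SHACL constraints compiled down to SQL constraints" idea concrete, here is a minimal sketch. It is not UDA's implementation: the dictionary format is a hypothetical, simplified stand-in for SHACL property shapes (keys named after `sh:minCount`, `sh:minInclusive`, `sh:maxInclusive`, `sh:in`), and the `movie` table, its columns, and every function name are invented for illustration. SQLite stands in for the warehouse.

```python
import sqlite3

def shape_to_column_sql(column, sql_type, shape):
    """Translate one simplified (hypothetical) property shape into a SQL column definition."""
    parts = [column, sql_type]
    # sh:minCount >= 1 means the value is required.
    if shape.get("minCount", 0) >= 1:
        parts.append("NOT NULL")
    checks = []
    # sh:minInclusive / sh:maxInclusive become range checks.
    if "minInclusive" in shape:
        checks.append(f"{column} >= {int(shape['minInclusive'])}")
    if "maxInclusive" in shape:
        checks.append(f"{column} <= {int(shape['maxInclusive'])}")
    # sh:in becomes an IN list; single quotes are escaped by doubling.
    if "in" in shape:
        allowed = ", ".join("'" + v.replace("'", "''") + "'" for v in shape["in"])
        checks.append(f"{column} IN ({allowed})")
    if checks:
        parts.append("CHECK (" + " AND ".join(checks) + ")")
    return " ".join(parts)

def shapes_to_create_table(table, columns):
    """Build a CREATE TABLE statement from (column, sql_type, shape) triples."""
    cols = ",\n  ".join(shape_to_column_sql(n, t, s) for n, t, s in columns)
    return f"CREATE TABLE {table} (\n  {cols}\n)"

# Illustrative model only; table and column names are assumed to be trusted identifiers.
MOVIE_COLUMNS = [
    ("title", "TEXT", {"minCount": 1}),
    ("runtime_minutes", "INTEGER", {"minInclusive": 1, "maxInclusive": 1000}),
    ("kind", "TEXT", {"minCount": 1, "in": ["feature", "short"]}),
]

def accepts(row):
    """Return True if the row satisfies the generated constraints."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(shapes_to_create_table("movie", MOVIE_COLUMNS))
        conn.execute(
            "INSERT INTO movie (title, runtime_minutes, kind) VALUES (?, ?, ?)", row
        )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite for NOT NULL and CHECK violations.
        return False
    finally:
        conn.close()
```

Only constraints with a direct column-level equivalent are handled here; anything that needs graph traversal (for example SPARQL-based constraints) would presumably stay on the SHACL side.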

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

borromakot · 2 months ago

The concept is you model the core of your application and build it at the same time, using declarative tools, and project additional layers from this definition. The underlying data model is extendable via, well, extensions. These extend the DSL schema.

It's not conceptually a knowledge graph in the same way, but you can introspect essentially everything about your application. However, resources can be given data layers which define how they map to underlying storage, and you could use all of this information only as static information to derive additional things from, or you could just...well, use it, i.e. `Ash.read(Resource)` yielding the table data. Our query engine has the same semantics they describe where you don't explicitly join etc.

```elixir
MyApp.Post
|> Ash.Query.filter(author.type == :admin)
|> Ash.read!()
```

You can generate charts and graphs, including things like policy flow charts.

---

Ultimately I've found that modeling tools like UML that can't simultaneously actually execute that model (i.e. act as the application itself) are always insufficient and/or have massive impedance mismatches once rubber meets the road. The point is to effectively reimagine this as "what if we use these modeling principles, declaratively, from the ground up".

bertails · 2 months ago

It is important in UDA for the data models to be part of the same knowledge graph as the data container representations and the mappings, and eventually the instance data too. Our metamodel Upper is strongly inspired by RDFS, SHACL, and OWL in that respect.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

rorylaitila · 2 months ago

The conflict will definitely help define the terms. Maybe they will all choose "Movie", maybe not. There is just no universally ideal term that represents a concept for all users for all time. It's a common error to seek such universal definitions.

bertails · 2 months ago

Exactly. In UDA, each Movie entity belongs to a specific business domain. Universality isn't an inherent truth, it's a social alignment within a group, useful only to the extent that it helps solve shared problems.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

borromakot · 2 months ago

https://ash-hq.org

> Model your domain, derive the rest

Been doing this for 5+ years.

bertails · 2 months ago

This does look interesting. Does the Ash Framework yield a knowledge graph? How good is it at cataloging existing data containers?

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

rorylaitila · 2 months ago

Good luck. This is not new. Back in the Enterprise OOP era, there was a fad of developing universal data entities. Everyone eventually learned that there is no such thing as a universal entity. The semantic meaning of the data model depends on the user context, not the producer context. A "Movie" is not the same thing to the Finance team, Acquisition team, Infrastructure team, or Customer. There is not even always a common identifier, let alone common fields, let alone common meaning of the fields.

Edit: The more I read this article the more I hear this voice https://www.youtube.com/watch?v=y8OnoxKotPQ

bertails · 2 months ago

UDA does not believe in the existence of universal data entities. We embrace the idea that 2+ teams may have different opinions on how to represent the world. We are focused on the discovery of existing entities across systems and their reusability through extensibility. We believe that automation of the projections will be key for teams to align on defining some entities, where it makes sense.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

alganet · 2 months ago

> ... RDF ... SPARQL ... OWL ...

I want to believe. (really! I think that's hugely underestimated tech).

bertails · 2 months ago

We joke internally that Upper is like "RDF: The Good Parts".

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

cletus · 2 months ago

I realize scale makes everything more difficult but at the end of the day, Netflix is encoding and serving several thousand videos via a CDN. It can't be this hard. There are a few statements in this that gave me pause.

The core problem seems to be development in isolation. Put another way: microservices. This post hints at microservices having complete autonomy over their data storage and developing their own GraphQL models. The first is normal for microservices (but an indictment at the same time). The second is... weird.

The whole point of GraphQL is to create a unified view of something, not to have 23 different versions of "Movie". Attributes are optional. Pull what you need. Common subsets of data can be organized in fragments. If you're not doing that, why are you using GraphQL?

So I worked at Facebook and may be a bit biased here because I encountered a couple of ex-Netflix engineers in my time who basically wanted to throw away FB's internal infrastructure and reinvent Netflix microservices.

Anyway, at FB there is a Video GraphQL object. There aren't 23 or 7 or even 2.

Data storage for most things was via a write-through in-memory graph database called TAO that persisted things to sharded MySQL servers. On top of this, you'd use EntQL to add a bunch of behavior to TAO like permissions, privacy policies, observers and such. And again, there was one Video entity. There were offline data pipelines that would generally process logging data (i.e. outside TAO).

Maybe someone more experienced with microservices can speak to this: does UDA make sense? Is it solving an actual problem? Or just a self-created problem?

bertails · 2 months ago

> The whole point of GraphQL is to create a unified view of something, not to have 23 different versions of "Movie".

GraphQL is great at federating APIs, and is a standardized API protocol. It is not a data modeling language. We actually tried really hard with GraphQL first.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

b0a04gl · 2 months ago

how much of upper is actually enforced at runtime vs just used for schema generation? like if a downstream system silently breaks a semantic assumption (say, infers enum incorrectly or drops a type constraint), does uda catch that anywhere or is this trust-based across projections?

bertails · 2 months ago

Great question. It really depends on the projection. For example, the projections to GraphQL and Java are mostly limited to what can be expressed there. But the projection to SHACL has access to all of SPARQL Constraints, which is what's used for the bootstrapping knowledge graph. We are looking into being able to do more runtime validation for data in the warehouse.

bertails commented on Model Once, Represent Everywhere: UDA (Unified Data Architecture) at Netflix netflixtechblog.com/uda-u... · Posted by u/Bogdanp

twodave · 2 months ago

I wonder how they deal with versioning or breaking changes to the model. One advantage of keeping things more segregated is that when you decide to change a model you can do it in much smaller pieces.

I guess in their world they’d add a new model for whatever they want to change and then phase out use of the old one before removing it.

bertails · 2 months ago

> I wonder how they deal with versioning or breaking changes to the model.

Versioning is permission to break things.

Although it is not implemented in UDA yet, the plan is to embrace the same model as Federated GraphQL, which has proved to work very well for us (think 500+ federated GraphQL schemas). In a nutshell, UDA will actively manage deprecation cycles, as we have the ability to track the consumers of the projected models.
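One way to picture a deprecation cycle driven by consumer tracking is the sketch below. It is an illustration under stated assumptions, not how UDA or Federated GraphQL works internally: `FieldRegistry`, its methods, and the `Movie.title` field and consumer names are all hypothetical. The only idea taken from the comment above is that a field is safe to remove once it has been deprecated and no tracked consumer still reads it.

```python
from dataclasses import dataclass, field

@dataclass
class FieldRegistry:
    """Hypothetical registry of which consumers read which projected fields."""
    consumers: dict = field(default_factory=dict)  # field name -> set of consumer names
    deprecated: set = field(default_factory=set)   # field names marked for removal

    def record_usage(self, field_name, consumer):
        """Note that a consumer reads the given field."""
        self.consumers.setdefault(field_name, set()).add(consumer)

    def drop_usage(self, field_name, consumer):
        """Note that a consumer has migrated away from the field."""
        self.consumers.get(field_name, set()).discard(consumer)

    def deprecate(self, field_name):
        """Mark the field deprecated and return the consumers that still need to migrate."""
        self.deprecated.add(field_name)
        return sorted(self.consumers.get(field_name, set()))

    def can_remove(self, field_name):
        """A field is removable only if it is deprecated and has no remaining consumers."""
        return field_name in self.deprecated and not self.consumers.get(field_name)
```

A typical cycle under this sketch: call `deprecate("Movie.title")` to get the list of teams to notify, call `drop_usage` as each one migrates, and remove the field once `can_remove("Movie.title")` returns `True`.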