It appears they are re-engineering their product, as they've taken down their sign-up link and their landing page now advertises an upcoming, more traditional spreadsheet product: https://subset.so/
Their docs are still up, however, and have screenshots from their old infinite-canvas spreadsheet product: https://docs.subset.so/
That's because, in part, the good stuff comes from identifying why the wastewater fell short.
You can then try running Stable Diffusion and using it in your apps too.
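If you want to try that, here's a minimal sketch using the Hugging Face diffusers library. It assumes a CUDA GPU and uses one commonly available v1.5 checkpoint; adjust both for your setup.

    # Minimal sketch: generate an image with Stable Diffusion via the
    # Hugging Face diffusers library (pip install diffusers transformers torch).
    # Assumes a CUDA GPU; drop the .to("cuda") line to run (slowly) on CPU.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # one publicly available checkpoint
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    image = pipe("an astronaut riding a horse").images[0]
    image.save("astronaut.png")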
These other answers, while well-meaning, are for building models, not for practically working with AI. They're like being given resources on how compilers work when someone asked how to write "Hello World."
Once you get that magic feeling of having something working, you can always dig into all of the research, like https://course.fast.ai, later.
The essence of programming is data.
If you have function calls without type information, anything can happen: a function call could result in your neighbor's pool being drained and your bank balance being sent to Belize.
In Pascal, you have to have a type before you can declare a variable. This little inconvenience saves you from an entire class of errors.
If I'm going to import a random file from the internet, I have to be sure of the type of data in it before I touch it with a 10-foot pole (or barge pole, in Britain).
I had no idea people got so foolish.
Back to your idea: of course there should be type information, either as a separate file or at the head of the file.
In Pascal, I'd have the import routine check it against the RTTI (run-time type information) of the local native structure as part of the import, and throw errors if there were problems. On export, the RTTI could generate the type header file/section of the JSON.
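Pascal isn't everyone's daily driver, so here's the same idea sketched in Python: declare the expected record shape once, then have the import routine check every field against it. This is hand-rolled for illustration and only handles simple, non-generic field types.

    import json
    from dataclasses import dataclass, fields

    @dataclass
    class User:
        name: str
        age: int
        email: str

    def checked_import(raw: str) -> User:
        """Parse JSON and verify each field against the declared types,
        the way Pascal RTTI would check an imported record."""
        data = json.loads(raw)
        for f in fields(User):
            if f.name not in data:
                raise TypeError(f"missing field: {f.name}")
            # f.type is the annotated type; works for plain types like str/int
            if not isinstance(data[f.name], f.type):
                raise TypeError(f"{f.name}: expected {f.type.__name__}, "
                                f"got {type(data[f.name]).__name__}")
        return User(**data)

    user = checked_import('{"name": "Ada", "age": 36, "email": "ada@example.com"}')
    checked_import('{"name": "Ada", "age": "36", "email": "..."}')  # raises TypeError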
JSON is a serialized JS object, which itself is untyped, so anything can be in any order. Think NoSQL databases. This simplicity gave JSON the ability to be adopted by a multitude of languages very, very quickly. Foolish, maybe, but you could build Stripe on it.
GraphQL, an emerging API standard, does feature schemas, and its rise is bringing types back to APIs. It can be a bit more work to implement than a plain JSON API, though.
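For example, with the graphql-core library in Python you can load a schema and have malformed queries rejected before they ever reach a resolver. A toy sketch:

    from graphql import build_schema, parse, validate  # pip install graphql-core

    # The schema makes the shape of the API explicit, unlike a bare JSON endpoint.
    schema = build_schema("""
        type Query {
            user(id: ID!): User
        }
        type User {
            name: String!
            age: Int!
        }
    """)

    # A query with a typo fails validation up front.
    errors = validate(schema, parse("{ user(id: 1) { nmae } }"))
    for err in errors:
        print(err.message)  # e.g. "Cannot query field 'nmae' on type 'User'."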
It's the classic complex-simple-complex pendulum swing. We're not done, either.
If you think outside the centralized server, I could fairly quickly implement my fragment of a distributed Twitter. It's a matter of declaring a few objects/types, writing code to do CRUD for my locally hosted parts, replicating those to some publicly accessible file host/web page, and then writing an engine to scan all the other sites where the people I follow publish their data (a rough sketch follows the list below).
Two things that can't be replicated:
1. Blocking of users. Once data is public, you don't get it back.
2. Anonymous comments or replies. This would require scanning all replies, even from people you don't follow. It's possible this could be a service from a third-party aggregator.
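Here's that engine as a toy Python sketch. The URLs, the posts.json file name, and the Post fields are all hypothetical, and a real version would need caching, signatures, and so on.

    import json
    import urllib.request
    from dataclasses import dataclass

    @dataclass
    class Post:
        author: str
        timestamp: float  # unix time
        text: str

    # People I follow: wherever each of them publishes their feed (made-up URLs).
    FOLLOWING = [
        "https://alice.example.com/posts.json",
        "https://bob.example.net/feed/posts.json",
    ]

    def fetch_feed(url: str) -> list[Post]:
        """Pull one person's published posts; skip them if unreachable."""
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return [Post(**p) for p in json.load(resp)]
        except OSError:
            return []

    def timeline() -> list[Post]:
        """Merge everyone's posts, newest first -- that's the whole 'engine'."""
        posts = [p for url in FOLLOWING for p in fetch_feed(url)]
        return sorted(posts, key=lambda p: p.timestamp, reverse=True)

    for post in timeline():
        print(f"{post.author}: {post.text}")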
What I'm seeing most of all is a glimmer of what is possible if you don't have to worry about security and can just solve problems. The walled gardens are a result of security issues; the network effects are a result of the small number of walled gardens. If you can tell your computer to do function X with data Y, and there is NO possible way it could get hijacked or confused into doing Z, then this could work.

I agree that data has to be converted, but I don't agree that it has to be as manual as it is today. Consider this: in addition to your API, a second JSON file gets generated that describes the schema of the first. On the consuming developer's end, their language & IDE use that schema to let you call APIs or RPCs exactly as if they were local functions.
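A rough sketch of what that could look like in Python. The schema format and the RemoteAPI class are invented for illustration, not any existing standard; a real client would generate static stubs so the IDE could autocomplete them too.

    import json

    # Hypothetical companion file the API publishes alongside its data:
    # function name -> parameter name -> expected JSON type.
    SCHEMA = json.loads("""
    {
        "get_user":  {"id": "number"},
        "send_post": {"author": "string", "text": "string"}
    }
    """)

    JSON_TYPES = {"number": (int, float), "string": str, "boolean": bool}

    class RemoteAPI:
        """Turns the schema into methods you call like local functions,
        with argument types checked before anything goes over the wire."""
        def __init__(self, schema):
            self._schema = schema

        def __getattr__(self, name):
            params = self._schema.get(name)
            if params is None:
                raise AttributeError(f"no such remote function: {name}")
            def call(**kwargs):
                for arg, expected in params.items():
                    if not isinstance(kwargs.get(arg), JSON_TYPES[expected]):
                        raise TypeError(f"{name}({arg}=...): expected {expected}")
                # a real version would POST json.dumps(kwargs) to the endpoint
                print(f"calling {name} with {kwargs}")
            return call

    api = RemoteAPI(SCHEMA)
    api.send_post(author="me", text="hello")  # ok
    # api.get_user(id="42")                   # raises TypeError: expected number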