I learned a lot about how not just superior technology but also better operations, marketing, sales, and culture are critical to a successful business.
The only con of this book is that it skips over some parts of Nvidia's history, like the short-lived crypto boom and the failed acquisition of ARM. Still, that's a minor flaw in an otherwise great book.
Snow Crash in particular uses the metaverse mostly as an excuse to include sword fights and motorcycle/monorail chase scenes. Fantastically fun, in my opinion! But that motivates all kinds of choices (making the Internet a literal place with a street grid and real estate, where people can get chopped up by swords) that real XR tech has no particular technical use for. And I would implore any engineers using Snow Crash as inspiration to consider how absurd it would be to take all the world's most sophisticated technology and dedicate it to gratifying personal power fantasies. It starts with the almost pornographic display of advanced weaponry and logistics deployed to deliver a pizza, and it just gets more gloriously ridiculous from there. The main character is named "Hiro Protagonist". The main antagonist has a nuke strapped to his motorcycle. Take a hint!
Anyway, I am happy Diamond Age gets a call-out, because it is by far my favorite of Stephenson's novels. And I think the Young Lady's Illustrated Primer is one of the all-time most interesting technological plot contrivances (the imago machine, the game/civilization of Azad, and the Shrike all providing strong competition). But the technical constraints/capabilities of the Primer have almost nothing to do with realistic limitations/advantages of AI technology, and everything to do with getting the right characters into the right places at the right times. We need a Miranda to provide the Primer's voice so that Nell can have some kind of human connection in the end, and we need Miranda to be paid anonymously so that Nell won't get that human connection too soon. The Primer is a language model, not a robot, so that Nell will have to solve problems on her own. Yet she can learn kung fu from a language model because we need a few action scenes. I think the really interesting question posed is "Can a person grow up to be influential given no resources except a perfect education?", not so much "Can a language model provide a perfect education?". Many characters in Diamond Age seem to agree with the former notion, but in the end it (SPOILER) gets shot down when Nell needs control of a literal army to come out on top.
The purpose of sci-fi, IMO, is more to:
1. Provide an entertaining story/narrative with technology as the main focus of the world and characters' actions
2. Define a set of concepts to help you think about technology and its possible effects on humans and the world
3. Nudge people to think about what kind of future they would want or not want and how they can use or control technology to achieve that
Here's Ken Liu talking about the purpose of sci-fi: https://www.youtube.com/watch?v=5knkpmxXu-k
1. Definitely training data (for me); we explored about 10 different directions before settling on the current approach. It's easy to underestimate the effect of training data on the quality of the model. The starting point was the benchmark dataset, though, which we assembled manually (to avoid data pollution, and also because there was simply no text2sql benchmark that covers anything other than plain old SQL SELECT statements with a handful of aggregate functions). Training is also not a one-off thing: with large datasets it is hard to evaluate the quality of the dataset without actually training a few epochs on it and running the benchmark.
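To make the "running the benchmark" step concrete, here is a minimal, purely illustrative sketch (not the actual evaluation harness, which isn't shown in this thread) of scoring a model's predicted SQL against hand-written references with a normalized exact match:

```python
# Illustrative text2sql benchmark scoring: compare predicted SQL to a
# reference query after light normalization (case, whitespace, trailing ";").
import re

def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and strip a trailing semicolon."""
    sql = re.sub(r"\s+", " ", sql.strip().lower())
    return sql.rstrip(";").strip()

def exact_match_score(predictions, references) -> float:
    """Fraction of predictions that match their reference after normalization."""
    hits = sum(
        normalize_sql(p) == normalize_sql(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)

preds = ["SELECT * FROM rideshare;", "SUMMARIZE  taxi"]
refs = ["select * from rideshare", "summarize taxi;"]
print(exact_match_score(preds, refs))  # 1.0
```

Real harnesses usually go further (e.g. executing both queries and comparing result sets, since many different strings are semantically the same query), but the train-a-few-epochs-then-score loop is the same shape.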
2. I left a comment about my view on where such models are effective in a previous comment: https://news.ycombinator.com/item?id=39133155
3. No way - I see a common stack emerging (take a look at companies like https://vanna.ai/, https://www.dataherald.com/, or https://www.waii.ai) that is mainly centered around foundation models like GPT-4 with strong in-context learning capabilities (that's pretty much a must to make these approaches work, and it comes with long inference times and higher costs). These solutions build things like embedding-based schema filtering, options for users to enrich metadata about tables and columns, and inclusion of previous related queries in the context around the model. I'd say it's a bit of a different problem from what we aimed at solving.
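For readers unfamiliar with the "embedding-based schema filtering" idea mentioned above, here is a toy Python sketch. Everything in it is illustrative: the table names/columns are made up, and the `embed` stand-in is a plain word-count vector where a real system would call an embedding model. The point is only the shape of the technique: rank tables by similarity to the question and keep the top few, so the LLM prompt stays small.

```python
# Toy embedding-based schema filtering: rank tables by similarity to the
# user's question and keep only the top_k for the prompt.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": a word-count vector. A real system would call
    # an embedding model here instead.
    return Counter(text.lower().replace("_", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def filter_schema(question: str, tables: dict, top_k: int = 2) -> list:
    """Return the top_k table names most similar to the question."""
    q = embed(question)
    scored = sorted(
        tables,
        key=lambda name: cosine(q, embed(name + " " + " ".join(tables[name]))),
        reverse=True,
    )
    return scored[:top_k]

tables = {  # hypothetical schema for illustration
    "taxi": ["trip_distance", "fare_amount", "passenger_count"],
    "rideshare": ["hvfhs_license_num", "driver_pay"],
    "weather": ["temp", "precipitation"],
}
print(filter_schema("average fare amount per trip in taxi data", tables, top_k=1))
# ['taxi']
```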
In that sense, I emphasized in our blog post that users should think of it as a documentation oracle that always gives you the exact DuckDB SQL snippet you are looking for, which is a tremendous time-saver if you have an abstract idea of the query you want to write but are just not sure about the syntax, especially with DuckDB having so many functions and SQL extensions.
Here are a few examples:
- create tmp table from test.csv
- load aws credentials from 'test' profile
- get max of all columns in rideshare table
- show query plan with runtimes for 'SELECT * FROM rideshare'
- cast hvfhs_license_num column to int
- get all columns ending with _amount from taxi table
- show summary statistics of rideshare table
- get a 10% reservoir sample of rideshare table
- get length of drivers array in taxi table
- get violation_type field from other_violations json column in taxi table
- get passenger count, trip distance and fare amount from taxi table and order by all of them
- list all tables in current database
- get all databases starting with test_
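For a sense of what queries like these look like in DuckDB, here are hand-written answers for a few of the prompts above, based on my reading of the DuckDB documentation (not actual model output, so treat them as a sketch):

```sql
-- create tmp table from test.csv
CREATE TEMP TABLE tmp AS SELECT * FROM 'test.csv';

-- show summary statistics of rideshare table
SUMMARIZE rideshare;

-- get a 10% reservoir sample of rideshare table
SELECT * FROM rideshare USING SAMPLE 10% (reservoir);

-- get all columns ending with _amount from taxi table
SELECT COLUMNS('.*_amount$') FROM taxi;

-- show query plan with runtimes for 'SELECT * FROM rideshare'
EXPLAIN ANALYZE SELECT * FROM rideshare;
```

Several of these (SUMMARIZE, the SAMPLE clause, the COLUMNS expression) are DuckDB-specific extensions, which is exactly why a syntax oracle helps here.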
[edit: fixed list formatting]