They timed their marketing copy very poorly. Virtually everyone is talking about how long it will be before ChatGPT renders their jobs obsolete, and these guys are saying “AI’s impact has been too limited” and that their tech is going to help it “really” take off.
> The world needs a new approach to enable the next wave of AI innovation
Maybe you could have said that this time last year, but it's kinda silly to say when we're only a few months into unprecedented mainstream AI innovation that keeps getting one-upped every other week.
We are still without autonomous cars, global food security, accessible preventative medicine, reliable drug discovery, or robotic automation at home, among so much more.
Why do people always bring up this stuff? Solutionism… Sure, it will solve problems, but it's going to create a lot of issues too. Imagine the mass surveillance, robotic enforcement against dissidents, etc.
People already know what needs to be done to solve a lot of these problems we have, it’s called sharing and compassion. It’s not robotics.
I’m serious: the likely outcome of all this tech is going to be the Matrix, so people can imagine their egos getting as big as they like while being drip-fed vitamins. One can only hope, I guess.
Objectively it's a failure, since no one programs deep learning applications in Swift for TensorFlow, but it's cool to see that parts of it still live on.
I don't think a pre-launch marketing page has ever survived the gauntlet of the HN front page without plenty of critics. Anything more specific you don't like?
I personally liked how the video showed real user complaints about the AI/data programming stacks via screenshots of Twitter/Reddit. A lot of startups fail to connect what they're doing to real-life pain points felt by real people. Otherwise the video communicates the gist better than the website copy, which is more generic.
A lot of people tend to dump grand marketing-speak on landing pages when it should look more like their pitch decks, basically:
- here's the problem
- here's how we hope to solve it
- here's the basic plan/timeline to accomplish it
These guys are hyping up some launch event, so maybe they're waiting for that, but I usually don't hold out much hope for marketing pages.
I see a lot of negativity about a product that isn't even clear yet, which is a shame. There are a lot of problems with AI development today, and if Modular fixes even a slice of them, that's pretty great. I have some skepticism too, but in the spirit of optimism, here's a list of things that make AI development difficult. Maybe they will fix some:
1. The iteration cycles are really slow. Waiting for models to train is O(days) for the most common models that can still do something useful.
2. It's very hard to predict whether a change will make your model better or worse. You typically have to try a bunch of things out the whole way through. Even a 10% idea success rate is pretty good, but it takes a lot of time and GPUs to find out which ideas do work.
3. Programming a training pipeline is really error-prone. The most common error is that your tensors have mismatched shapes, and even the bolt-on static type systems don't help you prevent it.
4. It's true that lock-in is pretty bad. If NVidia had real competition then maybe prices would come crashing down.
5. If the set of primitives in PyTorch or TensorFlow works for you, then great! But if you need new primitives, then you're crossing the Python-C++ boundary, learning CUDA, and also grappling with build systems (either praying that Python will find CUDA, or learning more than you ever wanted to know about Bazel).
6. All the data prep and final analysis tends to need specialized scripts and pipelines. All these tools are written by people who just want to get back to the model, and are run infrequently. You're lucky if the scripts still work today, and especially lucky if there are any tests at all.
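To illustrate point 3: a type checker only sees `np.ndarray`, not its shape, so a mismatch surfaces only at runtime. A minimal sketch using NumPy (the function and shapes are made up for illustration):

```python
import numpy as np

def linear(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    # A static type checker accepts any ndarray here; the shape
    # contract -- x: (batch, d_in), w: (d_in, d_out) -- is invisible to it.
    return x @ w

x = np.ones((32, 128))  # batch of 32, feature dim 128
w = np.ones((64, 10))   # wrong: the inner dimension should be 128

try:
    linear(x, w)
except ValueError as e:
    # The bug is only caught here, at runtime, possibly hours into a run.
    print(f"shape mismatch: {e}")
```

The annotations pass any checker; only the `@` operator notices that 128 != 64, and only once data is flowing.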
Unified, layered architecture. Good luck with that. They will have a hard time convincing anyone to implement their interfaces, which means they will have to build all the modules themselves. Somehow I don't believe in this idea...
Maybe that's the point? Their blog post didn't explain exactly how they plan to deal with this issue. As others mentioned when their blog post was on HN, it was missing the 'conclusion', which is maybe what they're hyping up for their launch. But otherwise it seems that, at a minimum, they're aware of this problem.
> But as you can see, solving this problem is not easy. Diversity in hardware, models, and data means that every existing solution on the market is only just a “point solution” to a much broader problem.
It looks like they took a profiler to the training process of today's largest AI models.
Firstly, it'll be a universal low-level software layer that can run on top of any cloud hardware, and that can then be developed against to enable maximal cloud reach when training models.
This sort of virtualisation might be considered slower than direct access, but they've also created a DSL on top of Python, which looks to enable the compiler to make smarter decisions about how to allocate memory and compute during training. So the two together presumably produce a speedup worthy of the hype.
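Neither post confirms how the DSL actually works, so here's a purely hypothetical sketch of the general technique: instead of observing eager op-by-op execution, a decorator hands whole functions to a compiler layer that can plan memory and fuse operations before anything runs. The `staged` decorator and `compile_queue` are invented names for illustration:

```python
# Hypothetical sketch only -- nothing here comes from Modular's posts.
# The idea: a compiler sees whole functions ahead of time, rather than
# watching eager op-by-op execution as plain Python would allow.
compile_queue = []

def staged(fn):
    # Register the function so an ahead-of-time optimizer could inspect
    # it (e.g. to fuse element-wise ops or plan allocations once).
    compile_queue.append(fn)
    return fn

@staged
def sgd_step(weights, grads, lr):
    # Seen as a whole, these element-wise updates could be fused into
    # a single pass over memory instead of many small ones.
    return [w - lr * g for w, g in zip(weights, grads)]

print(sgd_step([3.0, 5.0], [2.0, 4.0], 0.5))  # -> [2.0, 3.0]
```

This is roughly the trick staged-compilation systems use: capture the program first, optimize it as a unit, then run it.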
Okay.
(A lot of) Modern AI doesn't yet seem to grasp the top down structure of the things that it deals with.
It's become deceptively good at producing bottom up models that preserve local invariants, but it still does this: https://www.bing.com/images/create/hands/64360ff34c944fa98e7...
(And for fuck's sake Bing, give me a shorter link than that.)
Broke: Making the world a better place through the power of friendship
Woke: matmul go brrrrr
Also: https://www.youtube.com/watch?v=B8C5sjjhsso
Interesting. I wonder if Chris Lattner learned from the failure of the Swift for TensorFlow project that he led.
https://news.ycombinator.com/item?id=35279274
I dunno, it's hard to say from just blog posts. Anyway, they have a countdown on the site, so I guess we'll see in about 20 days how things go.
https://www.modular.com/blog/ais-compute-fragmentation-what-...
Kudos to them if they deliver on their promise.