The second problem is distribution: it is already hard enough to get good distribution with software alone, let alone a software + hardware combination. Even large silicon companies have struggled to get their hardware into products across the world. Part of this is down to the purchase dynamics and cycles of the people who buy chips: many design products around specific hardware SKUs and commit to N-year production runs, so you have to both land large deals and have the opportune timing to catch buyers when they are even shopping for a new platform. Furthermore, the players with existing distribution, i.e. the Apples, Googles, Nvidias, Intels, AMDs, and Qualcomms of the world, already have their own offerings in this space and will not partner with or buy from you.
My framing (which has remained unchanged since 2018) is that for a silicon platform to win, you have to beat the incumbents (i.e. Nvidia) on the 3Ps: Price (really TCO), Performance, and Programmability.
A hardware accelerator may win on one of these, but even then the performance is usually theoretical, because it assumes customers' existing software can and will run on your chip, which it often doesn't (see AMD and friends).
There are many other threats in this vein: for example, if you ship a fixed-function accelerator and some part of the model code has to fall back to the CPU, the resulting memory traffic and synchronization can completely negate any performance improvement you might offer.
Even many of the existing silicon startups have been struggling with this since the middle of the last decade; the only thing that saved them was the consolidation around Transformers, and it would take just one new model architecture to force everyone to rework what they have built. This need for flexibility is what gave rise to the design ethos around GPGPU: in a changing field, flexibility is a requirement, not a nice-to-have.
Best of luck, but these things are worth thinking deeply about. When we started in this market we were already aware of many of them, and their importance and gravity in the AI market have only grown, not shrunk :)
Everyone is trimming down their training data based on quality - there are plenty of hints about that in the Llama 3.1 paper and Mistral Large 2 announcement.
OpenAI are licensing data from sources like the Associated Press.
Andrej Karpathy said this: https://twitter.com/karpathy/status/1797313173449764933
> Turns out that LLMs learn a lot better and faster from educational content as well. This is partly because the average Common Crawl article (internet pages) is not of very high value and distracts the training, packing in too much irrelevant information. The average webpage on the internet is so random and terrible it's not even clear how prior LLMs learn anything at all.
This is understood in the academic literature as well; for months/years people have been publishing papers showing that a smaller amount of high-quality data is worth more than a large amount of low-quality data (which tracks with what you pick up in an ML 101 education/training).
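To make the idea concrete, here is a minimal sketch of quality-based filtering over a JSONL corpus. The file names, threshold, and scoring function are hypothetical illustrations for this comment, not details of the Llama 3.1 or Mistral pipelines; real pipelines use learned quality classifiers rather than the toy heuristic below.

```python
# Hypothetical sketch of quality-based pretraining data filtering.
# File names, threshold, and the scoring heuristic are assumptions for illustration.
import json

QUALITY_THRESHOLD = 0.8  # keep only documents scored as "high quality"


def quality_score(text: str) -> float:
    """Stand-in for a learned quality classifier (real pipelines train a model
    on labeled high-quality/educational examples; this is only a toy proxy)."""
    words = text.split()
    if not words:
        return 0.0
    # Crude proxy: repetitive, boilerplate-heavy pages score low.
    return len(set(words)) / len(words)


def filter_corpus(in_path: str, out_path: str) -> None:
    """Stream a JSONL corpus and keep only documents above the threshold."""
    kept = total = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            total += 1
            doc = json.loads(line)
            if quality_score(doc["text"]) >= QUALITY_THRESHOLD:
                dst.write(line)
                kept += 1
    print(f"kept {kept}/{total} documents")


if __name__ == "__main__":
    filter_corpus("common_crawl_sample.jsonl", "filtered.jsonl")
```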
A one-page alphabet reference chart would be enough to remind the reader which letter is which without relying on the romanization crutch.
Normally I don't like to make argumentative internet comments but I really passionately think romanization is a detriment to a learning tool.
Given how business leaders throughout tech feel that AI is going to be transformative, I don't think commitment is really going to be a problem. Many leaders feel that "you either get good at AI or you don't exist in 10 years".
In terms of attracting talent, there are 3 main things top AI folks look for:
1. Money (they are people after all)
2. The infrastructure (both hardware and people/organization-wise) to support large AI projects.
3. The willingness to release these AI projects to a large swath of people (to have "impact" as folks like to say).
E.g. Google had 1 and 2 but their reticence to release their models and corporate infighting made many of the top Google researchers leave for gigs elsewhere. I think it remains to be seen how Apple will handle #3 as well.
Siri is sort of a red herring because it was built by teams and on tech that existed before Apple acquired most of its ML talent, and some of its inability to evolve has been due to internal politics, not an inability to build the tech. iOS 17 is an example of Apple moving toward more deep-learning speech/text work. I would bet heavily that we will see them catch up with well-integrated pieces, as they have the money, the infrastructure, and already the ability to go wide (i.e. all iOS users; again, think Face ID).
- Android, Windows, Linux and macOS can already run local and private models just fine. Getting something product-ready for iPhone is a game of catch-up, and probably a losing battle if Apple insists on making you use their AI assistant over competing options.
- The software side needs more development. The current SOTA inferencing techniques for Apple Silicon in llama.cpp are cobbled together from Metal shaders and NEON instructions, neither of which is ideal or specific to Apple hardware. If Apple wants to differentiate their silicon from 2016 MacBooks running LLaMA with AVX, they have to develop CoreML's API further (a rough sketch of the current Core ML path follows below).
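For reference, the Core ML route that exists today looks roughly like this: trace a PyTorch module and convert it with coremltools, letting Core ML schedule it across CPU/GPU/Neural Engine. The toy model, shapes, and options here are illustrative assumptions only; the gap is that this path doesn't give you the LLM-specific kernels and quantization tricks llama.cpp hand-rolls with Metal and NEON.

```python
# Illustrative sketch: converting a toy PyTorch block to a Core ML package.
# The model, shapes, and output name are assumptions, not a production LLM setup.
import torch
import coremltools as ct

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
)
model.eval()

example = torch.rand(1, 512)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example.shape)],
    compute_units=ct.ComputeUnit.ALL,  # let Core ML pick CPU, GPU, or Neural Engine
    convert_to="mlprogram",
)
mlmodel.save("tiny_block.mlpackage")
```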
Stimulants work for a large number of people diagnosed with ADHD with very few negative effects, and, modulo a few exceptions, they are safe for long-term use.
Some individuals have negative experiences with stimulant medications, but I know from personal experience and from many friends in the ADHD community that stimulants have literally been life-saving for them.
Doctors don't just reach for them because they are out to get you; they prescribe them because they are effective for many people.
Furthermore, many people who choose to forgo medication develop lifestyle and substance-use issues whose negative effects far outweigh those of low-dose stimulants.
As other commenters said, they are just a tool; you still have to work on interventions, behavior modification, and so on.
At the end of the day, ADHD is in many ways a disability (even if sometimes a superpower), and you can't just delete it with a prescription.
Even if you forgo meds, there are many ways to boost your attention and quality of life, and plenty of research on what is effective; treatment can be much more than just medication.
The internet is turning everyone into productivity addicts; you can't just live anymore and be different.
If the author does not bring ADHD into the conversation, I think it's rude to pin a condition on him on the back of his post; the post is far more interesting than isolating a paragraph and pulling a diagnosis out of it.
It's a great blog post.
On this part in particular: while it can be great to follow your passions deeply and with extreme focus, pursuing things regardless of their importance in your overall life, and at the cost of other interests, relationships, or responsibilities, can make for an empty and unfulfilling existence in the end. Furthermore, life can be markedly better with the right interventions and treatment.
People seemingly get offended by even the suggestion, because there is still extreme stigma around conditions like ADHD, along with a lot of misinformation from people who have very little understanding of its actual traits, diagnostic criteria, treatment, and prevalence.