In what way did they "release" this? I can't find it on Hugging Face or Ollama, and they only seem to have a "try online" link in the article. "Self-sovereign intelligence", indeed.
They released it in the same sense OpenAI released GPT-4. There is an online demo you can chat with, and a form to get in touch with sales for API access.
Facebook trained the model on an Internet's worth of copyrighted material without any regard for licenses whatsoever. Even if model weights are copyrightable, which is an open question, you'd be doing the exact same thing they did. Probably not a bulletproof legal defense, though.
Whether you bake the behaviour in or wrap it in an external loop, you need to train/tune for the expected behaviour. Generic models can do chain-of-thought if asked to, but will be worse than a specialised one.
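For what it's worth, the "external loop" variant is just a prompting harness around any generic model. A minimal sketch (the `ask_model` function below is a hypothetical stand-in for whatever LLM API you're calling, not this product's API):

```python
# Chain-of-thought via an external loop: repeatedly prompt a generic model
# for one reasoning step at a time, feeding prior steps back as context,
# then ask for a final answer. `ask_model` is a placeholder for a real
# LLM call (hypothetical, for illustration only).

def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM API call."""
    return f"(model output for: {prompt!r})"

def reason_step_by_step(question: str, max_steps: int = 3) -> list[str]:
    """Collect max_steps reasoning steps, then one final answer."""
    steps: list[str] = []
    for i in range(max_steps):
        context = "\n".join(steps)
        steps.append(ask_model(
            f"Question: {question}\n"
            f"Steps so far:\n{context}\n"
            f"Give reasoning step {i + 1}:"
        ))
    steps.append(ask_model(
        f"Question: {question}\nSteps:\n" + "\n".join(steps) + "\nFinal answer:"
    ))
    return steps
```

A model fine-tuned on this behaviour skips the harness entirely, which is presumably what they did here.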
Given the name they gave it, someone with access should ask it for the “Answer to the Ultimate Question of Life, The Universe, and Everything”
If the answer is anything other than a simple “42”, I will be thoroughly disappointed. (The answer has to be just “42”, not a bunch of text about the Hitchhiker's Guide to the Galaxy and all that.)
DeepThought-8B: 200,000 (based on 2020 census data)
Claude: 300-350,000
Gemini: 2.7M during peak times (strange definition of population!)
I followed up with DeepThought-8B: "what is the population of all of manhattan, and how does that square with only having 200,000 below CP" and it cut off its answer, but in the reasoning box it updated its guess to 400,000 by estimating as a fraction of land area.
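The area-fraction estimate in the reasoning box boils down to simple proportional arithmetic. With rough illustrative figures (both numbers below are my assumptions, not the model's actual output):

```python
# Proportional (land-area-fraction) population estimate, as the reasoning
# box appeared to do. Figures are rough assumptions for illustration.
manhattan_population = 1_600_000    # approx. 2020 census total
fraction_below_central_park = 0.25  # assumed share of Manhattan's land area

estimate = manhattan_population * fraction_below_central_park
print(int(estimate))  # 400000
```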
I asked it "Describe how a device for transportation of living beings would be able to fly while looking like a sphere" and it just never returned an output
The reasoning steps look reasonable and the interface is simple and beautiful, though DeepThought-8B fails to disambiguate "the ruliad", the technical concept from Wolfram's physics project, from this company's name, Ruliad. Maybe that isn't in the training data: when asked "what is the simplest rule of the ruliad?" it misunderstood the question and went on to reason about the company's core principles. Cool release, waiting for the next update.
I found the following video from Sam Witteveen to be a useful introduction to a few of those models:
https://youtu.be/vN8jBxEKkVo
https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blo...
What does (USA) law say about scraping? Does "fair use" play a role?
Isn't it an LLM with an algo wrapper?
ChatGPT-o1-preview: 647,000 (based on 2023 data, breaking it down by community board area): https://chatgpt.com/share/674b3f5b-29c4-8007-b1b6-5e0a4aeaf0... (this appears to be the most correct, judging from census data)