pj_mukh · a year ago
Has anyone heard anything from Meta about what would happen if California's SB 1047 passes[1]?

Looking forward to continued updates and releases of Llama (and SAM!) from Meta.

[1] https://www.theverge.com/2024/8/28/24229068/california-sb-10...

malwrar · a year ago
I haven’t heard anything specific from Meta themselves, but I think the bill is short enough that we can reason about it as non-lawyers. Almost certainly they would have to stop releasing LLM weights, given the very specific qualifications in the legislation. I don’t actually know what the specific size limit would be, but based on the dollar-value threshold in the text of the bill it probably would cover their 70B+ models.

Disturbing approach to mitigating AI harms, imo. This bill basically hopes it can limit the number of operators of an arbitrary model type so as to allow easier governance of AI model use. That ignores the reality that we already have large models openly released and easily modifiable (outside CA jurisdiction) which are likely capable of perpetrating “critical harms”, and that the information required to achieve the defined “critical harms” could be obtained by an individual simply by reading a few books. There’s also no reason to assume that future models will require millions of dollars of compute to create; fulfilling the goals of this regulatory philosophy long term almost certainly requires banning general purpose compute to come close to the desired outcome of a supposed reduction in the probability of some “critical harm” being perpetrated. We should be focusing on hardening society against the realities of the “critical harms” identified by this bill, rather than implicitly assuming the only reason we don’t see them as much irl is because everyone is stupid. The current paranoia wave around LLMs is just a symptom of people waking up to the fragility of the world we live in.

ksajadi · a year ago
This is a good one for everything related to SB 1047: https://pca.st/episode/44b41e28-5772-41c4-bcd7-5d7aa48d5120
pama · a year ago
Yoshua Bengio is a very respected scientist with a well-deserved reputation, but this discussion is upsetting… “academia now trains much smaller models… 10^26 FLOPs is 10 to the 26 floating point operations per second.. yes.. how big is that compared to GPT-4? It is much bigger than all the existing ones…” (FLOPs means something different here: there is no “per second” in the law, which counts total operations; a single H100 from last year performs about 1e15 floating point operations per second; Llama 3.1 came close to the 1e26 limit this year, and the total training FLOPs of other models are not published; research could change once compute is even cheaper, but state laws move at glacial speeds…)
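To put the units in perspective, a rough back-of-the-envelope in Python (the GPU throughput, cluster size, and utilization below are ballpark assumptions, not figures from the law or the podcast):

    # How long does 1e26 total FLOPs take? All figures are ballpark assumptions.
    THRESHOLD_FLOPS = 1e26        # the bill counts total operations, not ops/sec
    H100_FLOPS_PER_SEC = 1e15     # ~1 PFLOP/s per H100 at low precision
    NUM_GPUS = 16_000             # assumed cluster size, roughly Llama 3.1 scale
    UTILIZATION = 0.4             # assumed fraction of peak actually sustained

    seconds = THRESHOLD_FLOPS / (H100_FLOPS_PER_SEC * NUM_GPUS * UTILIZATION)
    print(f"~{seconds / 86_400:.0f} days")  # ~181 days on this assumed cluster

So the threshold is a months-long run on a frontier cluster today, while a single H100 would need on the order of 3,000 years.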

It is disheartening to see this much damage capacity in the hands of a couple of paranoid people who perhaps read the wrong scifi and had lots of power to influence others. If California passes this law, in a few years the world economy will be very different.

trissi1996 · a year ago
IMHO it's not. It just parrots the same old arguments for "safety", arguing against straw men and framing the other side as having wrong assumptions about AI safety / being unfair / etc., all while never engaging with the principled counter-arguments or examining its own assumptions.

Here are some counter-points:

Regulation:

- Very little effort is made to evaluate the risks of over-regulating, regulatory capture, and counterproductive wrong regulation

- The downside of under-regulating is vastly overemphasized; most arguments boil down to "we have to act FAST now or X BAD thing might happen"

- The risk of over-regulating or wrongly regulating is vastly under-emphasized, with the same FUD-style reasoning.

- Contrary to one of the many straw-man arguments in the pod (that I'm a libertarian against any and all regulation because I criticize possible regulatory capture), I would enthusiastically support regulation that foundation models have to be:

-- given freely to public researchers/academics for in-depth independent safety-research

-- open-weighted after a while (e.g. after ~ a year, safety concerns should be mostly ruled out and new generations are out, so the ROI is likely already there [e.g. there's NO safety reason at all for ClosedAI to not release gpt-3.5; llama3 is better already])

Proposed FLOP cut-off of SB 1047:

- According to the pod, the cut-off is much more advanced than anything currently released.

- The 10^26 FLOP cutoff is way too low; llama-405b is already ~4×10^25 FLOPs (see the back-of-the-envelope estimate after this list)

- 405B is maybe 20% smarter than 70B while taking roughly 6× more FLOPs to train (same token count, ~6× the parameters), so a model at the cutoff is very likely not much smarter than the current SOTA.

- IMO none of the current SOTA models are very dangerous, but kill switch regulation is.
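For reference on the numbers above, a common approximation is training compute ≈ 6 × parameters × tokens. A minimal sketch, assuming the publicly reported ~15.6T training tokens for Llama 3.1 (treat that figure as an assumption):

    # Chinchilla-style approximation: training FLOPs ~= 6 * N_params * N_tokens.
    def train_flops(n_params: float, n_tokens: float) -> float:
        return 6 * n_params * n_tokens

    TOKENS = 15.6e12  # reported Llama 3.1 training token count (assumption)
    for name, params in [("70B", 70e9), ("405B", 405e9)]:
        print(f"Llama 3.1 {name}: {train_flops(params, TOKENS):.1e} FLOPs")
    # 70B  -> ~6.6e24 FLOPs
    # 405B -> ~3.8e25 FLOPs, i.e. ~4x10^25, below the 10^26 cutoff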

Kill-Switches:

- SB 1047 is (non-explicitly) calling for kill-switches on models over the cut-off, via the liability it places on model creators and via market dynamics

- Any kill-switch regulation means a complete dead end for any advanced open-weights AI. This means that huge corporations and governments will control any and all advanced AI development. This is top-down control of the maths you are allowed to run on your own computer; IMO that is Orwellian as fuck.

China:

- Mentioning China is FUD 101; it's basically AI's "think of the children"

- If they think they can stop China from building its own advanced LLMs, they're delusional. This regulation might even help them get there faster. They don't even need to steal; they're maybe a year or two behind the SOTA and catching up fast.

I just don't get how so many people on a site with "hacker" in the name want to make it impossible to hack on these things for anyone not employed by the big corporate AI research labs.

thor-rodrigues · a year ago
I think that focusing primarily on the discussion of what is or isn't open source software makes us miss an interesting point here: Llama enables users to get performance similar to frontier models on their own systems, without having to send data to third parties.

My company is building an application for a university client involving the examination of research data written in "human language" (mostly notes and docs).

Due to the high confidentiality of the subjects (they often deal with not-yet-patented information), we couldn't risk using frontier models, as doing so could destroy the novelty of an invention and therefore its patentability.

Now with Llama 3.1, we can simply run these models locally, on systems that are not even connected to the internet. LLMs are quite good at examining massive amounts of research papers and information, at least for the application we are aiming at, saving thousands of hours of tiresome (and very boring) human labour.

I am not trying to endorse Meta or Zuckerberg or anything like that, but at least in this respect, I think Llama being "open-source" is a very good thing.

jstummbillig · a year ago
To me it's fairly interesting how relatively little money it takes Meta to pose a risk to other model makers' businesses. Those makers are dependent on running their models after creating them (because that is how they make money), while Meta doesn't even have to bear the cost of providing inference infrastructure at all to pose that risk.
phyrex · a year ago
That's a funny definition of "little money"
koolala · a year ago
Can you imagine how incredible an open source model would be for research / humanity, beyond the business needs right in front of us?

Open-Knowledge source with an Open-Intelligence that can guide you through the entire massive digital library of its own brain. Semantic data light-years beyond a Search Engine.

valine · a year ago
If you have the model weights you have roughly the same opportunities as the company that trained the model. The code you need to run inference on the Llama weights is very much open source. The only thing you're missing out on is the training code, which is prohibitively expensive to run for most anyways. Open source training isn't going to give you any unique insights into the "digital brain library" of your model.

Also just to be clear, if you want to set up a RAG with an open weight model and a large dataset there's nothing stopping you. Download Red Pajama and Llama and give it a try.

https://github.com/togethercomputer/RedPajama-Data
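If you just want to poke at the weights locally before building a RAG, here's a minimal inference sketch using the Hugging Face transformers library (the model ID and prompt are illustrative assumptions; the repo is gated behind accepting Meta's license, and device_map needs the accelerate package installed):

    # Minimal local inference on open Llama weights (illustrative sketch).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # gated: license required
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Summarize the trade-offs of open-weight models in two sentences."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))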

JumpCrisscross · a year ago
> Open-Knowledge source with an Open-Intelligence that can guide you through the entire massive digital library of its own brain. Semantic data light-years beyond a Search Engine

This sounds like the usual AI marketing with the word "open" thrown in. It's not articulating something you can only do with an open source LLM (and it doesn't define what that means).

I'm personally not thrilled with how locked down LLMs are. But we'll need to do a better job of articulating (a) a definition and (b) the benefits of adhering to it versus the "you can run it on your own metal" definition Facebook is promulgating. Because a model meeting Facebook's definition has obvious benefits over proprietary models run on someone else's servers.

HarHarVeryFunny · a year ago
You're not really asking for an open source model though, you're asking for open source training data set(s), which isn't something that Meta can give you. There are open source web scrapes such as The Pile, but much of the more specialized data needs to be licensed.
tourmalinetaco · a year ago
All of that data is already available; just look into “shadow libraries”. Now, I do wish Meta and other companies would publish their data sets so we, as humanity, could improve upon them and empower even better LLMs, but the unfortunate reality is that copyright is holding us back. Most of what you say is essentially gibberish, but there is truth in the idea that LLMs would be better if they could not only utilize their weights, but also reference and search their training data (which is collectively owned by humanity, by the way) and answer with that, and not just with what they “think”.
talldayo · a year ago
No, I really can't imagine it. Extrapolating from our free commercially-licensed offerings it would seem most people would ignore it or share stories on Reddit about how FreeGPT poisoned their family when generating a potato salad recipe.
honorious · a year ago
Can you expand on the risk of breaking novelty?

Is the concern that prompts could be re-used for training by the provider and such knowledge become part of the model?

mkesper · a year ago
Llama isn't open source at all. Stop using that phrase for a product that even comes with an EULA.
JumpCrisscross · a year ago
> Llama isn't open source at all. Stop using that phrase for a product that even comes with an EULA

We don't have a commonly-accepted definition of what open source means for LLMs. Just negating Facebook's doesn't advance the discussion.

The open-source community is fractured between those who want it to mean weights available (Facebook); weights and transformer available; weights with no use restrictions (I think this is you); and weights, transformer and training data with no restrictions (obviously not workable, not even the OSI's proposed definition goes that far [1]).

In a world where the only LLMs are proprietary or Llama, and the open-source community either remains fractured or chooses an unworkable ask, Facebook will define how the term is used.

[1] https://opensource.org/deepdive/drafts/open-source-ai-defini...

j_maffe · a year ago
The Llama 3.1 transformer is available. But it does have some minor use restrictions, yes.
lrrna · a year ago
We have that definition. The user needs the complete capability to reproduce what is distributed, which means the training data and the source used to train the model.

If you distribute the output of bison (say, foo.c) and not the foo.y sources, you would get pushback.

Then there is the EULA, which makes it closed source right from the start.

riedel · a year ago
This is a hard stance if you talk just about the code (not the model weights). The Llama community licence is a bit weird and probably not an OSI-compliant licence, but close. Regarding the weights this is different, but to me it is still difficult to understand how to apply copyright law here. Having said that, one might understand why certain stupid-looking clauses went into the code licence. As long as we do not understand copyright of model weights and do not have court rulings on the use of training data under different copyright regimes (US and EU), I would not care too much. We are still in the Wild West.
koolala · a year ago
Model weights are not the source. Why isn't that obvious, the way a binary obviously isn't source code? A binary is compiled from source. You can open-license the data in a binary so it can be reverse-engineered / modded, but that doesn't make it open source.
tourmalinetaco · a year ago
“Close” is not good enough for a term with a very specific meaning. OSI = open source; everything else is source-available (and it's arguable whether either even applies, because the source of the weights, the dataset, is not available).

I agree that for Llama things are weird and they want to cover their bases, and that it's better than nothing, but this use of “open source” is part of a long-running corporate dilution of what open source really means, and I am tired of it.

owlbite · a year ago
I feel the arbitrary split between code and weights makes little sense when discussing whether these models are "open source" in the copyleft meaning of the term. If the average user can't meaningfully use your product without agreeing to non-free terms, then it's morally closed source.

Anything else and you're just open-source-washing your proprietary technology.

segmondy · a year ago
Having access to the weights doesn't make it open. Else you could argue that Microsoft Word is open source because you have access to the binary.


JackYoustra · a year ago
Fwiw Meta, under oath in congressional session, called Llama not open-source.
KolmogorovComp · a year ago
Source and context? I'd be interested to know more about this.
insane_dreamer · a year ago
The OSI doesn't have a monopoly on the definition of the words "open source". There is "open source as per the OSI Open Source Definition" and there are other interpretations.
intellectronica · a year ago
Right, but if your "open source" package doesn't include .... the source, then you need some other definition.
Der_Einzige · a year ago
Why should anyone care about following a license?

Llama did not license its training data. It’s almost impossible to prove a particular LLM was used to generate any particular text, and there’s likely a bunch of illegal content within the dataset used to train the model (as is the case for most other LLMs)…

So why should I care about following a license? They have no mechanism to enforce it. They have no mechanism to detect when it’s being violated. They themselves indicated hostility toward other licenses, so why not ignore theirs?

OKRainbowKid · a year ago
Thousands of lawyers and billions to spend.
jrm4 · a year ago
Obligatory "Stallman Was Right."

Once again: for those who are new here. There is Free Software, which has a usefully strict definition.

And there is Open Source, the business-friendly -- but consequently looser -- other thing.

You can like one or both, they both have advantages and drawbacks.

But you cannot insist that "Open Source" has a very strict definition. It just doesn't. That's why the whole Free Software thing is needed, and IMHO, more important.

HarHarVeryFunny · a year ago
Sure, but you can't give something away for free that you don't own. What people complaining about Llama not being open source are talking about is the training data, and that isn't something that Meta owns for the most part.
ensignavenger · a year ago
Open Source has every bit as strict a definition as Free Software. Open Source as a term was coined and popularized by the OSI. The term may have occasionally been used in different contexts prior to the OSI, but it was never commonly applied to software before that.

One could argue that the OSI should have gotten a trademark on the term. But the FSF doesn't have a trademark on the term "free Software" either, so the terms have approximately equal legal protections.

Meta using the term "open source" to apply to their model data when their license isn't an open source license is dishonest at best.

j_maffe · a year ago
I agree, except for your opinion at the end. But I know you're not alone in that opinion, and it has been discussed to death. At this point it's more political than anything else in CS.
ekianjo · a year ago
open source is so ambiguous it's a useless expression at this stage. At least FOSS is less problematic.
benterix · a year ago
A few decades ago an organization was founded specifically to address statements such as this one. That's why some early Microsoft attempts at competing with open source had to be called "shared source", not "open source".
nikolayasdf123 · a year ago
LLaVA is pretty great
josefresco · a year ago
It is! Just downloaded it the other day, and while far from perfect it's pretty neat. I uploaded a Gene Wilder / Willy Wonka and the Chocolate Factory meme and it incorrectly told me that it was Johnny Depp. Close, I guess! I run LLaVA and Llama (among other models) using https://ollama.com

As a "web builder" I do think these tools will be very useful for accessibility (eventually), specifically generating descriptive alt tags for images.

simonw · a year ago
Do you know of any good paid API providers offering LLaVA? I want to experiment with it a bunch more without having to host it locally myself.
xyc · a year ago
Cloudflare has it https://developers.cloudflare.com/workers-ai/models/llava-1....

Locally it's actually quite easy to set up. I've made an app https://recurse.chat/ which supports LLaVA 1.6. It takes a zero-config approach so you can just start chatting and the app downloads the model for you.

nikolayasdf123 · a year ago
Nope, I am self-hosting. Support is pretty good actually: llama.cpp supports it (v1.6 too, and in its OpenAI-compatible API server as well), ollama supports it, open-web-ui chat too.

Using it now on desktop (I am in China, so no OpenAI here) and in a cloud cluster on a project.

nuz · a year ago
What are you using it for? Curious if there's any interesting purposes I haven't thought of
codingwagie · a year ago
Probably will get flagged, but I get so annoyed by the cynical takes on Meta and their open source strategy. Meta is the only company releasing true open source (React, PyTorch, GraphQL) and now Llama. This company has done more for software development than any other in the last decade. And now they are burning down the competition in AI, making it accessible to all. Meta's engineering compensation strategy pushed the high end of developer compensation up by almost 2x. Enough with the weird cynicism about their licensing policy.
SushiHippie · a year ago
The Llama models use a non-open-source license [0].

Yes it is still better than not being able to access the weights at all, but calling these weights open source is not correct.

[0] https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/mai...

codingwagie · a year ago
Dude they spent billions on the model and then just open sourced it
bschmidt1 · a year ago
React? Surely I'm not the only one who remembers https://news.ycombinator.com/item?id=15050841

I don't think Facebook/Meta is the beacon of open-source goodness you think it is. The main reason they created Yarn instead of iterating on npm was to use the same patents-clause license they wanted for React (before the community flipped out and demanded they re-license it as MIT). Early Vue adoption seemed mostly driven by that React licensing fiasco.


koolala · a year ago
True? The cynicism is about arguments over "true". If you trick yourself into believing this is what open source looks like (no source data), then you lose out on imagining what a real open source AI with open source data would be like.
JumpCrisscross · a year ago
> you lose out on imagining what a real open source AI with open source data would be like

Zero privacy?

YetAnotherNick · a year ago
"Open source" and "open source model" are not terms that came from a dictionary; they're based on what the community thinks they mean. As long as "open source model" doesn't cause confusion, which it does not, since today it just means "open weights model", fighting over it is not worth it.
wilsonnb3 · a year ago
> Meta is the only company releasing true open source

What? There are so many open source projects from huge companies these days.

VSCode, .NET, TypeScript from MS

Angular, Flutter, Kubernetes, Go, Android, Chromium from Google

gwern · a year ago
There is nothing weirdly cynical about it. This is a fact of life in Silicon Valley - that a lot of FLOSS is released for strategic reasons (such as building up a community before enclosing it to extract a profit), and not because the Grinch's heart grew 2 sizes one day. "Commoditize your complement": https://gwern.net/complement

You can benefit a lot from it, and I have... but do be sure you know what you are ferrying on your back before you decide to offer it a ride across the river.

talldayo · a year ago
> a lot of FLOSS is released for strategic reasons (such as building up a community before enclosing it

Not only is "a lot" of FOSS not released like this, both free software and Meta's models cannot be monetized post-release. If Meta decides to charge money for Llama4, then everyone with access to the prior models can keep their access and even finetune/redistribute their model. There is no strategic flip Meta can attempt here without shotgunning their own foot off.

xgb84j · a year ago
How do you think Meta profits off React and PyTorch? Just marketing to get good candidates?
cmur · a year ago
If something requires an EULA it isn't open at all; it is just publicly available. By your logic, public services are "open source." There are myriad corporations that release actual open source software that is truly free to use. If you experience massive success with anything regarding Meta's LLMs, they're going to take a cut according to their EULA.
eduction · a year ago
You’re certainly entitled to the opinion that an agreement (as in EULA) is distinct from a license (as in GPL, MIT etc).

But many legal minds close to this issue have moved to the position that there is no meaningful distinction, at least when it comes to licenses like GPL.

For example: https://writing.kemitchell.com/2023/10/13/Wrong-About-GPLs

bunderbunder · a year ago
I'm trying to figure out the logic that makes "free for commercial use with less than 700 million monthly active users" less open than "free for non-commercial use", which is the traditional norm for non-copyleft open source machine learning products. But I just can't get there. Could somebody spell it out for me?


blackeyeblitzar · a year ago
Here we go again with the co-opting of "open source" and the open-washing marketing. Llama isn't open source. Sharing weights is like sharing a compiled program. Without visibility into the training data, curation/moderation decisions, the training code, etc., Llama could be doing anything and we wouldn't know.

Also, open source means the license used should be something standard, not proprietary, and without restrictions on how you can use it.

talldayo · a year ago
> Sharing weights is like sharing a compiled program.

Not at all. They're only similar in the sense that both are a build artifact.

> Without visibility into the training data, curation/moderation decisions, the training code, etc., Llama could be doing anything and we wouldn't know.

"could be doing anything" is quite the tortured phrase, there. For one, model training is not deterministic and having the full training data would not yield a byte-perfect Llama retrain. For two, the released models are not turing-complete or filled with viruses; you can open the weights yourself and confirm they're static and harmless. For three, training code exists for all 3 Llama models and the reason nobody uses them is because it's prohibitively expensive to reproduce and has zero positive potential compared to finetuning what we have already.

> Also open source means the license used should be something standard not proprietary, without restrictions on how you can use it.

There are very much restrictions on redistribution for nearly every single Open Source license. Permissive licensing may not mean what you think it means.

dakial1 · a year ago
...says the owner of it.

Now seriously: with Llama being only "sort of" open source, it doesn't seem to be something someone can fork and develop/evolve without Meta, right? If one day Meta comes and says "we are closing Llama and evolving it in a proprietary mode from now on", would this Llama indie scene continue to exist?

If that is the case, wouldn't this be a dumping strategy by Meta, meant to cut revenue streams from other platforms (Gemini/OpenAI/Anthropic) and contain their growth?

RicoElectrico · a year ago
The models can be fine-tuned which is good enough.
islewis · a year ago
"good enough" is incredibly subjective here. Maybe good enough for you, but there are many things that are not possible with either the dataset or the weights being available.
koolala · a year ago
Good enough? Please, please type out what a truly open source AI model with open weights and open data would be like. I picture it like a Tower of Babel! Very far from "good enough"!
mupuff1234 · a year ago
Not good enough to be considered open source.