Note that what they released are delta weights relative to the original LLaMA model. To play around with it, you'll need to grab the original LLaMA 13B weights and apply the delta.
> We release Vicuna weights as delta weights to comply with the LLaMA model
> license. You can add our delta to the original LLaMA weights to obtain
> the Vicuna weights.
That's what they say, but I just spent 10 minutes searching the git repo, reading the relevant .py files and looking at their homepage, and the vicuna-7b-delta and vicuna-13b-delta-v0 files are nowhere to be found. Am I blind, or did they announce a release without actually releasing?
If you follow this command from their instructions, the delta will be automatically downloaded and applied to the base model.
https://github.com/lm-sys/FastChat#vicuna-13b:
`python3 -m fastchat.model.apply_delta --base /path/to/llama-13b --target /output/path/to/vicuna-13b --delta lmsys/vicuna-13b-delta-v0`
You can use this command to apply the delta weights. (https://github.com/lm-sys/FastChat#vicuna-13b)
The delta weights are hosted on huggingface and will be automatically downloaded.
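For anyone wondering what "applying the delta" actually does: it's just element-wise addition of the delta tensors onto the base LLaMA tensors. A minimal sketch of the idea, assuming standard Hugging Face-style checkpoints (this is an illustration, not the FastChat implementation; the paths are placeholders and only the delta repo name is taken from the command above):

```python
# Illustration only - not the FastChat implementation. Paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("/path/to/llama-13b", torch_dtype=torch.float16)
delta = AutoModelForCausalLM.from_pretrained("lmsys/vicuna-13b-delta-v0", torch_dtype=torch.float16)

# Vicuna = LLaMA + delta, applied tensor by tensor.
base_sd, delta_sd = base.state_dict(), delta.state_dict()
for name in base_sd:
    if name in delta_sd:
        base_sd[name].add_(delta_sd[name])  # in-place add onto the base weights

base.save_pretrained("/output/path/to/vicuna-13b")
AutoTokenizer.from_pretrained("lmsys/vicuna-13b-delta-v0").save_pretrained("/output/path/to/vicuna-13b")
```

Note that this naive approach holds two full fp16 copies of a 13B model in memory at once (~26GB each), which is roughly where the large RAM requirement mentioned further down comes from.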
Not a lawyer, but that still feels like dubious territory. I would still be on the hook for acquiring the original weights, which Facebook has been issuing DMCA takedown requests over (e.g. against the llama-dl project).
We're fighting back against the DMCA requests on the basis that NN weights aren't copyrightable. This thread has details: https://news.ycombinator.com/item?id=35393782
I don't think you have to worry about Facebook going after you. The worst that will happen is that they issue a DMCA, in which case your project gets knocked offline. I don't think they'll be going the RIAA route of suing individual hackers.
The DMCAs were also launched by a third-party law firm, not Meta themselves, so there's a bit of "left hand doesn't know what the right hand is doing" in all of this.
I'll keep everyone updated. For now, hack freely.
Edit: Also, judging by a comment from the team in the GitHub repository (https://github.com/lm-sys/FastChat/issues/86#issuecomment-14...), they seem to at least hint at being in contact with the LLaMA team.
The llama-dl project actually helped you download the weights, whereas this just assumes you already have them. That feels like a pretty massive difference to me.
One of my favorite queries to assess an LLM is: "Is it legal for a man to marry his widow's sister?"
Vicuna-13B: The legality of a man marrying his widow's sister (widow marriage) varies depending on the country and cultural context. In some societies, widow marriage was traditionally practiced or tolerated.
GPT-4: It is not possible for a man to marry his widow's sister, as the term "widow" refers to a deceased man's wife. Therefore, if a man has a widow, he is deceased and therefore cannot marry anyone.
GPT 3.5: No, it is not possible for a man to marry his widow's sister because by definition, a widow is a woman whose spouse has died. If a man is deceased, then he cannot marry anyone, including his widow's sister. This is a logical impossibility and would not be recognized as a valid marriage under any legal system.
Koala: The legality of a man marrying his widow's sister varies by jurisdiction. In some countries, such as the United States, it is generally legal for a man to marry his widow's sister. In other countries, such as the United Kingdom, it is illegal for a man to marry his widow's sister because it is considered incest. It is always important to check the laws and regulations in the jurisdiction in which the marriage will take place to ensure that it is legal.
https://chat.lmsys.org/?model=koala-13b
You'd probably need to come up with a new one now though, or confirm the knowledge cutoff for the next evaluation :p
Ouch. I got this wrong myself: for half an hour I was under the impression that GPT-4 had gotten it wrong, and only after rereading it when I got back from a walk did I figure out that this is one hell of a trick question. My brain automatically assumed that a man's widow is the man's dead wife, but the correct way to interpret it is to realize that it means the man is the one who is dead.
It's pretty awesome to realize that from now onward my computers are going to be able to help catch more and more of the holes that clearly exist in my cognition.
Funny how Vicuna refers to "widow marriage" as if it were a common term. Doesn't make Vicuna less impressive, though - it comes pretty close to ChatGPT in many regards. And I like that trick question.
It would still possibly be legal, on the basis that if it's not illegal then it's legal - in the British jurisprudence tradition at least (https://en.wikipedia.org/wiki/Everything_which_is_not_forbid...) - namely, it's not the law that impedes it (also, in some places there's posthumous marriage).
There are also people who are considered dead by the bureaucratic system but are physically alive, usually because of clerical errors that are sometimes surprisingly hard to resolve. In that situation, the man's wife would be considered a widow in many contexts, despite her husband being alive.
Hi! Funnily enough I couldn't find much on it either, so that's exactly what I've been working on for the past few months: just in case this kind of question got asked.
I've recently opened a GitHub repository which includes information on both AI model series[0] and the frontends you can use to run them[1]. I also wrote a Reddit post beforehand that's messier, but a lot more technical[2].
I try to keep them as up-to-date as possible, but I might've missed something or my info may not be completely accurate. It's mostly to help people get their feet wet.
[0] - https://github.com/Crataco/ai-guide/blob/main/guide/models.m...
[1] - https://github.com/Crataco/ai-guide/blob/main/guide/frontend...
[2] - https://old.reddit.com/user/Crataco/comments/zuowi9/opensour...
The 4-bit quantized version of LLaMA 13B runs on my laptop without a dedicated GPU, and I guess the same would apply to a quantized Vicuna 13B, but I haven't tried that yet (converted as in this link, but for 13B instead of 7B: https://github.com/ggerganov/llama.cpp#usage ).
The GPT4All LoRA also works - perhaps the most compelling results I've gotten yet on my local machine. I have to try quantized Vicuna to see how that one goes, but processing the files to get a 4-bit quantized version will take many hours, so I'm a bit hesitant.
PS: converting 13B LLaMA took my laptop's i7 around 20 hours and required a large swap file on top of its 16GB of RAM.
feel free to answer back if you're trying any of these things this week (later I might lose track)
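For a rough sense of why 4-bit quantization makes a 13B model fit on a laptop: each block of weights is stored as one scale plus a handful of 4-bit integers, roughly a quarter of the fp16 size. Here's a toy sketch of blockwise 4-bit quantization - simplified for illustration, not the exact ggml Q4 format, and it stores one int8 per value instead of packing two per byte:

```python
# Toy blockwise 4-bit quantization (simplified; NOT the exact ggml Q4_0 layout).
# Each block of 32 values keeps one float scale plus 32 small integers,
# which is why a 13B model can shrink from ~26GB (fp16) to under 8GB.
import numpy as np

BLOCK = 32

def quantize_q4(x: np.ndarray):
    x = x.reshape(-1, BLOCK).astype(np.float32)
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0  # map each block to [-7, 7]
    scale[scale == 0] = 1.0                              # avoid division by zero
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize_q4(q, scale):
    return q.astype(np.float32) * scale

weights = np.random.randn(4096 * 32).astype(np.float16)
q, s = quantize_q4(weights)
error = np.abs(dequantize_q4(q, s).ravel() - weights.astype(np.float32)).mean()
print(f"mean abs quantization error: {error:.4f}")
```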
Vicuna's GitHub says that applying the delta takes 60GB of CPU RAM? Is that what you meant by large swap file?
On that note, why is any RAM needed? Can't the files be loaded and diffed chunk by chunk?
Edit: The docs for running Koala (a similar model) locally say this (about converting LLaMA to Koala):
>To facilitate training very large language models that does not fit into the main memory of a single machine, EasyLM adopt a streaming format of model checkpoint. The streaming checkpointing format is implemented in checkpoint.py. During checkpointing, the StreamingCheckpointer simply flatten a nested state dictionary into a single level dictionary, and stream the key, value pairs to a file one by one using messagepack. Because it streams the tensors one by one, the checkpointer only needs to gather one tensor from the distributed accelerators to the main memory at a time, hence saving a lot of memory.
https://github.com/young-geng/EasyLM/blob/main/docs/checkpoi...
https://github.com/young-geng/EasyLM/blob/main/docs/koala.md
Presumably the same technique can be used with Vicuna.
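Along those lines, if the base and delta checkpoints were sharded the same way, the delta could in principle be applied one shard at a time so only a small slice of the model is ever resident in memory. A hypothetical sketch - the identical sharding and the shard paths are assumptions, not how FastChat actually does it:

```python
# Hypothetical chunk-by-chunk delta application, assuming base and delta are
# split into identically-keyed shards (e.g. pytorch_model-0000X-of-0000N.bin).
# Only one shard from each checkpoint is in memory at a time.
import torch

def apply_delta_sharded(base_shards, delta_shards, out_shards):
    for base_path, delta_path, out_path in zip(base_shards, delta_shards, out_shards):
        base = torch.load(base_path, map_location="cpu")
        delta = torch.load(delta_path, map_location="cpu")
        for name, tensor in base.items():
            if name in delta:
                tensor.add_(delta[name])  # in-place add keeps peak memory low
        torch.save(base, out_path)
        del base, delta  # free the shard before loading the next one
```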
That might not be surprising, considering these jailbreaks are written and tested specifically against ChatGPT and ChatGPT alone. This model probably has its own jailbreaks that would also be refused by ChatGPT.
Just when you think Nvidia will go down, something happens that changes it. These days, unless you were into gaming or a machine learning dev, integrated graphics were good enough. But now, for the first time in a long time, I'm interested in getting a GPU to run some of these chatbots locally.
As a very occasional gamer who uses an iMac for work, I had thought about getting a gaming PC for like 6 years.
Last fall it seemed that all the stars had aligned. The crypto winter and Ethereum switching to proof of stake meant that GPU prices fell to a reasonable level, I knew I'd have a bit of time to play some games during the holidays, and as soon as Stable Diffusion was first posted on Hacker News I knew that was my excuse and my sign.
So far I think I have spent more time tinkering with the 20 Python environments I have[0] for all the ML projects than playing RDR2.
[0] https://xkcd.com/1987/
Whenever I feel like gaming I just subscribe to the GeForce Now service. Around here it costs ~$10 a month, which is what I usually go for, or ~$3 for a single day. And as the servers are located at a local ISP, there's no network latency or dropped packets.
This model is also censored to the brim; it refuses to answer half of my questions, some of them perfectly legal. It's useless - we already have GPT-4 (and Vicuna is even more censored/guarded).
Alpaca-30B is much better, it will even tell you how to build a nuclear weapon (incorrectly, of course, it’s not that smart).
I am waiting for Coati13B weights, these should work great.
This looks really good for a run-it-on-your-own-hardware model, judging from the examples and sibling comments. I've been working on a pure AVX2 Rust implementation of LLaMA but was starting to lose interest and had been waiting for whatever the next hot downloadable model would be - now I want to add this thing to it.
(I know a vicuna is a llama-like animal.)
These could be useful:
https://nixified.ai
https://github.com/Crataco/ai-guide/blob/main/guide/models.m... -> https://old.reddit.com/user/Crataco/comments/zuowi9/opensour...
https://github.com/cocktailpeanut/dalai
I tried a few from https://www.jailbreakchat.com/ and it refused them all. Interesting.
I'll be busy the next few days. Heck yeah.