When I install an LTS version with a Universe package like ffmpeg, does everything continue getting security patches for the full five-year LTS life?
Or do I now need Ubuntu Pro to get the full five years?
People keep sharing these kinds of conversations. The training cutoff date isn't an absolute date after which no new data was allowed into training.
Instead there are bits and pieces of newer information captured in the updated models, but it's not a meaningful enough amount to ever rely on.
It's not going to reliably understand your new libraries, and more importantly, if you convince it that it knows what happened in April 2023, it can start hallucinating so deeply that the conversation becomes useless until you edit it and remove the part where you convinced it of that.
It's not a question of whether they are "allowed" to train on new data; the question is whether they have trained it on data containing information about current events. If you know they've implemented a Continuous Integration (CI) system for this, you should link to a source. However, I don't think this is true, as there would be no reason for a cutoff date otherwise.
> Instead there are bits and pieces of newer information captured in the updated models, but it's not a meaningful enough amount to ever rely on.
This seems more like an opinion of the technology's limitations in general, rather than an assessment of the likelihood that new information will be incorporated into its weights and biases.
Is it going to remain academic? I can easily imagine the spammy content farm / listicle business model evolving to be fully automated, creating an input loop.
It's also worth noting that when OpenAI created Whisper, they had to heuristically remove many transcripts from poor ASR systems, and they definitely didn't catch them all.
(1) Real content is not generated via a synthetic loop: Humans use generative AI in complex ways, intermixing human-generated and AI-generated content. Imagine a person who writes the first draft of an essay, then uses ChatGPT to rewrite parts of it. There are certainly many human additions, modifications, and stylistic flourishes.
(2) The most dramatic effects of model collapse were seen when training multiple generations of AI agents on content generated by the previous agent. This is a very academic scenario.
(3) There is already a lot of junk consumed by these models. RLHF is aimed at eliminating these junk responses. I am not aware of any research that explores how the full training cycle is affected when RLHF is employed.
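The multi-generation setup in (2) is easy to simulate with a toy sketch. The Gaussian-refit loop below is my own illustration, not the setup from any specific paper: each "generation" fits a distribution to samples drawn from the previous generation's fitted distribution, and with a finite sample size the fitted variance tends to wander toward zero over many generations.

```python
# Toy sketch of model-collapse dynamics (illustrative only): each
# generation fits a Gaussian to samples drawn from the previous
# generation's fitted Gaussian. With finite samples, the estimated
# parameters drift, and the fitted spread tends to shrink over time.
import random
import statistics

def run_generations(n_samples=50, n_generations=200, seed=0):
    """Return the fitted standard deviation at each generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0          # the "real data" distribution
    history = [sigma]
    for _ in range(n_generations):
        # Draw training data from the previous generation's model...
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # ...then refit the model to that purely synthetic data.
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history

history = run_generations()
print(f"generation 0 std:   {history[0]:.3f}")
print(f"generation 200 std: {history[-1]:.3f}")
```

The point of the sketch is only that the purely synthetic loop is the degenerate case; the mixed human/AI pipeline described in (1) breaks the loop, which is why the dramatic collapse results are, as noted, a very academic scenario.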
Also, there is a lot of training material out there that was not used by the original GPT-3 model. The primary limitation is hardware.
I've been tempted to go back to Arch, and I think this could be a good motivator.