Readit News
msp26 commented on     · Posted by u/iamflimflam1
msp26 · 11 days ago
what the fuck is this slop? Don't name your shit grifts after (the codenames of) actual highly anticipated models.
msp26 commented on Anna's Archive: An Update from the Team   annas-archive.org/blog/an... · Posted by u/jerheinze
lolive · 15 days ago
I choose the books I buy from Anna's Archive. I choose the comics I buy from readComicsOnline. I choose the [European] graphic novels I buy from #WONTTELL.

And I am one of the best customers of these 3 physical shops in my town.

So sure, I don't buy the latest trends based on ads. I investigate a lot to buy GREAT stuff. Sometimes the shopkeeper has headaches trying to find the obscure stuff I discovered online that NOBODY knows exists.

Am I an exception?

I don't know, but those services are great for maintaining freedom of choice.

msp26 · 15 days ago
The French comic pirate scene has an interesting rule where they keep a ~6 month time lag on what they release. The scene is small enough that the rule generally works.

It's a really good trade-off. I would never have gotten into these comics without piracy but now if something catches my eye, I don't mind buying on release (and stripping the DRM for personal use).

Most of my downloading is closer to collecting/hoarding/cataloguing behaviour, but if I fully read something I enjoy, I'll support the author in some way.

msp26 commented on GPT-5 for Developers   openai.com/index/introduc... · Posted by u/6thbit
mehmetoguzderin · a month ago
Context-free grammar and regex support are exciting. I wonder what differences there are, if any, from the Lark-like CFG of llguidance, which powers the JSON schema mode of the OpenAI API [^1].

[^1]: https://github.com/guidance-ai/llguidance/blob/f4592cc0c783a...

msp26 · a month ago
Yeah that was the only exciting part of the announcement for me haha. Can't wait to play around with it.

I'm already running into a bunch of issues with the structured output APIs from other companies like Google. OpenAI have been doing a great job on this front.
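
To make it concrete, the kind of constraint I'd want to hand the model looks roughly like the sketch below. This is only a sketch: the grammar is plain Lark syntax of the flavour llguidance accepts, the rule and field names are made up, and the JSON schema is just the closest equivalent shape, not code from either announcement.

    # Sketch: a Lark-style grammar (the flavour llguidance accepts) next to the
    # closest JSON-schema constraint. All names here are invented for illustration.
    LARK_GRAMMAR = r"""
    start: "Answer: " ANSWER "\nConfidence: " CONFIDENCE
    ANSWER: /[A-Za-z][A-Za-z ]{0,40}/
    CONFIDENCE: /(0\.\d+|1(\.0+)?)/
    """

    # Roughly the same shape expressed as a JSON schema for structured outputs.
    JSON_SCHEMA = {
        "type": "object",
        "properties": {
            "answer": {"type": "string", "maxLength": 40},
            "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        },
        "required": ["answer", "confidence"],
        "additionalProperties": False,
    }

The appeal of a CFG over a schema is that it can constrain surface form (prefixes, ordering, regex-shaped free text) rather than just JSON structure.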

msp26 commented on Sweatshop Data Is Over   mechanize.work/blog/sweat... · Posted by u/whoami_nr
jrimbault · a month ago
> This meant that while Google was playing games, OpenAI was able to seize the opportunity of a lifetime. What you train on matters.

Very weird reasoning. Without AlphaGo and AlphaZero, there's probably no GPT? Each was a stepping stone, wasn't it?

msp26 · a month ago
OpenAI's work on Dota was also very important for funding.

msp26 commented on Chain of thought monitorability: A new and fragile opportunity for AI safety   arxiv.org/abs/2507.11473... · Posted by u/mfiguiere
msp26 · 2 months ago
Are us plebs allowed to monitor the CoT tokens we pay for, or will those continue to be hidden on most providers?

msp26 commented on Agentic Misalignment: How LLMs could be insider threats   anthropic.com/research/ag... · Posted by u/helloplanets
msp26 · 2 months ago
Merge comments? https://news.ycombinator.com/item?id=44331150

I'm really getting bored of Anthropic's whole song and dance with 'alignment'. Krackers puts it in better words in the other thread.

msp26 commented on Meta's Llama 3.1 can recall 42 percent of the first Harry Potter book   understandingai.org/p/met... · Posted by u/aspenmayer
TeMPOraL · 3 months ago
Well, so can a nontrivial number of people. It's Harry Potter we're talking about - it's up there with The Bible in popularity ranking.

I'm gonna bet that Llama 3.1 can recall a significant portion of Pride and Prejudice too.

With examples of this magnitude, it's normal and entirely expected that this can happen - as it does with people[0] - and the only thing this is really telling us is that the model doesn't understand its position in society well enough to know to shut up; that obliging the request is going to land it, or its owners, in trouble.

In some way, it's actually perverted.

EDIT: it's even worse than that. What the research seems to be measuring is that the models recognize sentence-sized pieces of the book as likely continuations of an earlier sentence-sized piece. Not whether it'll reproduce that text when used straightforwardly - just whether there's an indication it recognizes the token patterns as likely.

By that standard, I bet there are over a billion people right now who could do that to 42% of the first Harry Potter book. By that standard, I too memorized the Bible end-to-end, as have most people alive today, whether or not they're Christian; works this popular bleed through into common language usage patterns.

--

[0] - Even more so when you relax your criteria to accept the occasional misspelling or paraphrase - then each of us likely knows someone who could piece together a chunk of an HP book from memory.

msp26 · 3 months ago
Agree completely. When I read the Gemma 3 paper (https://arxiv.org/html/2503.19786v1) and saw an entire section dedicated to measuring and reducing the memorization rate, I was annoyed. How does this benefit end users at all?

I want the language model I'm using to have knowledge of cultural artifacts. Gemma 3 27B was useless at a question about grouping Berserk characters by potential Baldur's Gate 3 classes; Claude did fine. The methods used to reduce the memorization rate probably also degrade performance in other ways that don't show up on benchmarks.
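
For reference, the measurement these memorization sections run is roughly the sketch below. The model interface is a placeholder, not the actual eval code from the Gemma 3 paper or the Llama/Harry Potter study.

    # Sketch of the prefix/continuation check: prompt with a chunk of the book's
    # tokens and test whether the model's greedy continuation reproduces the next
    # chunk (near-)verbatim. `model.greedy_generate` is a placeholder, not a real API.
    def memorized_fraction(model, tokens, prefix_len=50, target_len=50):
        hits = total = 0
        step = prefix_len + target_len
        for start in range(0, len(tokens) - step, step):
            prefix = tokens[start : start + prefix_len]
            target = tokens[start + prefix_len : start + step]
            generated = model.greedy_generate(prefix, max_new_tokens=target_len)
            total += 1
            # Exact match here; published evals usually also score looser overlap.
            if list(generated[:target_len]) == list(target):
                hits += 1
        return hits / total if total else 0.0

Read that way, a headline figure like 42% is about how often the next chunk is the model's most likely continuation, which fits the 'recognizes the token patterns' framing in the parent comment.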

msp26 commented on Building an AI server on a budget   informationga.in/blog/bui... · Posted by u/mful
msp26 · 3 months ago
> 12GB vram

Waste of effort; why would you go through the trouble of building + blogging for this?

msp26 commented on Claude 4   anthropic.com/news/claude... · Posted by u/meetpateltech
msp26 · 3 months ago
> Finally, we've introduced thinking summaries for Claude 4 models that use a smaller model to condense lengthy thought processes. This summarization is only needed about 5% of the time—most thought processes are short enough to display in full. Users requiring raw chains of thought for advanced prompt engineering can contact sales about our new Developer Mode to retain full access.

Extremely cringe behaviour. Raw CoTs are super useful for debugging errors in data extraction pipelines.

After DeepSeek R1, I had hope that other companies would be more open about these things.

msp26 commented on Discord Unveiled: A Comprehensive Dataset of Public Communication (2015-2024)   arxiv.org/abs/2502.00627... · Posted by u/leotravis10
msp26 · 3 months ago
Fantastic. I wonder how much random technical info is buried in these servers. I hate what it's done to game modding.

u/msp26

Karma: 731 · Cake day: June 12, 2023