Readit News
TheEzEzz commented on GLM-5: Targeting complex systems engineering and long-horizon agentic tasks   z.ai/blog/glm-5... · Posted by u/CuriouslyC
Havoc · a month ago
> which begs the question: what else is censored or outright changed intentionally?

So it's like every other frontier model that has post-training to add safeguards in accordance with local norms.

Claude won't help you hotwire a car. Gemini won't write you erotic novels. GPT won't talk about suicide or piracy. etc etc

>This is a classic test

It's a gotcha question with basically zero real-world relevance.

I'd prefer models to be uncensored too, since censorship does harm overall performance, but this is such a non-issue in practice.

TheEzEzz · a month ago
The problem with censorship isn't that it degrades performance. The problem is that if the censorship is unilaterally dictated by a government, it becomes a tool for suppression, especially as people increasingly rely on AI as their primary source of information.

A company might choose to avoid erotica because it clashes with their brand, or avoid certain topics because they're worried about causing harm. That is very different from centralized, unilateral control over all information sources.

TheEzEzz commented on 12 Months of Mandarin   isaak.net/mandarin/... · Posted by u/misiti3780
jcla1 · a year ago
Do not underestimate the urge to procrastinate (by still doing productive things, like learning Mandarin) while pursuing a PhD.

I am not sure if this will be the author's experience too, but pursuing a PhD will often leave you exhausted, without any hope of ever finding "the final missing ingredient" to solve the problem you are currently tackling. So turning to entirely unrelated problems, however productive they may seem to outsiders, suddenly becomes an attractive way to procrastinate.

TheEzEzz · a year ago
I wrote my own dynamic keyboard layout to optimize typing speed while procrastinating on my dissertation.

15 years later I'm still using it. My dissertation not so much.

Procrastination is (sometimes) awesome.

TheEzEzz commented on Andrew Gelman: Is marriage associated with happiness for men or for women?   statmodeling.stat.columbi... · Posted by u/paulpauper
Qem · 2 years ago
I'd like to see a study about this, but age-adjusted. I think marriage pays off mainly in the long term. It's easier to report you're better off single when you're still young, healthy, and socializing a lot outside your immediate family. As one ages, the balance probably changes. In the extreme case you risk becoming one of these: https://www.bbc.com/news/articles/cwyx6wwp5d5o

I wonder if any differences found by sex are due to age, as men seem to marry at older ages than women do, on average.

TheEzEzz · 2 years ago
I could easily see this going the other way. Lifelong single people develop strong social networks that they keep investing in into old age. Married couples (especially those with children!) invest less time in their social networks; in old age they then have far fewer friends once their children leave and their spouse passes.

(I'm not sure whether this is true, but it seems plausible. I agree with the author that we should get better data to resolve these questions.)

TheEzEzz commented on Structured Outputs in the API   openai.com/index/introduc... · Posted by u/davidbarker
dilap · 2 years ago
Isn't your example showing an issue with the opposite approach, where someone got bad output from an earlier OpenAI JSON mode that worked via training rather than via mechanically restricting output to conform to a schema?

FWIW (not too much!), I have used llama.cpp grammars to restrict output to specific formats (not JSON in particular, but an expected format) with fine-tuned phi2 models, and I didn't hit any issues like this.

I am not intuitively seeing why restricting sampling to tokens matching a schema would cause the LLM to converge on valid tokens that make no sense...

Are there examples of this happening w/ people using e.g. jsonformer?

TheEzEzz · 2 years ago
You're basically taking the model "off policy" when you bias the decoder, which can definitely make weird things happen.
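
A minimal sketch of what that bias looks like (toy vocabulary and made-up logits, not any particular library's API): masking the logits down to schema-valid tokens forces the sampler onto tokens the model itself considers unlikely.

    import numpy as np

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    # Hypothetical next-token logits over a toy vocabulary (numbers invented).
    vocab  = ['"', '}', ',', 'hello', ' world']
    logits = np.array([0.2, 0.1, 0.3, 3.0, 2.5])  # the model wants 'hello'

    # Suppose the JSON grammar only permits structural tokens here (indices 0-2):
    valid = np.array([True, True, True, False, False])
    p_model  = softmax(logits)
    p_forced = softmax(np.where(valid, logits, -np.inf))

    for tok, pm, pf in zip(vocab, p_model, p_forced):
        print(f"{tok!r:10} model={pm:.3f} forced={pf:.3f}")
    # The decoder now samples tokens the model gave only a few percent
    # probability, and every later step conditions on that unlikely prefix --
    # the model is running "off policy" relative to its training distribution.
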
TheEzEzz commented on σ-GPTs: A new approach to autoregressive models   arxiv.org/abs/2404.09562... · Posted by u/mehulashah
optimalsolver · 2 years ago
Yann LeCun would say [0] that it's autoregression itself that's the problem, and ML of this type will never bring us anywhere near AGI.

At the very least you can't solve the hallucination problem while still in the autoregression paradigm.

[0] https://twitter.com/ylecun/status/1640122342570336267

TheEzEzz · 2 years ago
LeCun is simply wrong in his argument here. His proof requires that all decoded tokens are conditionally independent, or at least that the chance of a wrong next token is independent across positions. This is not the case.

Intuitively, some tokens are harder than others. There may be "crux" tokens in an output, after which the remaining tokens are substantially easier. It's also possible to recover from an incorrect token auto-regressively, by outputting tokens like "actually no..."
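
LeCun's argument, roughly, is that if each token errs independently with probability e, the chance of an entirely correct n-token answer is (1 - e)^n, which vanishes as n grows. A toy simulation (error and recovery rates invented for illustration) shows how even modest self-correction breaks that exponential decay:

    import random

    def p_correct(n_tokens, p_err, p_recover, trials=100_000):
        # Each step: a correct-so-far sequence errs with prob p_err;
        # an already-wrong sequence recovers ("actually no...") with
        # prob p_recover. Returns the fraction ending correct.
        ok = 0
        for _ in range(trials):
            wrong = False
            for _ in range(n_tokens):
                if not wrong:
                    wrong = random.random() < p_err
                elif random.random() < p_recover:
                    wrong = False
            ok += not wrong
        return ok / trials

    print(p_correct(200, 0.01, 0.0))  # ~0.13, i.e. 0.99**200: LeCun's decay
    print(p_correct(200, 0.01, 0.5))  # ~0.98: recovery yields a steady state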

TheEzEzz commented on Defending against hypothetical moon life during Apollo 11   eukaryotewritesblog.com/2... · Posted by u/Metacelsus
creer · 2 years ago
You mention media publicizing warning shots. Does that really work at all?

Most of the reporting I see is half-dismissive: [facial recognition is a risk, but what are you gonna do? It can't be bad to fight crime.] This goes for everything, and it rarely results in effective control.

Internal practice in biology or chemistry labs kinda does -- but it takes a long time, and then accidents still happen.

NTSB accident investigations: is there another field where each accident is taken as seriously as it is there? And stepwise improvement does not sound like a good solution for self-reproducing agents.

TheEzEzz · 2 years ago
For an example with facial recognition, see this outcome, where Rite Aid was banned from using it after a "warning shot": https://techcrunch.com/2023/12/20/rite-aid-facial-recognitio...
TheEzEzz commented on Show HN: Auto Wiki – Turn your codebase into a Wiki   wiki.mutable.ai... · Posted by u/oshams
TheEzEzz · 2 years ago
Super cool. When I think about accelerating teams while maintaining quality/culture, I think about the adage "if you want someone to do something, make it easy."

Maintaining great READMEs, documentation, onboarding docs, etc, is a lot of work. If Auto Wiki can make this substantially easier, then I think it could flip the calculus and make it much more common for teams to invest in these artifacts. Especially for the millions of internal, unloved repos that actually hold an org together.

TheEzEzz commented on Defending against hypothetical moon life during Apollo 11   eukaryotewritesblog.com/2... · Posted by u/Metacelsus
gwern · 2 years ago
And, what OP downplays, not taking it seriously, having many serious fatal flaws, and then covering all those flaws up while assuring the public everything was going great: https://www.nytimes.com/2023/06/09/science/nasa-moon-quarant... https://www.journals.uchicago.edu/doi/abs/10.1086/724888

Something to think about: even if there are AI 'warning shots', why do you think anyone will be allowed to hear them?

TheEzEzz · 2 years ago
Good question. Perhaps it depends on the type of warning shot. Plenty of media has an anti-tech bent and will publicize warning shots if they see them -- and they already do this with near-term risks, such as facial recognition.

If the warning shot comes from an internal red team, then there's a higher likelihood it isn't reported. To address that, I think we need to keep improving the culture around safety, so that we increase the odds that someone on or close to that red team blows the whistle if we're stepping toward undisclosed disaster.

I think the bigger risk isn't that we don't hear the warning shots, though. It's that we don't get them, or we get them far too late. Or, perhaps more likely, we get them but are already set on some inexorable path due to competitive pressure. And a million other "or's".

TheEzEzz commented on Defending against hypothetical moon life during Apollo 11   eukaryotewritesblog.com/2... · Posted by u/Metacelsus
yreg · 2 years ago
I agree with you, but to be fair:

- The worst-case worry about AI is a much bigger problem than the worst-case worry about moon life. (IMHO)

- With the Moon, we had a good idea of how to mitigate the risks just to be extra safe. With AI, I believe we don't have any clue how to do containment/alignment, or whether it's even possible. What is currently being done on the alignment front (e.g. GPT refusing to write porn stories or scam emails) has absolutely nothing to do with what worries some people about superintelligence.

TheEzEzz · 2 years ago
I agree -- the risks are bigger, the rewards larger, the variance much higher, and the theories much less mature.

But what's striking to me as the biggest difference is the apparent lack of ideological battles in this Moon story. There were differences of opinion on how much precaution to take, how much money to spend, how to make trade-offs that might affect the safety of the astronauts, etc. But there's no mention of a vocal ideological group standing outright opposed to those worried about risks -- or of a group opposed to the lunar missions entirely. They didn't politicize the issue and demonize their opponents.

Maybe what we're seeing with the AI risk discussion is just the outcome of social media. The most extreme voices are also the loudest. But we desperately need to recapture a culture of earnest discussion, collaboration, and sanity. We need every builder and every regulator thinking holistically about the risks and the rewards. And we need to think from first principles. This new journey and its outcomes will almost surely be different in unexpected ways.

TheEzEzz commented on Defending against hypothetical moon life during Apollo 11   eukaryotewritesblog.com/2... · Posted by u/Metacelsus
TheEzEzz · 2 years ago
A good analogy for AI risk. We'd never visited the Moon before, or any other celestial object. The risk analysis was not "we've never seen life from a foreign celestial object cause problems on Earth, therefore we aren't worried." The risk analysis was also not "let's never go to the Moon to be _extra_ safe, it's just not worth it."

The analysis was instead "with various methods we can be reasonably confident the Moon is sterile, but the risk of getting this wrong is very high, so we're going to be extra careful just in case." Pressing forward while investing in multiple layers of addressing risk.

u/TheEzEzz

Karma: 1075 · Cake day: June 8, 2010
About
We're using machine vision to build zero-friction checkout for stores. Walk in, grab stuff, and leave. Our system figures out what you grab and charges you automatically. Check out what our real-time inference engine looks like: https://www.youtube.com/watch?v=yeS8TJwBAFs

Shoot me a message at jordan (at) standard (dot) ai
