Readit News
kromem commented on Is chain-of-thought AI reasoning a mirage?   seangoedecke.com/real-rea... · Posted by u/ingve
LudwigNagasena · 15 days ago
> The first is that reasoning probably requires language use. Even if you don’t think AI models can “really” reason - more on that later - even simulated reasoning has to be reasoning in human language.

That is an unreasonable assumption. In the case of LLMs it seems wasteful to transform a point from latent space into a random token and lose information in the process. In fact, I think in the near future it will be the norm for MLLMs to "think" and "reason" without outputting a single "word".

> Whether AI reasoning is “real” reasoning or just a mirage can be an interesting question, but it is primarily a philosophical question. It depends on having a clear definition of what “real” reasoning is, exactly.

It is not a "philosophical" (by which the author probably means "practically inconsequential") question. If the whole reasoning business is just a rationalization of pre-computed answers, or simply a means to do extra computation (since every token provides only a fixed amount of computation to update the model's state), then it doesn't make much sense to focus on improving the quality of chain-of-thought output from a human POV.
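The information-loss point can be made concrete with a toy sketch (all sizes and names here are invented for illustration, not taken from any real model): projecting a hidden state onto vocabulary logits yields a full probability distribution, but emitting a token collapses that distribution to a single symbol, and everything else about the latent point is discarded.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

# A hypothetical hidden state and unembedding matrix (toy sizes).
hidden = rng.normal(size=64)          # a point in latent space
unembed = rng.normal(size=(64, 10))   # projects onto a 10-token vocabulary

probs = softmax(hidden @ unembed)

# What the distribution carries vs. what a single emitted token carries:
entropy = -np.sum(probs * np.log2(probs))  # bits spread across the vocab
token = int(np.argmax(probs))              # the one symbol that survives

print(f"distribution entropy: {entropy:.2f} bits; emitted token id: {token}")
```

The downstream context sees only the token id, not the distribution and not the 64-dimensional hidden state behind it.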

kromem · 15 days ago
Latent space reasoners are a thing, and honestly we're probably already seeing emergent latent space reasoning end up embedded in the weights as new models train on extensive synthetic reasoning traces.

If Othello-GPT can build a board in latent space given just the moves, can an exponentially larger transformer build a reasoner in its latent space given a significant number of traces?
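The Othello-GPT result rests on linear probing: if a board feature is linearly encoded in the hidden states, a small linear probe can read it back out. A toy stand-in using synthetic "hidden states" (nothing here touches a real model; the sizes and the planted feature direction are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend each 32-dim "hidden state" linearly encodes whether one
# board square is occupied, along a hidden feature direction.
n, d = 500, 32
direction = rng.normal(size=d)              # hypothetical feature direction
hidden_states = rng.normal(size=(n, d))
occupied = (hidden_states @ direction > 0).astype(int)  # ground-truth labels

# Fit a linear probe by least squares on a train split...
X_train, y_train = hidden_states[:400], occupied[:400]
X_test, y_test = hidden_states[400:], occupied[400:]
w, *_ = np.linalg.lstsq(X_train, y_train * 2 - 1, rcond=None)

# ...and check it generalizes: high held-out accuracy means the
# feature is linearly readable from the latent space.
accuracy = ((X_test @ w > 0).astype(int) == y_test).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

In the real experiments the probe is trained on activations from an actual transformer; the point of the sketch is only the methodology.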

kromem commented on Sycophancy in GPT-4o   openai.com/index/sycophan... · Posted by u/dsr12
clysm · 4 months ago
Absolute bull.

The writing style is exactly the same between the “prompt” and “response”. It's faked.

kromem · 4 months ago
The response is 1,000% written by 4o. Very clear tells, and in line with many other samples from the past few days.
kromem commented on OpenAI is building a social network?   theverge.com/openai/64813... · Posted by u/noleary
lukev · 4 months ago
This kind of news should be a death-knell for OpenAI.

If you've built your value on promising imminent AGI then this sort of thing is purely a distraction, and you wouldn't even be considering it... unless you knew you weren't about to shortly offer AGI.

kromem · 4 months ago
Don't underestimate the importance of multi-user human/AI interactions.

Right now OAI's synthetic data pipeline is very heavily weighted to 1-on-1 conversations.

But models are being deployed into multi-user spaces that OAI doesn't have access to.

If you look at where their products are headed right now, this is very much the right move.

Expect it to be TikTok style media formats.

kromem commented on Was the historical Jesus talking about evolution? (You might be surprised)   lesswrong.com/posts/FuAcX... · Posted by u/kromem
kromem · 5 months ago
This brings together thousands of hours of research over several years, and is a pretty fun and surprising topic, especially for any fellow fans of history.

And as unbelievable as you may think the title to be, I can pretty much guarantee you'll find it much more believable by the end of the post.

kromem commented on Restoring Faith: Crete's Ancient Minoan Civilisation (2009)   historytoday.com/archive/... · Posted by u/diodorus
kromem · 5 months ago
For throwing that much shade, it does a piss-poor job of actually backing up or citing the evidence.

Evans definitely had issues with how he went about things and his analysis. For example, the "snake goddess" is holding snakes remarkably similar to wooden snake props found in Egypt 300 years earlier.

But this article is pretty damn empty of actual substance.

kromem commented on Did the Particle Go Through the Two Slits, or Did the Wave Function?   profmattstrassler.com/202... · Posted by u/Tomte
HarHarVeryFunny · 5 months ago
It's always seemed to me that these types of questions only exist because we're considering a choice between two imperfect models. If we had a better model of what a "particle" really is, then there would be no dueling models and no paradox.

Do we really have to choose between wave and particle? What does the "particle" model bring to the table that a localized (wavelength-sized) wave/vibration could not?

kromem · 5 months ago
In video games that have procedural generation, there's often a seed function that defines a continuous geometry.

But in order to track state changes from free agents, when you get close to that geometry the engine converts it to discrete units.

This duality, a continuous foundation becoming discrete units around the point of observation/interaction, is not the result of dueling models but of a single unified system.

I sometimes wonder whether we'd struggle with interpreting QM the same way if the major interpretations didn't all predate the advances in information systems; there may be a paradigm blindness at work.
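The analogy can be sketched in a few lines (the seed, function names, and tile format are all invented for illustration): the world is a deterministic function of position and seed everywhere, stored nowhere, and only near the observer does the engine convert it into discrete, mutable tiles.

```python
import math

SEED = 42

def height(x: float) -> float:
    """Continuous, deterministic 'geometry': defined at every position,
    stored nowhere -- just a function of the coordinate and the seed."""
    return math.sin(x * 0.1 + SEED) + 0.5 * math.sin(x * 0.37 + SEED)

def materialize(observer_x: float, radius: int = 4) -> dict:
    """Near the observer, convert the continuous function into discrete
    tiles that the engine can track state changes against."""
    center = int(observer_x)
    return {i: round(height(i), 3)
            for i in range(center - radius, center + radius + 1)}

tiles = materialize(100.0)
# Far from the observer the world stays a continuous rule;
# up close it becomes discrete units.
```

Whether this is anything more than a suggestive analogy for QM is, of course, exactly the open question.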

kromem commented on Thoughts on a month with Devin   answer.ai/posts/2025-01-0... · Posted by u/swyx
huijzer · 7 months ago
I’ve used Cursor a lot and the conclusion doesn’t surprise me. I feel like I’m the one *forcing* the system in a certain direction, and sometimes the LLM gives a small snippet of useful code. Sometimes it goes in the wrong direction and I have to abort the suggestion and force it another way.

For me, the main benefit is having a typing assistant that can save me from typing one line here and there. Refactoring especially is where Cursor shines. Things like moving argument order around or adding/removing a parameter at function call sites are great; it has saved me a ton of typing and time already. I’m way more comfortable just quickly doing a refactoring when I see one.
kromem · 7 months ago
Weird. I have such a different experience with Cursor.

Most changes occur with a quick back and forth about top level choices in chat.

Followed by me grabbing the appropriate interfaces and files for context so Sonnet doesn't hallucinate APIs, and then code that I'll glance over and, around half the time, suggest one or more further changes to.

It's been successful enough that I'm currently thinking about how to adjust best practices to make that workflow even smoother, like better aggregating package interfaces into a single file for context, as well as keeping notes that encourage more verbose commenting in a file I can also provide as context on each generation.

Human-centric best practices aren't always the best fit, and it's finally good enough to start rethinking those for myself.
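One way the "aggregate package interfaces into a single file" idea could look, sketched with Python's `ast` module (the function names and output format here are my own assumptions, not Cursor features):

```python
import ast
from pathlib import Path

def extract_signatures(source: str) -> list[str]:
    """Return top-level function and class signatures, with bodies elided."""
    sigs = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            sigs.append(f"def {node.name}({ast.unparse(node.args)}): ...")
        elif isinstance(node, ast.ClassDef):
            sigs.append(f"class {node.name}: ...")
    return sigs

def aggregate_interfaces(package_dir: str) -> str:
    """Concatenate every module's signatures into one context blob
    suitable for pasting into a model's context window."""
    parts = []
    for path in sorted(Path(package_dir).rglob("*.py")):
        sigs = extract_signatures(path.read_text())
        if sigs:
            parts.append(f"# {path}\n" + "\n".join(sigs))
    return "\n\n".join(parts)
```

Run over a package directory, this yields a single signatures-only file that fits far more of the codebase into context than the raw sources would.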

kromem commented on Meta is killing off its AI-powered Instagram and Facebook profiles   theguardian.com/technolog... · Posted by u/n1b0m
kromem · 8 months ago
Having bots with their own profiles, authentically engaging as themselves, would have been pretty interesting (and, I suspect, successful).

But making up fake minority stereotype bingo cards may have been the worst idea I've ever seen in AI to date.

kromem commented on Things we learned about LLMs in 2024   simonwillison.net/2024/De... · Posted by u/simonw
antirez · 8 months ago
About "people still thinking LLMs are quite useless", I still believe that the problem is that most people are exposed to ChatGPT 4o that at this point for my use case (programming / design partner) is basically a useless toy. And I guess that in tech many folks try LLMs for the same use cases. Try Claude Sonnet 3.5 (not Haiku!) and tell me if, while still flawed, is not helpful.

But there is more: a key thing with LLMs is that their ability to help, as a tool, varies vastly based on your communication ability. The prompt is king: it makes those models 10x better than they are with a lazy one-liner question. Drop your files in the context window; ask very precise questions that explain the background. They work great for exploring what is at the borders of your knowledge. They are also great at doing boring tasks for which you can provide perfect guidance (but that would still take you hours). The best LLMs out there (in my case just Claude Sonnet 3.5, I must admit) are able to accelerate you.

kromem · 8 months ago
Both new Sonnet and Haiku have a masking overhead.

Using a few messages to get them out of "I aim to be direct" AI assistant mode gets much better overall results for the rest of the chat.

Haiku is actually incredibly good at high level systems thinking. Somehow when they moved to a smaller model the "human-like" parts fell away but the logical parts remained at a similar level.

Like if you were taking meeting notes from a business strategy meeting and wanted insights, use Haiku over Sonnet, and thank me later.

u/kromem

Karma: 2994 · Joined: February 19, 2014