igorkraw commented on I fed 24 years of my blog posts to a Markov model   susam.net/fed-24-years-of... · Posted by u/zdw
TomatoCo · 6 days ago
So learnable, in this context, rhymes with reverse-engineerable?
igorkraw · 6 days ago
Another term used is identifiable (although learnable and identifiable are not synonyms, I think identifiability is one precondition for learnability).

Identifiability means that out of all possible models, you can learn the correct one given enough samples. Causal identifiability has some other connotations.

See https://causalai.net/r80.pdf for a good start (a node in a causal graph is Markov given its parents, and a k-step Markov chain is a k-layer causal DAG)
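To make that concrete, here is a minimal sketch (my own toy example, not from the linked paper) of identifiability for a one-step Markov chain: with enough samples, the empirical transition frequencies recover the true transition matrix.

    import numpy as np

    rng = np.random.default_rng(0)

    # Ground-truth 1-step Markov chain over 3 states; row i of P is
    # P(next state | current state = i). This is the "correct model"
    # we hope to identify from samples alone.
    P_true = np.array([[0.8, 0.1, 0.1],
                       [0.2, 0.6, 0.2],
                       [0.3, 0.3, 0.4]])

    def sample_chain(P, n_steps, state=0):
        states = [state]
        for _ in range(n_steps):
            state = rng.choice(len(P), p=P[state])
            states.append(state)
        return states

    # Identifiability in action: empirical transition frequencies converge
    # to P_true as the sample size grows (each state is Markov given its
    # parent, the previous state, so the chain unrolls into a layered DAG).
    xs = sample_chain(P_true, 100_000)
    counts = np.zeros((3, 3))
    for a, b in zip(xs, xs[1:]):
        counts[a, b] += 1
    print(np.round(counts / counts.sum(axis=1, keepdims=True), 2))  # ~ P_true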

igorkraw commented on Just 0.001% hold 3 times the wealth of poorest half of humanity, report finds   theguardian.com/inequalit... · Posted by u/robtherobber
simianwords · 10 days ago
Rich people having too much wealth is not necessarily that bad a thing because most of the investment is in productive companies.

It’s not like they are using their wealth on frivolous consumption, which means redistribution would only change who controls the investment, not the actual consumption patterns of people. The implication is that poor people would consume the same as before after redistribution, with perhaps some extra assets.

So nothing materially changes other than some security. Poor people will continue to consume the same as before. The bigger problem is that it’s not so clear redistribution is necessarily a good thing, because I feel the people who made the money are more likely to make better decisions about their own companies.

I don’t know how companies would fare if for example Amazon were redistributed and run like some public company.

(Posting again)

igorkraw · 10 days ago
I see this argument often but for me it misses something.

The difference is about power. The wealth being this concentrated means the power is concentrated.

If people are okay with the idea of an ETF, or a wealth manager (or any type of fund manager/investment bank), then they should be okay with sovereign wealth funds/national ETFs that provide dividends with a guaranteed single-share-single-vote setup.

If you want competition, then the US government used to be good at creating and sustaining artificial competition in military procurement - similar to how Amazon lets teams compete on the same projects internally.

Because the competition would be artificial and enforced by law, there's just as much potential for massive efficiency gains as there is for corruption (the Norwegian national wealth fund has gone swimmingly for them)

igorkraw commented on What I don’t like about chains of thoughts (2023)   samsja.github.io/blogs/co... · Posted by u/jxmorris12
dhampi · 17 days ago
No, I still don’t understand the analogy.

All of this burn-in stuff is designed to get your Markov chain to forget where it started.

But I don’t want to get from “how many apples does Bob have?” to a state where Bob and the apples are forgotten. I want to remember that state, and I probably want to stay close to it — not far away in the “typical set” of all language.

Are you implicitly conditioning the probability distribution or otherwise somehow cutting the manifold down? Then the analogy would be plausible to me, but I don’t understand what conditioning we’re doing and how the LLM respects that.

Or are you claiming that we want to travel to the “closest” high probability region somehow? So we’re not really doing burn-in but something a little more delicate?

igorkraw · 17 days ago
You need to think about 1) the latent state and 2) the fact that part of the model is post-trained to bias the Markov chain towards abiding by the query, in the sense of the reward.

A way to look at it is that you effectively have two model "heads" inside the LLM: one which generates and one which biases/steers.

The MCMC is initialised from your prompt: the generator part samples from the language distribution it has learned, while the sharpening/filtering part biases towards stuff that would be likely to make this MCMC give high reward in the end. So the model regurgitates all the context that is deemed possibly relevant based on traces from the training data (including "tool use", which then injects additional context), and all those tokens shift the latent state into something that is more and more typical of your query.

Importantly, attention acts as a selector and has multiple heads, and these specialize, so (simplified) one head can maintain focus on your query and "judge" the latent state, while the rest follow that Markov chain until some subset of the generated and tool-injected tokens gives enough signal to the "answer now" gate that the model flips into "summarizing" mode, which then uses the latent state of all of those tokens to actually generate the answer.

So you very much can think of it as sampling repeatedly from an MCMC with a bias and a learned stopping rule, and then having a model create the best possible combination of the traces, except that all this machinery is encoded in the same model weights, which get to reuse features between one another, for all the benefits and drawbacks that yields.

There was a paper from around when o1 became a thing that showed that instead of doing CoT, you could just spend that token budget on K parallel shorter queries (by injecting something like "ok, to summarize" and "actually" to force completion) and pick the best one/majority vote. Since then RLHF has made longer traces more in distribution (although another paper showed that as of early 2025 you were trading reduced variance, peak performance, and coverage of edge cases for higher performance on common cases, although this might be ameliorated by now), but that's roughly how it broke down over 2024-2025.
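As a rough sketch of that parallel-sampling alternative (with a hypothetical `generate` function standing in for whatever sampling API you use; the paper's exact setup differed):

    from collections import Counter

    def generate(prompt: str, max_tokens: int) -> str:
        """Hypothetical stand-in for an LLM sampling call."""
        raise NotImplementedError

    def extract_answer(trace: str) -> str:
        # Naive extraction: take the last non-empty line of the rollout.
        return [line for line in trace.splitlines() if line.strip()][-1]

    def best_of_k(prompt: str, k: int = 8, budget: int = 2048) -> str:
        # Spend the token budget on k short rollouts instead of one long
        # CoT trace, then majority-vote over the extracted answers.
        traces = [generate(prompt, max_tokens=budget // k) for _ in range(k)]
        return Counter(extract_answer(t) for t in traces).most_common(1)[0][0]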

igorkraw commented on Reasoning LLMs are wandering solution explorers   arxiv.org/abs/2505.20296... · Posted by u/Surreal4434
igorkraw · 2 months ago
I'd encourage everyone to learn about Metropolis-Hastings Markov chain Monte Carlo and then squint at LLMs: think about what token-by-token generation of the long rollouts maps to in that framework, and consider that you can think of the stop token as a learned stopping criterion accepting (a substring of) the output.
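If MH is unfamiliar, here's a minimal loop (my own toy illustration, sampling a 1-D normal): squint and read "propose the next step, accept/reject, check a stopping criterion" as "sample the next token, keep or discard, emit the stop token".

    import numpy as np

    rng = np.random.default_rng(0)

    def target_logp(x):
        # Unnormalized log-density to sample from (here: a standard normal).
        return -0.5 * x**2

    def metropolis_hastings(x0, step=1.0, max_iters=10_000, stop=None):
        x, chain = x0, [x0]
        for _ in range(max_iters):
            proposal = x + step * rng.normal()           # propose a move
            log_alpha = target_logp(proposal) - target_logp(x)
            if rng.uniform() < np.exp(min(0.0, log_alpha)):  # accept/reject
                x = proposal
            chain.append(x)
            if stop is not None and stop(chain):         # stopping criterion:
                break                                    # cf. the stop token
        return chain

    # Hand-written toy stopping rule; in the LLM analogy this is learned.
    chain = metropolis_hastings(5.0, stop=lambda c: len(c) > 100 and abs(c[-1]) < 0.1)
    print(len(chain), chain[-1])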
igorkraw commented on GPT-5: Overdue, overhyped and underwhelming. And that's not the worst of it   garymarcus.substack.com/p... · Posted by u/kgwgk
kylehotchkiss · 4 months ago
Agreed, the hype cycles need vocal critics. The loudest voices talking about LLMs are the ones who financially benefit the most from it. I’m not anti-AI; I think the hype, and gaslighting the entire economy into believing this is the sole thing that is going to render them unemployed, is ridiculous (the economy is rough for a myriad of other reasons, most of which originate from our country's choice of leadership)

Hopefully the innovation slowing down means that all the products I use will move past trying to duct-tape AI on and start working on actual features/bugs again

igorkraw · 4 months ago
I have a tiny tiny podcast with a friend where we try to break down which parts of the hype are bullshit (muck) and which kernels of truth are there, if any. It started partially as a place to scream into the void, partially to help the people who are anxious about AGI or otherwise being harmed by the hype. I think we have a long way to go in terms of presentation (breaking down very technical terms for an audience that is used to vague hype around "AI" is hard), but we cite our sources; maybe it'll be interesting for you to check out our shownotes

https://kairos.fm/muckraikers/

I personally struggle with Gary Marcus critiques, because whenever they are about "making AI work" they go into neurosymbolic "AI", which I have technical disagreements with, and I have _other_ arguments for the points he sometimes raises which I think are more rigorous, so it's difficult to be roughly in the same camp - but overall I'm happy someone with reach is calling BS as well.

igorkraw commented on Measuring the impact of AI on experienced open-source developer productivity   metr.org/blog/2025-07-10-... · Posted by u/dheerajvs
narush · 5 months ago
Yep, sorry, meant to post this somewhere but forgot in final-paper-polishing-sprint yesterday!

We'll be releasing anonymized data and some basic analysis code to replicate core results within the next few weeks (probably next week, depending).

Our GitHub is here (http://github.com/METR/) -- or you can follow us (https://x.com/metr_evals) and we'll probably tweet about it.

igorkraw · 5 months ago
Cool, thanks a lot. Btw, I have a very tiny tiny (50 to 100 listeners) podcast where we try to give context to what we call the "muck" of AI discourse (trying to ground claims in what we would call objectively observable facts/evidence, and then _separately_ giving our own biased takes). If you would be interested to come on it and chat => contact email in my profile.
igorkraw commented on Measuring the impact of AI on experienced open-source developer productivity   metr.org/blog/2025-07-10-... · Posted by u/dheerajvs
narush · 5 months ago
Hey HN, study author here. I'm a long-time HN user -- and I'll be in the comments today to answer questions/comments when possible!

If you're short on time, I'd recommend just reading the linked blogpost or the announcement thread here [1], rather than the full paper.

[1] https://x.com/METR_Evals/status/1943360399220388093

igorkraw · 5 months ago
Could you either release the dataset (raw but anonymized) for independent statistical evaluation, or at least add the absolute times of each dev per task to the paper? I'm curious what the absolute times of each dev with/without AI were, and whether the one guy with lots of Cursor experience was actually faster than the rest or just a slow typer getting a big boost out of LLMs

Also, cool work, very happy to see actually good evaluations instead of just vibes or observational studies that don't account for the Hawthorne effect

igorkraw commented on A Love Letter to People Who Believe in People   swiss-miss.com/2025/04/a-... · Posted by u/NaOH
HanClinto · 8 months ago
This is so needed. This was a very encouraging article.

"Being a fan is all about bringing the enthusiasm. It’s being a champion of possibility. It’s believing in someone. And it’s contagious. When you’re around someone who is super excited about something, it washes over you. It feels good. You can’t help but want to bring the enthusiasm, too."

Stands in contrast to the Hemingway quote: "Critics are men who watch a battle from a high place then come down and shoot the survivors."

It feels socially safe, easy, and destructive to be a critic.

I'd rather be a fan.

igorkraw · 8 months ago
I really believe in the importance of praising people and acknowledging their efforts when they are kind and good human beings, and (to a much lesser degree) their successes.

But, and I mean this without snark: what value is your praise for what is good if I cannot trust that you will be critical of what is bad? Note that critique can be unpleasant but kind, and I don't care for "brutal honesty" (which in most cases is much more about the brutality than the honesty).

But whether it's the joint Slavic-German culture or something else, I much prefer for things to be _appropriate_, _kind_ and _earnest_ instead of just supportive or positive. Real love is despite a flaw, in full cognizance of it, not ignoring it.

igorkraw commented on Don’t let an LLM make decisions or execute business logic   sgnt.ai/p/hell-out-of-llm... · Posted by u/petesergeant
thomassmith65 · 9 months ago
These articles (both positive and negative) are probably popular because it's impossible really to get a rich understanding of what LLMs can do.

So readers want someone to tell them some easy answer.

I have as much experience using these chatbots as anyone, and I still wouldn’t claim to know what they are useless at and what they are great at.

One moment, an LLM will struggle to write a simple state machine. The next, it will write a web app that physically models a snare drum.

Considering the popularity of research papers trying to suss out how these chatbots work, nobody - nobody in 2025, at least - should claim to understand them well.

igorkraw · 9 months ago
What is your definition of "understand them well"?
igorkraw commented on “Vibe Coding” vs. Reality   cendyne.dev/posts/2025-03... · Posted by u/birdculture
MarkMarine · 9 months ago
This is the Go community saying a computer will never best human Go players.

We already have examples of a model finding more performant sorts [0]. Given the right incentives and time, and the right system for optimizing (LLMs trained on “average code” probably aren’t it), the computer will best us at creating things for the computer.

Is “vibe coding” real today? Not in my experience, even with Claude Code. My hand has to be firmly on the tiller, using my experience and skill to correct its mistakes and guide it. But I can see the current trajectory of improvement, and I’m sure it’ll get there.

[0] https://deepmind.google/discover/blog/alphadev-discovers-fas...

igorkraw · 9 months ago
Check the actual paper for the type of sorts it actually got a speedup on :-) (hint: a few percentage points on larger n, similar to what PGO might find; the big speedup is for n around 8 or so, where it basically enumerated and found a sorting network)
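For reference, this is what a sorting network at that scale looks like; below is the standard optimal 5-comparator network for n=4 (an illustration, not AlphaDev's exact output): a fixed sequence of compare-exchange operations with no data-dependent branching.

    def sort4(a):
        # A sorting network is a fixed sequence of compare-exchange ops
        # with no data-dependent branching, which is what makes it so
        # fast for small, fixed n.
        for i, j in [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]:
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
        return a

    assert sort4([3, 1, 4, 2]) == [1, 2, 3, 4]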

u/igorkraw

Karma: 1767 · Cake day: January 26, 2018

About: hackernews ( a ) krawczuk (point ) eu