> Taking longer than usual. Trying again shortly (attempt 1 of 10)
> ...
> Taking longer than usual. Trying again shortly (attempt 10 of 10)
> Due to unexpected capacity constraints, Claude is unable to respond to your message. Please try again soon.
I guess I'll have to wait until later to feel the fear...
Because LLMs make it that much faster to develop software, any potential advantage you may get from adopting a very niche language is overshadowed by the fact that you can't use it with an LLM. This makes it that much harder for your new language to gain traction. If your new language doesn't gain enough traction, it'll never end up in LLM datasets, so programmers are never going to pick it up.
I feel as though "facts" such as this are presented to me all the time on HN, but in my every day job I encounter devs creating piles of slop that even the most die-hard AI enthusiasts in my office can't stand and have started to push against.
I know, I know "they just don't know how to use LLMs the right way!!!", but all of the better engineers I know, the ones capable of quickly assessing the output of an LLM, tend to use LLMs much more sparingly in their code. Meanwhile the ones that never really understood software that well in the first place are the ones building agent-based Rube Goldberg machines that ultimately slow everyone down
If we keep living in this AI hallucination for 5 more years, I think the only people capable of producing anything of use or value will be devs who continued to devote some of their free time to coding in languages like Gleam, and continued to maintain and sharpen their ability to understand and reason about code.
I pointed out that it would be far more cost-effective to simply let us request hard copies of whatever books we wanted, and then they would just stay in the library. No one worked remotely at the time.
We ended up getting Safari subscriptions for everyone.
And, as far as expenses go for a research institute, $4k/mo is very inexpensive.
LLMs are designed for text completion, and the chat interface is basically a fine-tuning hack that recasts prompting as a natural form of text completion so the average user gets a more "intuitive" interface (I don't even want to think about how many AI "enthusiasts" don't really understand this).
But with open/local models in particular: each instruct/chat interface is slightly different. There are tools that help mitigate this, but the more closely you're working with the model, the more likely you are to make a stupid mistake because you didn't understand some detail of how the instruct interface was fine-tuned.
Once you accept that LLMs are "auto-complete on steroids", you can get much better results by programming the way they were naturally designed to work. It also helps a lot with prompt engineering, because you can more easily understand what the model's natural tendency is and work with that to generally get better results.
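To make the "fine-tuning hack" point concrete, here's a minimal sketch using the Hugging Face `transformers` tokenizer API (the model name is just an illustrative local instruct model, and the exact template text differs per model family):

```python
# Sketch: the same request as raw completion vs. wrapped in the chat template
# the model was actually fine-tuned on. The model name below is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# 1) Raw completion: the base behavior -- the model simply continues the string.
raw_prompt = "The capital of France is"

# 2) Chat: the identical question wrapped in the instruct template.
#    Getting this wrapping wrong is exactly the "stupid mistake" described above.
messages = [{"role": "user", "content": "What is the capital of France?"}]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

print(repr(raw_prompt))
print(repr(chat_prompt))  # e.g. '<s>[INST] What is the capital of France? [/INST]' for this model family
```

Both strings go through the same next-token machinery; the chat template is just the text the fine-tune taught the model to expect.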
It's funny, because a good chunk of my comments on HN these days are combating AI hype, but man, LLMs really are fascinating to work with if you approach them with a somewhat clearer-headed perspective.
I don't see the equivalence to MCMC. It's not like we have a complex probability function that we are trying to sample from using a chain.
It's just multinomial logistic regression (a softmax over the vocabulary) at each step.
Every response from an LLM is essentially the sampling of a Markov chain.
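To spell out what "sampling of a Markov chain" means here, a toy sketch (`fake_logits` is a made-up stand-in for a real model's forward pass, not anything from the comment): the state is the entire token prefix, and each decoding step is one transition drawn from a softmax over logits.

```python
# Toy sketch: autoregressive decoding as a Markov chain whose state is the
# full prefix. `fake_logits` is a hypothetical stand-in for an LLM forward pass.
import numpy as np

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]
rng = np.random.default_rng(0)

def fake_logits(prefix):
    # A real model computes these from the whole prefix; here we just nudge
    # the chain toward <eos> once the prefix gets long.
    logits = rng.normal(size=len(VOCAB))
    if len(prefix) > 4:
        logits[-1] += 3.0  # make "<eos>" more likely
    return logits

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

state = ["the"]                                    # current state of the chain
while state[-1] != "<eos>" and len(state) < 12:
    probs = softmax(fake_logits(state))            # P(next token | entire prefix)
    state = state + [rng.choice(VOCAB, p=probs)]   # one Markov transition
print(" ".join(state))
```

The "Markov" part only holds because the state is the whole prefix; conditioned on that, the next token depends on nothing else.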
But my opinions about what LLMs can do are based on... what LLMs can do. What I can see them doing. With my eyes.
The right answer to the question "What can LLMs do?" is... looking... at what LLMs can do.
You should be doubly skeptical ever since RLHF became standard, as the model has literally been optimized to give you the answers you find most pleasing.
The best way to measure of course is with evaluations, and I have done professional LLM model evaluation work for about 2 years. I've seen (and written) tons of evals and they both impress me and inform my skepticism about the limitations of LLMs. I've also seen countless times where people are convinced "with their eyes" they've found a prompt trick that improves the results, only to be shown that this doesn't pan out when run on a full eval suite.
As an aside: what's fascinating is that our visual system seems to be much more skeptical. A slightly-off eyeball created by a diffusion model will immediately set off alarms, whereas enough clever wordplay from an LLM will make us drop our guard.
If I'm using an MCMC algorithm to sample a probability distribution, I need to wait for my Markov chain to converge to a stationary distribution before sampling, sure.
But in no way is 'a good answer' a stationary state in the LLM Markov chain. If I continue running next-token prediction, I'm not going to start looping.
So for language, when I say "Bob has three apples, Jane gives him four and Judy takes two, how many apples does Bob have?", we're actually pretty far from the part of the linguistic manifold where the correct answer is likely to be. As the chain wanders this space it gets closer, until it finally, statistically, follows the path "the answer is..." and, when it's sampling from this path, it's in a much more likely neighborhood of the correct answer. That is, after wandering a bit, more and more of the possible paths are closer to where the actual answer lies than they would be if we had just forced the model to choose early.
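For readers who haven't met the term: the "typical set" being invoked is the standard information-theoretic one (this formalization is my gloss, not the commenter's). For a sequence model with conditional factorization p, the ε-typical sequences of length n are those whose average per-token surprisal is close to the entropy rate H:

```latex
A_\epsilon^{(n)} \;=\; \Big\{ x_{1:n} \;:\; \Big| -\tfrac{1}{n}\,\log p(x_{1:n}) - H \Big| \le \epsilon \Big\},
\qquad
p(x_{1:n}) \;=\; \prod_{t=1}^{n} p\!\left(x_t \mid x_{<t}\right).
```

The warm-up analogy is that the early generated text plays the role of burn-in: it moves the conditional distribution toward this set before the answer tokens are sampled.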
edit: Michael Betancourt has a great introduction to HMC which covers warm-up and the typical set: https://arxiv.org/pdf/1701.02434 (he has a ton more content that dives much more deeply into the specifics)
Example: many people created websites without a clue how they really work. And got millions of people onto them. Or had crazy ideas for things to do with them.
At the same time, there are devs who know how the internals work but can't get a single user.
PC manufacturers were never able to even imagine what random people would end up doing with their PCs.
This is to say that even if you know the internals, you can claim you know better, but that doesn't mean it's absolute.
Sometimes knowing the fundamentals is a limitation. It will limit your imagination.
> “In the beginner’s mind there are many possibilities, but in the expert’s there are few”
Experts do tend to be limited in what they see as possible. But I don't think that allows carte blanche belief that a fancy Markov chain will let you transcend humanity. I would argue one of the key concepts of "beginner's mind" is not radical assurance of what's possible but unbounded curiosity and willingness to explore with an open mind. Right now we see this in the Stable Diffusion community: there are tons of people who also don't understand matrix multiplication doing incredible work through pure experimentation. There's a huge gap between "I wonder what will happen if I just mix these models together" and "we're just a few years from surrendering our will to AI". None of the people I'm concerned about have what I would consider an "open mind" about the topic of AI. They are sure of what they know, and to disagree is to invite complete rejection. Hardly a principle of beginner's mind.
Additionally:
> pc manufacturers never were able to even imagine what random people were able to do with their pc.
Belies a deep ignorance of the history of personal computing. Honestly, I don't think modern computing has, even now, returned to the ambition of what was being dreamt up, by experts, at Xerox PARC. The demos on the Xerox Alto in the early 1970s are still ambitious in some senses. And, as much as I'm not a huge fan of them, Gates and Jobs absolutely had grand visions for what the PC would be.
I have spent a lot of time experimenting with Chain of Thought professionally, and I have yet to see any evidence to suggest that what's happening with CoT is any more (or less) than this. If you let the model run a bit longer, it enters a region close to the typical set, and when it's ready to answer you have a high probability of getting a good answer.
There's absolutely no "reasoning" going on here, except that sometimes sampling from the typical set near the region of your answer is going to look very similar to how humans reason before coming up with an answer.
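As an illustration of that claim (the prompt wording below is mine, not the commenter's): the only mechanical difference between direct answering and CoT is whether the chain is allowed to emit tokens before committing to the answer. Feed either string to any completion-style LLM client.

```python
# Illustration only: the same question with and without room to "wander"
# before answering.
question = (
    "Bob has three apples, Jane gives him four and Judy takes two. "
    "How many apples does Bob have?"
)

direct_prompt = question + "\nThe answer is:"           # forced to commit immediately
cot_prompt = question + "\nLet's think step by step."   # allowed a warm-up before answering

print(direct_prompt)
print(cot_prompt)
```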