What's the value of a secret benchmark to anyone but the secret holder? Does your niche benchmark even influence which model you use for unrelated queries? If LLM authors care enough about your niche (they don't) and fake the response somehow, you will learn on the very next query that something is amiss. Now that query is your secret benchmark.
Even for niche topics it's rare that I need to provide more than 1 correction or knowledge update.
Now, personally, I would like to have sane defaults, where I can toggle stuff on and off, but we all know which way the wind blows in this case.
Now JWST is near L2, but it is still in sunlight. It's solar-powered. There's a series of radiating layers to keep heat away from the sensitive instruments. Then there are the solar panels themselves.
Obviously an orbital data center wouldn't need such extreme cooling, but the key takeaway for me is that the solar panels themselves would shield much of the satellite from direct sunlight, by design.
Absent any external heating, the only heating is from the computer chips. Any body in space will radiate away heat; you can make some bodies radiate more effectively than others by increasing surface area per unit mass (I assume). Someone else mentioned thermoses as evidence of insulation. There's some truth to that, but interestingly most of the heat lost from a thermos is via the same IR radiation a satellite would emit.
So in terms of power density you're looking at about three orders of magnitude difference. Heating and cooling are going to be a significant part of the total weight.
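To put the radiative-cooling point in rough numbers, here's a back-of-envelope sketch using the Stefan-Boltzmann law. The 1 MW heat load, 300 K radiator temperature, and emissivity are illustrative assumptions, not figures from the thread:

```python
# Back-of-envelope: radiator area needed to reject waste heat purely by
# IR radiation (Stefan-Boltzmann), assuming an idealized flat radiator
# that sees only deep space. All numbers are illustrative assumptions.

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiator_area(power_w, temp_k, emissivity=0.9, two_sided=True):
    """Area (m^2) needed to radiate `power_w` watts at surface temperature `temp_k`."""
    sides = 2 if two_sided else 1
    return power_w / (sides * emissivity * SIGMA * temp_k ** 4)

# Example: 1 MW of chip waste heat with the radiator held at 300 K
print(radiator_area(1_000_000, 300))  # ~1200 m^2 (two-sided, emissivity 0.9)
```

Even under these generous assumptions the radiator area runs to four figures of square meters per megawatt, which is why cooling hardware ends up being a big fraction of the mass budget.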
They made an effort to improve the product, but because everything in tech comes with side effects, it turned out to be a bad decision, which they rolled back. That sounds like highly professional behavior by people doing their best. Not everything will work out 100% of the time.
And this might finally reverse the trend of games being >100 GB, as other teams will be able to point to this decision as a reason not to implement this particular optimization prematurely.
I don't think this is an issue inherent to the technology. Duplicate-code detectors have been around for ages. Give an AI agent a tool which calls one, ask it to reduce duplication, and it will start refactoring.
Of course, there is a risk of going too far in the other direction: refactorings which technically reduce duplication but have unacceptable costs (you can be too DRY). But some possible solutions: (a) ask it to judge whether the refactoring is worth it; if it judges no, just ignore the duplication and move on; (b) get a human to review the decision in (a); (c) if the AI repeatedly makes the wrong decision (according to the human), try prompt engineering, or maybe even just some hardcoded heuristics.
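For concreteness, here is a minimal sketch of the kind of duplicate-detection tool one might hand to an agent. The line-window hashing, window size, and normalization are arbitrary choices for illustration; real detectors (token- or AST-based) are far more robust:

```python
# Toy duplicate-block detector: hashes sliding windows of normalized source
# lines and reports any block of `window` lines that appears more than once.

from collections import defaultdict

def find_duplicate_blocks(source: str, window: int = 6):
    lines = [ln.strip() for ln in source.splitlines()]
    seen = defaultdict(list)  # block text -> list of starting line numbers
    for i in range(len(lines) - window + 1):
        block = "\n".join(lines[i:i + window])
        if block.strip():  # skip windows that are entirely blank
            seen[block].append(i + 1)
    return {blk: locs for blk, locs in seen.items() if len(locs) > 1}

# An agent given this as a tool could be prompted to run it, then judge for
# each reported duplicate whether refactoring is actually worth the cost.
```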
I have not seen evidence that there will be food system collapse driven by climate change that would be worse than those events, but my ears are open if you have some.
Do we have science that demonstrates humans don't autoregressively emit words? (Genuinely curious / uninformed).
From the outset, it's not obvious that auto-regression through the state space of actions (i.e. what LLMs do when yeeting tokens) is the difference they have from humans. Though I guess we can distinguish LLMs from other models like diffusion/HRM/TRM that explicitly refine their output rather than commit to a choice and then run `continue;`.
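Schematically, the distinction being drawn is just a difference in the decoding loop. This is not real model code; `sample_next` and `refine` are hypothetical stand-ins:

```python
# Autoregressive decoding commits to each token and moves on; refinement-style
# models (diffusion/HRM/TRM-like) revisit the whole output on every pass.

def autoregressive_decode(prompt, steps, sample_next):
    out = list(prompt)
    for _ in range(steps):
        out.append(sample_next(out))  # commit to a choice, then `continue;`
    return out

def iterative_refine(draft, steps, refine):
    out = list(draft)
    for _ in range(steps):
        out = refine(out)  # revisit and edit the whole sequence each pass
    return out
```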