vlovich123 commented on Speed up responses with fast mode   code.claude.com/docs/en/f... · Posted by u/surprisetalk
kingstnap · 3 days ago
It comes from batching and multiple streams on a GPU. More people sharing 1 GPU makes everyone run slower but increases overall token throughput.

Mathematically, it comes from the fact that the transformer block is a parallel algorithm. If you batch harder, increase parallelism, you can get higher tokens/s. But you get less throughput. Simultaneously there is also this dial that you can speculatively decode harder with fewer users.

It’s true for basically all hardware and most models. You can draw a Pareto curve of throughput per GPU vs. tokens per second per stream: more tokens/s per stream means less total throughput.
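A toy latency model makes the tradeoff concrete. The constants below are entirely made up (not from the SemiAnalysis data): each decode step is modeled as a fixed overhead plus a per-request cost, so larger batches slow every stream down while raising aggregate throughput.

```python
# Toy model of batched decoding (hypothetical numbers, not real GPU data):
# each decode step costs a fixed overhead plus a per-request amount, so
# bigger batches lower per-stream tokens/s but raise per-GPU tokens/s.
def step_time_ms(batch_size: int) -> float:
    return 10.0 + 0.5 * batch_size  # made-up constants

for batch_size in (1, 8, 64):
    per_stream = 1000.0 / step_time_ms(batch_size)  # tokens/s per user
    per_gpu = per_stream * batch_size               # tokens/s per GPU
    print(f"batch={batch_size:3d}  per-stream={per_stream:6.1f}  per-GPU={per_gpu:7.1f}")
```

Sweeping the batch size traces out exactly the kind of Pareto curve described above.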

See this graph for actual numbers:

[Graph: Token Throughput per GPU vs. Interactivity. gpt-oss 120B, FP4, 1K/8K. Source: SemiAnalysis InferenceMAX™]

https://inferencemax.semianalysis.com/

vlovich123 · 3 days ago
> If you batch harder, increase parallelism, you can get higher tokens/s. But you get less throughput. Simultaneously there is also this dial that you can speculatively decode harder with fewer users.

I think you skipped the word “total” in “total throughput” there, right? Since tok/s is also a measure of throughput, it’s clearer to say you increase throughput/user at the expense of throughput/GPU.

I’m not sure about the speculative decoding point, though. I haven’t served a frontier model, but my understanding is that speculative decoding generally doesn’t help beyond a few draft tokens, so I’m not sure you can “speculatively decode harder” with fewer users.

vlovich123 commented on A century of hair samples proves leaded gas ban worked   arstechnica.com/science/2... · Posted by u/jnord
tokyobreakfast · 3 days ago
This is a myth. While tailpipe emissions are lower, evaporative emissions are higher. At best it's a draw.

You're seeing less smog because people are driving modern cars with modern emission systems, because we live in the future and smog-producing vehicles have been taken out of service; you're drawing conclusions from mere correlation of the two. It has nothing to do with ethanol.

vlovich123 · 3 days ago
Sounds like someone who’s been fed a line and has believed it.

A) why do you think car companies started to need to develop more modern emission systems in the first place? That’s right: California, a huge car market, started creating and enforcing standards through the introduction of CARB. Prior to this, car companies had no incentive and weren’t doing this.

B) there’s more to smog than just cars. CARB tackled emissions across multiple industries.

C) the average car lasts a long time. The reason the fleet modernized is that CARB made owning and operating older vehicles impractical or impossible.

D) population and vehicle miles driven kept growing, so per-unit emissions needed to shrink faster than that growth, and they did. Thanks to CARB.

Is ethanol the primary reason we don’t have smog now? No, but the problem was so bad that CARB took a comprehensive approach, tackling it from many angles. And importantly, they succeeded. It’s quite a silly position to take that “this problem would have solved itself”. It’s the twin of the fatalist position that “this problem is too big and complicated to solve”.

vlovich123 commented on Microsoft open-sources LiteBox, a security-focused library OS   github.com/microsoft/lite... · Posted by u/aktau
sharts · 3 days ago
Don’t the best of the best typically work on OS fundamentals though?
vlovich123 · 3 days ago
OS is such a broad term, especially when applied to Windows, which is closer to a Linux distro than to a bare kernel. Is it the kernel? Windows is fine there, as by all accounts the issues are higher up. They’ve had some problems with their update process, which is surprising: historically that team would have been populated by the better engineers. Most of the other problems have been in the shell and UI, where good engineering discipline is not quite as expected.
vlovich123 commented on A century of hair samples proves leaded gas ban worked   arstechnica.com/science/2... · Posted by u/jnord
tokyobreakfast · 3 days ago
We already have trad-gas.

It's called ethanol-free and people gladly pay a premium for it.

It's far better for your engine, it's what the car manufacturers use to determine the gas mileage, and Californians can only dream of having it.

vlovich123 · 3 days ago
Ethanol-free fuel in an engine designed for an ethanol blend can result in incomplete combustion, leaving deposits in fuel injectors and on valves. The car companies don’t care about this when determining gas mileage.

Californians, I think, still remember the smog of the 90s in LA that pushed us to make our air pollution standards the highest in the nation. Moving away from that sounds more like a nightmare than a dream for most Californians.

vlovich123 commented on LLMs could be, but shouldn't be compilers   alperenkeles.com/posts/ll... · Posted by u/alpaylan
9rx · 4 days ago
> LLMs are not deterministic

They are designed to be when temperature = 0. Some hardware configurations are known to defy that assumption, but when running on perfect hardware they most definitely are.

What you call compilers are also nondeterministic on 'faulty' hardware, so...

vlovich123 · 4 days ago
Even with nonzero temperature, LLMs should be deterministic given a fixed seed and a batch size of 1. Of course, a batch size of 1 is not economical.
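A minimal sketch of the point, using a toy sampler rather than a real model (the logits and temperature below are made up): nonzero temperature just means sampling from a probability distribution, and sampling with an explicitly seeded RNG is fully reproducible.

```python
import math
import random

# Toy next-token sampler (not a real LLM): softmax over made-up logits
# at temperature 0.8, drawn from an explicitly seeded RNG.
def sample_tokens(seed: int, n: int = 5, temperature: float = 0.8):
    logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical vocabulary of 4 tokens
    weights = [math.exp(l / temperature) for l in logits]
    rng = random.Random(seed)
    return [rng.choices(range(len(logits)), weights=weights)[0] for _ in range(n)]

# Same seed, same temperature -> identical token sequence every run.
assert sample_tokens(42) == sample_tokens(42)
```

In real serving, the nondeterminism tends to come from batching and kernel scheduling effects, not from the sampling step itself.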
vlovich123 commented on LLMs could be, but shouldn't be compilers   alperenkeles.com/posts/ll... · Posted by u/alpaylan
bee_rider · 4 days ago
Are conventional compilers actually deterministic, with all the bells and whistles enabled? PGO seems like it ought to have a random element.
vlovich123 · 4 days ago
PGO doesn’t introduce a random element. Modulo bugs, the same set of inputs to a compiler (the PGO profile included) is guaranteed to produce the same output bit for bit, which is the definition of determinism.

There are even efforts to guarantee this for many packages on Linux (reproducible builds). It’s a core security property, because it lets you verify that the compilation process or environment wasn’t tampered with illicitly by rebuilding from scratch and comparing.

Now, actually managing to fix all inputs and get deterministic output can be challenging, but that’s less to do with the compiler and more to do with the challenge of completely controlling the entire environment: the profile you are using for PGO, isolating build-machine paths that get injected into the binary, and programs whose source or build system is itself nondeterministic (e.g. incorporating the build time into the binary).
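As a small analogy, with Python’s bytecode compiler standing in for a real toolchain (this is an illustration, not how reproducible-builds tooling actually works): determinism means identical inputs hash to bit-identical outputs, which is exactly what you check when verifying a build.

```python
import hashlib

# Analogy only: Python's built-in `compile` standing in for a compiler.
# Compiling the same source twice should yield bit-identical bytecode;
# hashing both "builds" and comparing digests verifies determinism.
src = "def double(x):\n    return x * 2\n"
build_a = compile(src, "<module>", "exec").co_code
build_b = compile(src, "<module>", "exec").co_code

digest_a = hashlib.sha256(build_a).hexdigest()
digest_b = hashlib.sha256(build_b).hexdigest()
assert digest_a == digest_b  # same inputs, bit-for-bit same output
```

A real reproducible build does the same comparison on whole artifacts, after pinning every input (toolchain version, flags, paths, timestamps).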

vlovich123 commented on Top downloaded skill in ClawHub contains malware   1password.com/blog/from-m... · Posted by u/pelario
mattstir · 5 days ago
This just seems like the logical consequence of the chosen system to be honest. "Skills" as a concept are much too broad and much too free-form to have any chance of being secure. Security has also been obviously secondary in the OpenClaw saga so far, with users just giving it full permissions to their entire machine and hoping for the best. Hopefully some of this will rekindle ideas that are decades old at this point (you know, considering security and having permission levels and so forth), but I honestly have my doubts.
vlovich123 · 5 days ago
I think the truth is we don’t know what to do here. The whole point of an ideal AI agent is to do anything you tell it to - permissions and sandboxing would negate that. I think the uncomfortable truth is as an industry we don’t actually know what to do other than say “don’t use AI” or “well it’s your fault for giving it too many permissions”. My hunch is that it’ll become an arms race with AI trying to find malware developed by humans/AI and humans/AI trying to develop malware that’s not detectable.

Sandboxing and permissions may help some, but when you have self modifying code that the user is trying to get to impersonate them, it’s a new challenge existing mechanisms have not seen before. Additionally, users don’t even know the consequences of an action. Hell, even curated and non curated app stores have security and malware difficulties. Pretending it’s a solved problem with existing solutions doesn’t help us move forward.

vlovich123 commented on CIA to Sunset the World Factbook   abc.net.au/news/2026-02-0... · Posted by u/kshahkshah
nanna · 5 days ago
Bit confused, what's this to do with the CIA World Factbook?
vlovich123 · 5 days ago
> this book has little difference between total words and distinct words because it has so many distinct numbers in it. It ended up being a regular stress test to make sure our approach to capping memory use was working
vlovich123 commented on A case study in PDF forensics: The Epstein PDFs   pdfa.org/a-case-study-in-... · Posted by u/DuffJohnson
mikkupikku · 6 days ago
I know I'm not the brightest bulb by any measure, but do some people really take less than at least a few minutes to come up with one-liners for problems as novel as graphical transformations to PDFs? Maybe if the presumed techie hacker / federal worker took it as an amusing challenge I could see this being done, but genuinely out of pure laziness? That's incredible if true.
vlovich123 · 5 days ago
It’s a mix of “they’ve done it many times before” and, these days, AI. But remember, “they’ve done it many times before” just means that in a technical and popular forum you’re likely to find the handful of people who have done it regularly enough to remember the one-liner. It’s also probably easily searchable, so even before AI it wasn’t super hard.
vlovich123 commented on FBI couldn't get into WaPo reporter's iPhone because Lockdown Mode enabled   404media.co/fbi-couldnt-g... · Posted by u/robin_reala
stouset · 6 days ago
Absolutely every aspect of it?

What’s so hard about adding a feature that effectively makes a single-user device multi-user? Which needs the ability to have plausible deniability for the existence of those other users? Which means that significant amounts of otherwise usable space need to be inaccessibly set aside for those other users on every device, to retain plausible deniability, despite an insignificant fraction of customers using such a feature?

What could be hard about that?

vlovich123 · 6 days ago
iPhone and macOS are technically basically the same product. The reason the iPhone is a single-user product is UX decisions and business/product philosophy, not technical constraints.

While plausible deniability may be hard to develop, it’s not some particularly arcane thing. The primary reason against it is the political balancing act Apple has to perform (remember San Bernardino and the trouble the US government tried to create for Apple?). The secondary reason is the cost to develop vs. the addressable market, but they did introduce Lockdown Mode, so it’s not unprecedented for Apple to improve security for those particularly sensitive to such issues.
