You're seeing less smog because people are driving modern cars with modern emission systems (we live in the future) and because smog-producing vehicles have been taken out of service, and you're drawing conclusions from the mere correlation of the two. It has nothing to do with ethanol.
A) why do you think car companies needed to develop more modern emission systems to begin with? That’s right - California, a huge car market, started creating and enforcing standards through the introduction of CARB. Prior to this, car companies had no incentive and weren’t doing it
B) there’s more to smog than just cars. CARB tackled emissions across multiple industries.
C) the average car lasts too long for fleet turnover alone to have fixed this. Cars modernized because CARB made owning and operating older vehicles impractical/impossible.
D) population and vehicle miles driven kept growing, so per-unit emissions needed to shrink faster than that growth, and they did. Thanks to CARB.
Is ethanol the primary reason we don’t have smog now? No, but the problem was so bad that CARB took a comprehensive approach, tackling it from many angles. And importantly, they succeeded. It’s quite a silly position to take that “this problem would have solved itself”. It’s the twin of the fatalist position that “this problem is too big and complicated to solve”
Mathematically it comes from the fact that the transformer block is a parallel algorithm. If you batch harder and increase parallelism, you can get higher tokens/s, but you get less throughput. Simultaneously, there is also a dial where you can speculatively decode harder when you have fewer users.
It’s true for basically all hardware and most models. You can draw a Pareto curve of throughput per GPU vs. tokens per second per stream: more tokens/s per stream, less total throughput.
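To make the tradeoff concrete, here’s a toy roofline-style model of decode (every number is made up for illustration, not a measurement of any particular GPU or model): each step streams the weights plus the batch’s KV cache from memory and does the batch’s matmul work, so growing the batch raises total tokens/s per GPU while stretching the time each individual stream waits per token.

```python
# Toy roofline-style model of the batching dial during transformer decode.
# Every number below is hypothetical and only for illustration.

def step_time_s(batch_size: int,
                weight_bytes: float = 60e9,        # hypothetical weight footprint
                kv_bytes_per_stream: float = 1e9,  # hypothetical KV cache per stream
                hbm_bw_bytes_s: float = 3e12,      # hypothetical memory bandwidth
                flops_per_token: float = 2e11,     # hypothetical FLOPs per decoded token
                peak_flops_s: float = 1e15) -> float:
    """Time for one decode step across the whole batch: the step must stream
    the weights plus every stream's KV cache and do the batch's compute;
    it is limited by whichever is slower."""
    memory_time = (weight_bytes + batch_size * kv_bytes_per_stream) / hbm_bw_bytes_s
    compute_time = batch_size * flops_per_token / peak_flops_s
    return max(memory_time, compute_time)

for batch in (1, 8, 32, 128, 512):
    t = step_time_s(batch)
    per_stream = 1.0 / t   # tokens/s each user sees
    per_gpu = batch / t    # total tokens/s the GPU produces
    print(f"batch={batch:4d}  tok/s per stream={per_stream:7.1f}  tok/s per GPU={per_gpu:9.1f}")
```

Sweeping the batch size in this toy model traces out the same kind of Pareto curve: per-GPU throughput climbs while per-stream tokens/s falls.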
See this graph for actual numbers:
[Chart: Token Throughput per GPU vs. Interactivity — gpt-oss 120B, FP4, 1K / 8K. Source: SemiAnalysis InferenceMAX™]
https://inferencemax.semianalysis.com/
I think you skipped the word “total throughput” there, right? Cause tok/s is a measure of throughput, so it’s clearer to say you increase throughput/user at the expense of throughput/GPU.
I’m not sure about the comment on speculative decode, though. I haven’t served a frontier model, but I believe speculative decoding generally doesn’t help beyond a few draft tokens, so I’m not sure you can “speculatively decode harder” with fewer users.
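For what it’s worth, the usual back-of-envelope for why longer drafts hit diminishing returns looks like this. It’s a rough sketch assuming each drafted token is accepted independently with probability alpha, which is a simplification of the analysis in the speculative sampling papers; real acceptance behaviour depends on the models and the prompt.

```python
# Expected tokens emitted per target-model forward pass when drafting k tokens,
# assuming each drafted token is accepted independently with probability alpha.
# Simplified model: acceptance stops at the first rejection, and the target
# model always contributes one more token per pass.

def expected_tokens_per_pass(alpha: float, k: int) -> float:
    """Geometric-series expectation: 1 + alpha + alpha^2 + ... + alpha^k."""
    if alpha >= 1.0:
        return float(k + 1)
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)

for alpha in (0.6, 0.8, 0.9):
    row = ", ".join(f"k={k}: {expected_tokens_per_pass(alpha, k):.2f}"
                    for k in (1, 2, 4, 8, 16))
    print(f"alpha={alpha}: {row}")
```

Under these assumptions the expectation saturates at 1/(1 - alpha), so past a handful of draft tokens the extra speculation mostly burns draft compute, which matches the “doesn’t help beyond a few tokens” intuition; whether spare capacity from having fewer users changes that picture is a separate question about where the wasted draft work lands.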