Readit News
troelsSteegin commented on The Department of War just shot the accountants and opted for speed   steveblank.com/2025/11/11... · Posted by u/ridruejo
troelsSteegin · a month ago
A big assumption with this change is that the "Modular Open Systems Approach" (MOSA) [0] [1] will be adequate for integrating new systems developed and acquired under this "fast track". MOSA appears to be about six years old as a mandate [2] and is something that big contractors - SAIC, BAI, Palantir [3] - talk about. But six years seems brand new in this sector. I'd be curious to see whether LLMs have leverage for MOSA software system integrations.

[0] https://breakingdefense.com/tag/modular-open-systems-archite...

[1] https://www.dsp.dla.mil/Programs/MOSA/

[2] https://www.govinfo.gov/app/details/USCODE-2016-title10/USCO...

[3] https://blog.palantir.com/implementing-mosa-with-software-de...

troelsSteegin commented on The Smol Training Playbook: The Secrets to Building World-Class LLMs   huggingface.co/spaces/Hug... · Posted by u/kashifr
lewtun · 2 months ago
Hi, Lewis here (one of the co-authors). Happy to answer any questions people have about the book :)
troelsSteegin · a month ago
This was a good read. I was struck by the quantity of nuanced, applied know-how it took to build SmolLM3. I am curious about the rough cost to engineer and train SmolLM3: ~400 GPUs for at least a month and, based on the set of book co-authors, 12 engineers for at least three months. Is $3-5M a fair ballpark? The complement is how much experience, on average, the team members had doing ML and LLM training at scale before SmolLM3. The book is "up" on recent research, so I surmise a PhD-centric team, each member with multiple systems built. This is not a commodity skill. What the book suggests to me is that an LLM applications startup would do best to focus on understanding the scope and know-how required to start from post-training.
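The ballpark above can be sanity-checked with back-of-envelope arithmetic. The rates here are assumptions, not figures from the book: roughly $2.50/GPU-hour for H100-class rentals and roughly $50k per fully loaded engineer-month.

```python
# Back-of-envelope cost sketch; all rates are assumed, not sourced.
gpus = 400
train_days = 30               # "at least a month"
gpu_hour_rate = 2.50          # assumed $/GPU-hour, H100-class rental
engineers = 12
engineer_months = 3
engineer_month_cost = 50_000  # assumed fully loaded $/engineer-month

compute = gpus * train_days * 24 * gpu_hour_rate
labor = engineers * engineer_months * engineer_month_cost
print(f"compute ≈ ${compute:,.0f}, labor ≈ ${labor:,.0f}, "
      f"total ≈ ${compute + labor:,.0f}")
```

That lands around $2.5M for a single clean run; ablations, failed runs, and evaluation overhead would plausibly push the true figure into the $3-5M range asked about.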
troelsSteegin commented on Sylvia Plath's fig tree meets machine learning   dontlognow.substack.com/p... · Posted by u/batkin
troelsSteegin · 3 months ago
One could look at feed-forward decision trees as representing the idea that preferences are latent and immutable, and that the optimal branch is the truest expression of innate preferences. And one could look at backpropagation as adjusting preferences to accommodate situational constraints -- or as learning to want what is good for you, where "what is good for you" is defined by some external or imposed metric. Tragically, Plath was unable to "backpropagate". Was attention all she needed?
troelsSteegin commented on Language models pack billions of concepts into 12k dimensions   nickyoder.com/johnson-lin... · Posted by u/lawrenceyan
mallowdram · 3 months ago
Space embedding based on arbitrary points never resolves to specifics. Particularly downstream. Words are arbitrary, we remained lazy at an unusually vague level of signaling because arbitrary signals provide vast advantages for the sender and controller of the signal. Arbitrary signals are essentially primate dominance tools. They are uniquely one-way. CS never considered this. It has no ability to subtract that dark matter of arbitrary primate dominance that's embedded in the code. Where is this in embedded space?

LLMs are designed for Western concepts of attributes, not holistic, or Eastern. There's not one shred of interdependence, each prediction is decontextualized, the attempt to reorganize by correction only slightly contextualizes. It's the object/individual illusion in arbitrary words that's meaningless. Anyone studying Gentner, Nisbett, Halliday can take a look at how LLMs use language to see how vacant they are. This list proves this. LLMs are the equivalent of circus act using language.

"Let's consider what we mean by "concepts" in an embedding space. Language models don't deal with perfectly orthogonal relationships – real-world concepts exhibit varying degrees of similarity and difference. Consider these examples of words chosen at random:
"Archery" shares some semantic space with "precision" and "sport"
"Fire" overlaps with both "heat" and "passion"
"Gelatinous" relates to physical properties and food textures
"Southern-ness" encompasses culture, geography, and dialect
"Basketball" connects to both athletics and geometry
"Green" spans color perception and environmental consciousness
"Altruistic" links moral philosophy with behavioral patterns"
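The "shares semantic space" claims in the quoted passage are usually operationalized as cosine similarity between embedding vectors. A minimal sketch, with made-up 4-d toy vectors (real model embeddings live in thousands of dimensions, e.g. the ~12k discussed in the article):

```python
import math

# Toy "embeddings" invented for illustration; the axes loosely
# stand for (heat, passion, precision, sport). Not real model output.
vecs = {
    "fire":    [0.9, 0.8, 0.1, 0.0],
    "heat":    [1.0, 0.2, 0.0, 0.1],
    "archery": [0.0, 0.1, 0.9, 0.8],
}

def cosine(a, b):
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# "Fire" overlaps with "heat" far more than with "archery":
print(cosine(vecs["fire"], vecs["heat"]))     # high
print(cosine(vecs["fire"], vecs["archery"]))  # low
```

Partial, non-orthogonal overlap like this (similarity strictly between 0 and 1) is exactly the "varying degrees of similarity" the quote describes.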

troelsSteegin · 3 months ago
> Arbitrary signals are essentially primate dominance tools.

What should I read to better understand this claim?

> LLMs are the equivalent of circles act using language.

Circled apes?

troelsSteegin commented on White House Orders NASA to Destroy Important Satellite   futurism.com/white-house-... · Posted by u/BoredPositron
troelsSteegin · 4 months ago
They should at least try to sell it. Fine line, though, between leverage and extortion.
troelsSteegin commented on An Algorithm for a Better Bookshelf   cacm.acm.org/news/an-algo... · Posted by u/pseudolus
jasonthorsness · 6 months ago
"Their new algorithm adapts to an adversary’s strategy, but on time scales that it picks randomly"

"Even though many real-world data settings are not adversarial, situations without an adversary can still sometimes involve sudden floods of data to targeted spots, she noted."

This is pretty neat. I bet this will find practical applications.

troelsSteegin · 6 months ago
Are "adversaries" broadly used in algorithm design? I've not seen that before. I'm used to edge cases and trying to break things, but an "adversary", especially white box, seems different.
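Adversarial analysis is in fact standard in the randomized-algorithms literature. A classic white-box illustration (a generic sketch, not the bookshelf algorithm from the article) is quicksort: an adversary who knows the pivot rule can force worst-case behavior, and random pivots take that power away.

```python
import random

def quicksort_comparisons(a, randomized=False, seed=0):
    """Count comparisons made by a toy quicksort.

    With a fixed first-element pivot, a white-box adversary who knows
    the rule can feed already-sorted input and force n*(n-1)/2
    comparisons; choosing the pivot at random defeats that adversary.
    """
    rng = random.Random(seed)
    comps = 0

    def qs(lst):
        nonlocal comps
        if len(lst) <= 1:
            return lst
        i = rng.randrange(len(lst)) if randomized else 0
        pivot, rest = lst[i], lst[:i] + lst[i + 1:]
        comps += len(rest)  # one comparison per element vs. the pivot
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return qs(left) + [pivot] + qs(right)

    qs(list(a))
    return comps

adversarial = list(range(200))  # sorted input: worst case for pivot=lst[0]
print(quicksort_comparisons(adversarial))                   # 19900 = 200*199/2
print(quicksort_comparisons(adversarial, randomized=True))  # far fewer
```

The randomized variant guarantees expected O(n log n) comparisons on *any* input, which is the same spirit as the article's algorithm picking its adaptation time scales randomly so an adversary cannot anticipate them.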
troelsSteegin commented on A Framework for Characterizing Emergent Conflict Between Non-Coordinating Agents [pdf]   paperclipmaximizer.ai/Una... · Posted by u/ycombiredd
troelsSteegin · 6 months ago
I enjoyed a fast read of this and will re-read it. Following from the idea of making system-scoped information available to individual agents, I think a general model needs to include some sense of the information-processing capacity of agents - like an attention or analysis budget. That includes the ability to recognize higher-order signal as first-order relevant. Information asymmetries are not so much about information as about interpretation.
troelsSteegin commented on Teaching National Security Policy with AI   steveblank.com/2025/06/10... · Posted by u/enescakir
troelsSteegin · 6 months ago
What's missing from this is the "before and after": how this quarter's class experience differed from previous quarters without the AI-tool emphasis.

u/troelsSteegin

Karma: 753 · Cake day: February 12, 2017