alach11 commented on 65% of Hacker News posts have negative sentiment, and they outperform   philippdubach.com/standal... · Posted by u/7777777phil
skerit · a month ago
Complaining is easy. And even when you complain and someone replies with another perspective, that isn't necessarily seen as a rebuke.

But posting something positive and getting slammed in the comments? That's depressing. So the barrier to posting something positive seems higher.

alach11 · a month ago
I believe Nat Friedman said "pessimists sound smart, optimists make money." It's certainly much easier to give a snarky/negative take and shoot an idea down than think creatively about how to make it work. Also, negative people are perceived as smarter!

https://www.sciencedirect.com/science/article/abs/pii/002210...

alach11 commented on A prediction market user made $436k betting on Maduro's downfall   bbc.com/news/articles/cx2... · Posted by u/tartoran
alach11 · a month ago
I have to imagine governments are closely monitoring prediction markets as part of their intelligence apparatus. But then you just add another layer of subterfuge. Imagine a D-Day prediction market... "Will the Allies Land in Normandy, Pas-de-Calais, or somewhere else?" The US might buy a major position on Pas-de-Calais the night before as a decoy!
alach11 commented on Gemini 3 Flash: Frontier intelligence built for speed   blog.google/products/gemi... · Posted by u/meetpateltech
alach11 · 2 months ago
I really wish these models were available via AWS or Azure. I understand strategically that this might not make sense for Google, but at a non-software-focused F500 company it would sure make it a lot easier to use Gemini.
alach11 commented on Advent of Code 2025   adventofcode.com/2025/abo... · Posted by u/vismit2000
saberience · 2 months ago
Agreed. There is no "beginner" or amateur programmer who could complete even part of a single Advent of Code problem.
alach11 · 2 months ago
Usually the first day or two are readily solvable in Excel with just regular spreadsheet formulas.
alach11 commented on Claude Opus 4.5   anthropic.com/news/claude... · Posted by u/adocomplete
jumploops · 3 months ago
> Pricing is now $5/$25 per million [input/output] tokens

So it’s 1/3 the price of Opus 4.1…

> [..] matches Sonnet 4.5’s best score on SWE-bench Verified, but uses 76% fewer output tokens

…and potentially uses a lot fewer tokens?

Excited to stress test this in Claude Code, looks like a great model on paper!

alach11 · 3 months ago
This is the biggest news of the announcement. Prior Opus models were strong, but the cost was a big limiter of usage. This price point still makes it a "premium" option, but isn't prohibitive.

Also, it's increasingly important to look at token usage rather than just token cost. They say Opus 4.5 (with high reasoning) used 50% fewer tokens than Sonnet 4.5. So you get a higher score on SWE-bench Verified, you pay more per token, but you use fewer tokens and overall pay less!
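
A rough worked example of the per-token vs. per-task arithmetic, as a sketch only. The $25 output price comes from the quote above; the $15 Sonnet 4.5 output price and the 1M-token workload are assumptions made for illustration, and input tokens are ignored.

```python
# Prices in USD per million output tokens.
OPUS_45_OUTPUT_PRICE = 25.0    # from the pricing quoted in the thread
SONNET_45_OUTPUT_PRICE = 15.0  # assumption, not stated in the thread


def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


def break_even_token_ratio(cheap_price: float, premium_price: float) -> float:
    """Fraction of the cheaper model's tokens at which both models cost the same."""
    return cheap_price / premium_price


# Hypothetical workload: Sonnet 4.5 needs 1M output tokens,
# Opus 4.5 needs 50% fewer for the same work.
sonnet_tokens = 1_000_000
opus_tokens = sonnet_tokens // 2

sonnet_cost = output_cost(sonnet_tokens, SONNET_45_OUTPUT_PRICE)
opus_cost = output_cost(opus_tokens, OPUS_45_OUTPUT_PRICE)
ratio = break_even_token_ratio(SONNET_45_OUTPUT_PRICE, OPUS_45_OUTPUT_PRICE)

print(f"Sonnet 4.5: ${sonnet_cost:.2f}  Opus 4.5: ${opus_cost:.2f}")
print(f"Opus is cheaper whenever it uses under {ratio:.0%} of Sonnet's tokens")
```

Under these assumed prices, the more expensive model comes out ahead on output cost as long as it cuts token usage by more than 40%.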

alach11 commented on Olmo 3: Charting a path through the model flow to lead open-source AI   allenai.org/blog/olmo3... · Posted by u/mseri
nickreese · 3 months ago
I'm just now moving my main workflows off OpenAI over to local models, and I'm starting to find that these smaller models' main failure mode is that they will accept edge cases with the goal of being helpful.

Especially in extraction tasks. This appears as inventing data or rationalizing around clear roadblocks.

My biggest hack so far is giving them an out named "edge_case" and telling them it is REALLY helpful if they identify edge cases. Simply renaming "fail_closed" or "dead_end" options to "edge_case" with helpful wording causes qwen models to adhere to their prompting more.
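
A minimal sketch of what such an "edge_case" out might look like in an extraction prompt. Only the "edge_case" name comes from the comment above. The prompt wording, the JSON reply shape, and the helper names `build_extraction_prompt` and `parse_reply` are hypothetical, and no particular model API is assumed.

```python
import json

ALLOWED_STATUSES = ("ok", "edge_case")


def build_extraction_prompt(document: str, fields: list[str]) -> str:
    """Build a prompt that frames flagging an edge case as the helpful answer."""
    field_list = ", ".join(fields)
    return (
        f"Extract these fields from the document: {field_list}.\n"
        'Reply with JSON: {"status": "ok", "data": {...}}.\n'
        "If a field is missing or ambiguous, it is REALLY helpful to reply "
        'with {"status": "edge_case", "reason": "..."} instead of guessing.\n\n'
        f"Document:\n{document}"
    )


def parse_reply(reply: str, fields: list[str]) -> dict:
    """Parse a model reply; anything malformed is treated as an edge case."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return {"status": "edge_case", "reason": "reply was not valid JSON"}
    if not isinstance(data, dict) or data.get("status") not in ALLOWED_STATUSES:
        return {"status": "edge_case", "reason": "missing or unknown status"}
    if data["status"] == "ok":
        values = data.get("data")
        if not isinstance(values, dict):
            return {"status": "edge_case", "reason": "no data object in reply"}
        missing = [f for f in fields if f not in values]
        if missing:
            return {"status": "edge_case", "reason": f"missing fields: {missing}"}
    return data


fields = ["invoice_number", "total"]
prompt = build_extraction_prompt("Invoice A-17, total due: 40 USD", fields)
good = parse_reply('{"status": "ok", "data": {"invoice_number": "A-17", "total": "40"}}', fields)
bad = parse_reply("Sure! The invoice number is probably A-17.", fields)
```

The parser fails toward "edge_case" so that an invented or partial answer never passes as a clean extraction.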

It feels like there are hundreds of these small hacks that people must have discovered... why isn't there a centralized place where people are recording these learnings?

alach11 · 3 months ago
Just curious - are you using Open WebUI or LibreChat as a local frontend, or are all your workflows just calling the models directly without a UI?
alach11 commented on Gemini 3   blog.google/products/gemi... · Posted by u/preek
alach11 · 3 months ago
This is a really impressive release. It's probably the biggest lead we've seen from a model since the release of GPT-4. Seems likely that OpenAI rushed out GPT-5.1 to beat the Gemini 3 release, knowing that their model would underperform it.
alach11 commented on Disrupting the first reported AI-orchestrated cyber espionage campaign   anthropic.com/news/disrup... · Posted by u/koakuma-chan
sodality2 · 3 months ago
I don’t think these agents are doing anything a dedicated human couldn’t do, only enabling it at scale. Relying on “not being one of few they focus on” as security is just security through obscurity. You were living on borrowed time anyway.
alach11 · 3 months ago
"Quantity has a quality all its own". It's categorically different to be able to do harm cheaply at scale vs. doing it at great cost/effort.
alach11 commented on You should write an agent   fly.io/blog/everyone-writ... · Posted by u/tabletcorry
GoatInGrey · 3 months ago
So AI companies are profitable when you ignore some of the things they have to spend money on to operate?

Snark aside, inference is still being done at a loss. Anthropic, the most profitable AI vendor, is operating at a roughly -140% margin. xAI is the worst at somewhere around -3,600% margin.

alach11 · 3 months ago
Can you cite your source for inference being at a loss? This disagrees with most of what I've read.

u/alach11

Karma: 1595 · Cake day: March 13, 2018
About
Reservoir engineer / data scientist / product manager at an oil & gas company in Houston, TX. You can contact me at <my Hacker News username>@gmail.com.