alach11 (u/alach11) - Readit News

alach11 commented on 65% of Hacker News posts have negative sentiment, and they outperform philippdubach.com/standal... · Posted by u/7777777phil

skerit · a month ago

Complaining is easy. And even when you complain and someone comments to give another perspective, that is not necessarily seen as a rebuke.

But posting something positive and getting slammed in the comments? That's depressing. So the barrier to posting something positive seems higher.

alach11 · a month ago

I believe Nat Friedman said "pessimists sound smart, optimists make money." It's certainly much easier to give a snarky/negative take and shoot an idea down than think creatively about how to make it work. Also, negative people are perceived as smarter!

https://www.sciencedirect.com/science/article/abs/pii/002210...

alach11 commented on A prediction market user made $436k betting on Maduro's downfall bbc.com/news/articles/cx2... · Posted by u/tartoran

alach11 · a month ago

I have to imagine governments are closely monitoring prediction markets as part of their intelligence apparatus. But then you just add another layer of subterfuge. Imagine a D-Day prediction market... "Will the Allies Land in Normandy, Pas-de-Calais, or somewhere else?" The US might buy a major position on Pas-de-Calais the night before as a decoy!

alach11 commented on Gemini 3 Flash: Frontier intelligence built for speed blog.google/products/gemi... · Posted by u/meetpateltech

alach11 · 2 months ago

I really wish these models were available via AWS or Azure. I understand strategically that this might not make sense for Google, but at a non-software-focused F500 company it would sure make it a lot easier to use Gemini.

alach11 commented on Advent of Code 2025 adventofcode.com/2025/abo... · Posted by u/vismit2000

saberience · 2 months ago

Agreed. There is no "beginner" or amateur programmer who could complete even part of a single Advent of Code problem.

alach11 · 2 months ago

Usually the first day or two are readily solvable in Excel with just regular spreadsheet formulas.

alach11 commented on Claude Opus 4.5 anthropic.com/news/claude... · Posted by u/adocomplete

jumploops · 3 months ago

> Pricing is now $5/$25 per million [input/output] tokens

So it’s 1/3 the price of Opus 4.1…

> [..] matches Sonnet 4.5’s best score on SWE-bench Verified, but uses 76% fewer output tokens

…and potentially uses a lot fewer tokens?

Excited to stress test this in Claude Code, looks like a great model on paper!

alach11 · 3 months ago

This is the biggest news of the announcement. Prior Opus models were strong, but the cost was a big limiter of usage. This price point still makes it a "premium" option, but isn't prohibitive.

Also increasingly it's becoming important to look at token usage rather than just token cost. They say Opus 4.5 (with high reasoning) used 50% fewer tokens than Sonnet 4.5. So you get a higher score on SWE-bench verified, you pay more per token, but you use fewer tokens and overall pay less!
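To make the arithmetic concrete, here is a rough sketch of the output-token cost comparison. The Opus 4.5 output price ($25 per million) comes from the quoted announcement; the Sonnet 4.5 output price ($15 per million) is an assumption not stated in this thread, and the 1M-token workload is hypothetical. Input tokens are ignored for simplicity.

```python
# Prices in USD per million output tokens.
OPUS_45_OUTPUT_PRICE = 25.0    # from the quoted "$5/$25" pricing
SONNET_45_OUTPUT_PRICE = 15.0  # assumption: not given in this thread


def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload: Sonnet 4.5 needs 1M output tokens for a task,
# and Opus 4.5 needs 50% fewer tokens for the same task.
sonnet_tokens = 1_000_000
opus_tokens = int(sonnet_tokens * (1 - 0.50))

sonnet_cost = output_cost(sonnet_tokens, SONNET_45_OUTPUT_PRICE)
opus_cost = output_cost(opus_tokens, OPUS_45_OUTPUT_PRICE)

# Token reduction at which both models cost the same on output.
break_even_reduction = 1 - SONNET_45_OUTPUT_PRICE / OPUS_45_OUTPUT_PRICE

print(f"Sonnet 4.5: ${sonnet_cost:.2f}, Opus 4.5: ${opus_cost:.2f}")
print(f"Opus is cheaper once it uses >{break_even_reduction:.0%} fewer tokens")
```

Under these assumptions, the higher per-token price only pays off once the token reduction passes the break-even point, so the 50% figure clears it but a smaller reduction might not.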

Posted by u/alach11 3 months ago

Evals drive the next chapter in AI for businesses openai.com/index/evals-dr...

alach11 commented on Olmo 3: Charting a path through the model flow to lead open-source AI allenai.org/blog/olmo3... · Posted by u/mseri

nickreese · 3 months ago

I'm just now moving my main workflows off OpenAI over to local models, and I'm starting to find that these smaller models' main failure mode is that they will accept edge cases with the goal of being helpful.

Especially in extraction tasks. This appears as inventing data or rationalizing around clear roadblocks.

My biggest hack so far is giving them an out named "edge_case" and telling them it is REALLY helpful if they identify edge cases. Simply renaming "fail_closed" or "dead_end" options to "edge_case" with helpful wording causes qwen models to adhere to their prompting more.
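A minimal sketch of what such an "edge_case" out might look like in an extraction prompt. Everything here is hypothetical (the invoice task, the JSON shape, and the helper names `build_prompt` and `parse_reply`); it is not the commenter's actual setup, and no model is called.

```python
import json

# Statuses the model may return; "edge_case" is the friendly-sounding out.
ALLOWED_STATUSES = ("ok", "edge_case")


def build_prompt(document: str) -> str:
    """Build an extraction prompt that frames the escape hatch as helpful."""
    return (
        "Extract the invoice total from the text below and reply with JSON "
        'of the form {"status": "ok" or "edge_case", "value": ..., "note": ...}.\n'
        "It is REALLY helpful if you identify edge cases: if the text is "
        'ambiguous or the value is missing, use status "edge_case" and '
        "explain why in note instead of guessing.\n\n"
        f"Text:\n{document}"
    )


def parse_reply(reply: str) -> dict:
    """Parse the model's JSON reply and reject unknown statuses."""
    data = json.loads(reply)
    if data.get("status") not in ALLOWED_STATUSES:
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    return data


# Example with a hand-written reply standing in for model output.
reply = '{"status": "edge_case", "value": null, "note": "no total found"}'
result = parse_reply(reply)
print(result["status"], "-", result["note"])
```

The idea is that downstream code can route "edge_case" results to review rather than trusting an invented value.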

It feels like there are hundreds of these small hacks that people must have discovered... why isn't there a centralized place where people are recording these learnings?

alach11 · 3 months ago

Just curious - are you using Open WebUI or Librechat as a local frontend or are all your workflows just calling the models directly without UI?

alach11 commented on Gemini 3 blog.google/products/gemi... · Posted by u/preek

alach11 · 3 months ago

This is a really impressive release. It's probably the biggest lead we've seen from a model since the release of GPT-4. Seems likely that OpenAI rushed out GPT-5.1 to beat the Gemini 3 release, knowing that their model would underperform it.

alach11 commented on Disrupting the first reported AI-orchestrated cyber espionage campaign anthropic.com/news/disrup... · Posted by u/koakuma-chan

sodality2 · 3 months ago

I don’t think these agents are doing anything a dedicated human couldn’t do, only enabling it at scale. Relying on “not being one of the few they focus on” as security is just security through obscurity. You were living on borrowed time anyway.

alach11 · 3 months ago

"Quantity has a quality all its own". It's categorically different to be able to do harm cheaply at scale vs. doing it at great cost/effort.

alach11 commented on You should write an agent fly.io/blog/everyone-writ... · Posted by u/tabletcorry

GoatInGrey · 3 months ago

So AI companies are profitable when you ignore some of the things they have to spend money on to operate?

Snark aside, inference is still being done at a loss. Anthropic, the most profitable AI vendor, is operating at a roughly -140% margin. xAI is the worst at somewhere around -3,600% margin.

alach11 · 3 months ago

Can you cite your source for inference being at a loss? This disagrees with most of what I've read.

u/alach11

Karma: 1595 · Cake day: March 13, 2018

About

Reservoir engineer / data scientist / product manager at an oil & gas company in Houston, TX. You can contact me at <my Hacker News username>@gmail.com.
