So... practically no one? My experience has been that almost everyone testing these cutting-edge AI tools as they come out is more interested in new-tool shininess than in safety or security.
Will this still be an exit event for employees or do they get screwed here?
It feels pretty intuitive to me that an LLM's ability to break a complex problem down into smaller, more easily solvable pieces will unlock the next level of complexity.
This pattern feels like a technique often taught to junior engineers: how to break up a multi-week project into bite-sized tasks. This model is obviously math-focused, but I see no reason why this wouldn't be incredibly powerful for code-based problem solving.
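Roughly this loop, sketched in Python. `ask_llm` is a made-up stand-in for any chat-completion call, stubbed here so the script actually runs:

```python
# Minimal sketch of LLM task decomposition: plan, solve each step, combine.
# `ask_llm` is a hypothetical placeholder, not any real API.
def ask_llm(prompt: str) -> str:
    return f"(model response to: {prompt[:40]}...)"

def solve(problem: str) -> str:
    # Step 1: have the model break the problem into small subtasks.
    plan = ask_llm(f"Break this problem into 3-5 small, independent steps:\n{problem}")
    subtasks = [line for line in plan.splitlines() if line.strip()]

    # Step 2: solve each subtask on its own, carrying prior results forward.
    results = []
    for task in subtasks:
        context = "\n".join(results)
        results.append(ask_llm(f"Given progress so far:\n{context}\nSolve this step: {task}"))

    # Step 3: have the model merge the partial answers into one.
    return ask_llm("Combine these partial results into a final answer:\n" + "\n".join(results))

print(solve("Refactor a multi-week migration into shippable increments"))
```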
1. Google is already in antitrust trouble related to ads, so AI is probably in the clear.
2. If they are thought to be violating a law, they will get something like a $10,000,000 fine and pay it; that's still less money than they will make from harvesting data.
"Already in trouble for committing monopolist behavior in market A, Google should be fine committing even more monopolist behavior in the very related and overlapping market of B"
This claim makes pretty little sense to me. AI search and Google web search (ads) are already stepping on each other. I see no reason Google wouldn't be worried about antitrust on AI search if they're worried about antitrust action in general, which they clearly are.
An unsuccessful project might be unsuccessful because it got eaten by costs before it became successful.
A wildly successful project is risky to migrate.
> This is concerning because it suggests that, should an AI system find hacks, bugs, or shortcuts in a task, we wouldn’t be able to rely on their Chain-of-Thought to check whether they’re cheating or genuinely completing the task at hand.
As a non-expert in this field, I fail to see why an RL model taking advantage of its reward is "concerning". My understanding is that the only difference between a good model and a reward-hacking model is whether the end behavior aligns with human preference or not.
The article's TL;DR reads to me as "We trained the model to behave badly, and it then behaved badly." I don't know if I'm missing something, or if calling this concerning might be a little bit sensationalist.
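To make the distinction I mean concrete, here's a toy sketch (all names and numbers are made up, nothing here is from the article) of a proxy reward diverging from the true objective:

```python
import random

# Toy reward-hacking illustration. The *proxy* reward counts green checkmarks;
# the *true* objective is whether the task was actually solved.

def honest_policy():
    solved = random.random() < 0.6   # genuinely solves ~60% of tasks
    checkmarks = 1 if solved else 0  # checkmark tracks real success
    return checkmarks, solved

def hacking_policy():
    solved = False                   # never solves the task...
    checkmarks = 1                   # ...but always games the checker
    return checkmarks, solved

for name, policy in [("honest", honest_policy), ("hacking", hacking_policy)]:
    proxy = true = 0
    for _ in range(1000):
        c, s = policy()
        proxy += c
        true += s
    print(f"{name}: proxy reward={proxy}, true successes={true}")

# The optimizer only ever sees the proxy column, so it ranks the hacking
# policy above the honest one. Whether that gap is "concerning" or just
# expected optimizer behavior is exactly the question.
```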
I don't believe OP's thesis is properly backed by the rest of his tweet, which seems to boil down to "LLMs can't properly cite links."
If LLMs performing poorly on an arbitrary small-scoped test case makes you bearish on the whole field, I don't think that falls on the LLMs.
> Standard pricing now applies across the full 1M window for both models, with no long-context premium. Media limits expand to 600 images or PDF pages.
For Claude Code users this is huge, assuming coherence remains strong past 200k tokens.