mscbuck commented on What the heck is going on at Apple?   cnn.com/2025/12/06/tech/a... · Posted by u/methuselah_in
keyle · 15 days ago
Personally I prefer a useless Siri that I can turn off, rather than a copilot into everything I cannot turn off.

I am perfectly fine with Apple lagging behind in "AI".

mscbuck · 14 days ago
I still feel like they are in an incredible position when it comes to AI because of their hardware integration/advantage across all of their devices. I think they see immense value in getting things on-device and not having to rely on any of these other companies.
mscbuck commented on R packages for data science   tidyverse.org/... · Posted by u/cl3misch
svoit · 19 days ago
The Tidyverse is solid. I sometimes wish I used R more in industry because of how good it is.

IMO, R is kind of a syntactic Frankenstein otherwise.

Tidymodels also exists: https://www.tidymodels.org/

mscbuck · 19 days ago
From what I saw in the latest "language" surveys or whatever, R does seem to be making a slight comeback. I was actually surprised at its place above Ruby, iirc. Again, not that these surveys are the end-all-be-all, but I've also started to see a lot more data science postings that list R or Python as a requirement, where I feel like for a few years it was ALL Python.
mscbuck commented on Claude Opus 4.5   anthropic.com/news/claude... · Posted by u/adocomplete
stavros · a month ago
Did anyone else notice Sonnet 4.5 being much dumber recently? I tried it today and it was really struggling with some very simple CSS on a 100-line self-contained HTML page. This never used to happen before, and now I'm wondering if this release has something to do with it.

On-topic, I love the fact that Opus is now three times cheaper. I hope it's available in Claude Code with the Pro subscription.

EDIT: Apparently it's not available in Claude Code with the Pro subscription, but you can add funds to your Claude wallet and use Opus with pay-as-you-go. This is going to be really nice to use Opus for planning and Sonnet for implementation with the Pro subscription.

However, I noticed that the previously-there option of "use Opus for planning and Sonnet for implementation" isn't there in Claude Code with this setup any more. Hopefully they'll implement it soon, as that would be the best of both worlds.

EDIT 2: Apparently you can use `/model opusplan` to get Opus in planning mode. However, it says "Uses your extra balance", and it's not clear whether it means it uses the balance just in planning mode, or also in execution mode. I don't want it to use my balance when I've got a subscription, I'll have to try it and see.

EDIT 3: It looks like Sonnet also consumes credits in this mode. I had it make some simple CSS changes to a single HTML file with Opusplan, and it cost me $0.95 (way too much, in my opinion). I'll try manually switching between Opus for the plan and regular Sonnet for the next test.

mscbuck · a month ago
Yes, I've absolutely noticed this. I feel like I can always tell when something is up when it starts trying to do WAY more things than normal. Like I can give it a few functions and ask for some updates, and it just goes through like 6 rounds of thinking, creating 6 new files, assuming that I want to write changes to a database, etc.
mscbuck commented on Show HN: I scraped 3B Goodreads reviews to train a better recommendation model   book.sv... · Posted by u/costco
mscbuck · 2 months ago
Awesome site and speed!

My advice as someone who has built recommendation systems: Now comes the hard part! It seems like a lot of the feedback here is that it's operating pretty heavily like a content-based system, which is fine. But this is where you can probably start evaluating on other metrics like serendipity, novelty, etc. One of the best things I did for recommender systems in production was having different ones for different purposes, then aggregating them together into a final ranking. Have a heavy content-based one to keep people in the rabbit hole. Have a heavy graph-based one to try and traverse and find new stuff. Have one that is heavily tuned on a specific metric for a specific purpose. Hell, throw in a pure TF-IDF/BM25/Splade-based one.

The real trick of rec systems is that people want to be recommended things differently. Having multiple systems that you can weight differently per user is one way to achieve that; usually one algorithm can't quite do that effectively.
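
To make the "multiple systems, weighted per user" idea concrete, here's a minimal sketch. Everything in it is an assumption for illustration: the recommender names, the book ids, the weights, and the linear rank scoring are all made up, and a production system would likely learn the weights rather than hard-code them.

```python
from collections import defaultdict


def blend_recommendations(ranked_lists, weights, top_n=5):
    """Blend ranked lists from several recommenders into one list.

    ranked_lists: recommender name -> list of item ids, best first.
    weights: recommender name -> weight for this particular user
             (hypothetical values, not from any real system).
    """
    scores = defaultdict(float)
    for name, items in ranked_lists.items():
        weight = weights.get(name, 0.0)
        n = len(items)
        for position, item in enumerate(items):
            # Linear rank score: the top item gets 1.0, the last gets 1/n.
            scores[item] += weight * (n - position) / n
    # Highest score first; ties broken by item id so output is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical output of three separate recommenders for the same seed book.
ranked_lists = {
    "content": ["dune", "foundation", "hyperion"],
    "graph": ["piranesi", "dune", "circe"],
    "bm25": ["foundation", "dune", "neuromancer"],
}

# Same candidate lists, different per-user weights.
rabbit_hole_user = {"content": 0.7, "graph": 0.1, "bm25": 0.2}
explorer_user = {"content": 0.1, "graph": 0.8, "bm25": 0.1}

rabbit_hole_top = blend_recommendations(ranked_lists, rabbit_hole_user)
explorer_top = blend_recommendations(ranked_lists, explorer_user)
print(rabbit_hole_top)
print(explorer_top)
```

With these made-up numbers, the content-heavy user gets "dune" first while the graph-heavy user gets "piranesi" first, even though both are fed by the same three underlying recommenders.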

mscbuck commented on Show HN: Erdos – open-source, AI data science IDE   lotas.ai/erdos... · Posted by u/jorgeoguerra
Centigonal · 2 months ago
This is a good idea, although IMO source control, compute, and MLOps integration are bigger but less flashy pain points for data scientists than AI in notebooks.

If you're going to market Erdos as open source, then IMO there should be a github link somewhere on your website.

mscbuck · 2 months ago
Will echo that one thing that would prevent me from trying this is def the source control. Otherwise it does look pretty slick!
mscbuck commented on Claude Memory   anthropic.com/news/memory... · Posted by u/doppp
lukol · 2 months ago
Anybody else experiencing severe decline in Claude output quality since the introduction of "skills"?

Like Claude not being able to generate simple markdown text anymore and instead almost jumping into writing a script to produce a file of type X or Y - and then usually failing at that?

mscbuck · 2 months ago
I have also anecdotally noticed it starting to do things consistently that it never used to do. One thing in particular was that even while working on a project where it knows I use OpenAI/Claude/Grok interchangeably through their APIs for fallback reasons, and knew that for my particular purpose, OpenAI was the default, it started forcing Claude into EVERYTHING. That's not necessarily surprising to me, but it had honestly never been an issue when I presented code to it that was by default using GPT.
mscbuck commented on The AI bubble is 17 times the size of the dot-com frenzy, analyst says   marketwatch.com/story/the... · Posted by u/CharlesW
idkwhattocallme · 3 months ago
The bubble bursts when Apple announces it's doing good enough (private/secure) LLMs on device. At that point the capex on cloud infra starts to come into question and the dominos start to fall...
mscbuck · 3 months ago
As laughable as Apple's efforts have been so far, I think they still have an advantage precisely because of the unified architecture.
mscbuck commented on The RAG Obituary: Killed by agents, buried by context windows   nicolasbustamante.com/p/t... · Posted by u/nbstme
intalentive · 3 months ago
Agentic search with a handful of basic tools (drawn from BM25, semantic search, tags, SQL, knowledge graph, and a handful of custom retrieval functions) blows the lid off RAG in my experience. The downside is it takes longer. A single “investigation” can easily use 20-30 different function calls. RAG is like a static one-shot version of this and while the results are inferior the process is also a lot faster.
mscbuck · 3 months ago
I've found his hybrid approach pretty good for the majority of use cases. BM25 (maybe Splade if you want a blend of BOW/keyword) + vectors + RRF + re-rank works pretty damn well.

The trick that has elevated RAG, at least for my use cases, has been having different representations of your documents, as well as sending multiple permutations of the input query. Do as much as you can in the VectorDB for speed. I'll sometimes have 10-11 different "batched" calls to our vectorDB that are lightning quick. Then also being smart about what payloads I'm actually pulling so that if I do use the LLM to re-rank in the end, I'm not blowing up the context.

TLDR: Yes, you actually do have to put in significant work to build an efficient RAG pipeline, but that's fine and probably should be expected. And I don't think we are in a world yet where we can just "assume" that large context windows will be viable for really precise work, or that costs will drop to 0 anytime soon for those context windows.
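
For anyone who hasn't seen the RRF (reciprocal rank fusion) step, here's a rough sketch of how it merges results from BM25, vector search, and a rephrased query. The doc ids and result lists are invented for illustration, and `k=60` is just the commonly used default, not something tuned for any real pipeline.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one ranking.

    Each document scores sum(1 / (k + rank)) over every list it appears in,
    with rank starting at 1. k dampens the influence of top positions.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits from three retrieval calls for the same user question.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
rephrased_query_hits = ["doc_b", "doc_a", "doc_e"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits, rephrased_query_hits])
print(fused)
```

Documents that show up near the top of several lists float to the front, which is why this works nicely with many cheap batched retrieval calls: the fused list is what you would hand to the (more expensive) re-ranker.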

mscbuck commented on Sora 2   openai.com/index/sora-2/... · Posted by u/skilled
bonoboTP · 3 months ago
This is not the final target. It's video generation now, but that's just a stepping stone. The real thing is that learning a generator is also learning a prior over videos, and hence over how the world works. The real application of this will be world models, vision-language-action models, spatial AI and robotics. Basically a kind of learned simulator in which to plan and imagine possible futures, possible actions and affordances etc. Video models could become a spatial reasoning platform too. A recent paper by DeepMind (using Veo 3) showed that video models can perform many high-level vision tasks out of the box.

Don't think it's going to end here at some slop feed.

mscbuck · 3 months ago
I think generally I agree with you that this is a stepping stone towards bigger/potentially more important things... but that doesn't change the fact that they've packaged it for consumers as something that seems like it has, at best, close to zero utility and at worst has incredible downsides. I'm not sure why releasing this to consumers helps achieve those goals.
mscbuck commented on Sora 2   openai.com/index/sora-2/... · Posted by u/skilled
mscbuck · 3 months ago
I can't help but see these technologies and think of Jeff Goldblum in Jurassic Park.

My boss sends me complete AI workslop made with these tools and he goes "Look how wild this is! This is the future" or sends me a YouTube video with less than a thousand views of a guy who created UGC with Telegram and point-and-click tools.

I don't think he ever takes a beat, looks at the end product, and asks himself, "Who is this for? Who even wants this?", and that's aside from the fact that I still think there are so many obvious tells with this content that make you know right away that it is AI.

u/mscbuck

Karma: 71 · Cake day: August 12, 2024