Readit News
whistle650 commented on Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing   arxiv.org/abs/2508.12631... · Posted by u/omarsar
whistle650 · 8 days ago
It seems they use 70% of the benchmark query-answer pairs to cluster and determine which models work best for each cluster (by sending all queries to all models and looking at responses vs ground truth answers). Then they route the remaining 30% "test" set queries according to those prior determinations. It doesn't seem surprising that this approach would give you Pareto efficiency on those benchmarks.
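A minimal sketch of the procedure as described in this comment, not the paper's actual implementation. It assumes the cluster centroids are already known, picks the winning model per cluster on raw accuracy alone (no cost term), and uses made-up model names and data; `fit_routing_table` and `route` are hypothetical helper names.

```python
from collections import defaultdict
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def fit_routing_table(train_points, correct, centroids):
    """Map each cluster to the model with the most correct training answers.

    correct[model][i] is True if that model answered training query i correctly.
    Ties go to the alphabetically first model name.
    """
    hits = defaultdict(lambda: defaultdict(int))
    seen = set()
    for i, point in enumerate(train_points):
        cluster = nearest(point, centroids)
        seen.add(cluster)
        for model, flags in correct.items():
            hits[cluster][model] += int(flags[i])
    return {c: max(sorted(correct), key=lambda m: hits[c][m]) for c in seen}


def route(point, centroids, table):
    """Send a held-out query to the model chosen for its nearest cluster."""
    return table[nearest(point, centroids)]


# Toy data (all assumed): 2-D "embeddings", two clusters, two models.
centroids = [(0.0, 0.0), (10.0, 10.0)]
train_points = [(0, 1), (1, 0), (9, 10), (10, 9)]
correct = {
    "small-model": [True, True, False, False],
    "large-model": [True, False, True, True],
}

table = fit_routing_table(train_points, correct, centroids)
print(table)
print(route((0.5, 0.5), centroids, table), route((9.5, 9.5), centroids, table))
```

Because the routing table is fit on queries drawn from the same benchmark as the held-out ones, the per-cluster winners carry over almost by construction, which is the point the comment makes.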
whistle650 commented on How three years at McKinsey shaped my second startup   blog.zactownsend.com/know... · Posted by u/zt
whistle650 · 4 months ago
Looking at the home page of Meanwhile only made me think of how life insurance is such a different thing than, say, a mortgage. With life insurance, counterparty risk matters. You don't care about your mortgage counterparty. I'm not going to buy life insurance from an insurer with YouTube videos of Anthony Pompliano on their home page. Know your enemy.
whistle650 commented on Gemini 2.5 Flash   developers.googleblog.com... · Posted by u/meetpateltech
arnaudsm · 4 months ago
Mostly brand recognition and the earlier Geminis had more refusals.

As a consumer, I also really miss the Advanced voice mode of ChatGPT, which is the most transformative tech in my daily life. It's the only frontier model with true audio-to-audio.

whistle650 · 4 months ago
Have you tried the Gemini Live audio-to-audio in the free Gemini iOS app? I find it feels far more natural than ChatGPT Advanced Voice Mode.
whistle650 commented on Ask HN: Physics PhD at Stanford or Berkeley    · Posted by u/Bang2Bay
whistle650 · 7 months ago
I don't know much about what it's like to do a PhD in physics at Berkeley, but many years ago I did a PhD in physics at Stanford, starting out in experimental quantum optics. I wound up doing something completely different, and felt supported in changing what I worked on. Stanford felt small in a good way, and the grad student admin staff was wonderful. Stanford definitely has a different, more suburban and isolated vibe. Summers felt like you worked at a country club or something.

Who you work with really matters (obviously) and different PIs and labs can have very different cultures which you may or may not feel comfortable with. That alone can make your decision if you are very sure about what you want to do and who you want to work with.

Outside of that, I would say Stanford is a really great place to do graduate work, especially if you're not entirely sure what you want to do.

All of this is with the obvious caveat that my experience is from quite some time ago.

whistle650 commented on Classifying all of the pdfs on the internet   snats.xyz/pages/articles/... · Posted by u/Nydhal
whistle650 · a year ago
Interesting read with lots of good detail, thank you. A comment: if you are balancing the classes when you do one vs all binary training, and then use the max probability for inference, your probabilities might not be calibrated well, which could be a problem. Do you correct the probabilities before taking the argmax?
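A small sketch of the correction being asked about, assuming each one-vs-all classifier was trained on a 50/50 balanced set and that the true class frequencies are known. It uses the standard prior-shift adjustment of the odds; the class names, scores, and priors are invented for illustration, and `correct_prior` is a hypothetical helper name.

```python
def correct_prior(p_balanced, true_prior, train_prior=0.5):
    """Rescale a probability from a rebalanced classifier to the true prior.

    The positive and negative likelihoods are reweighted by the ratio of the
    true class frequency to the frequency used in training.
    """
    pos = p_balanced * (true_prior / train_prior)
    neg = (1 - p_balanced) * ((1 - true_prior) / (1 - train_prior))
    return pos / (pos + neg)


# Assumed outputs of two one-vs-all classifiers for a single document.
balanced_scores = {"A": 0.7, "B": 0.8}
# Assumed real-world frequency of each class.
true_priors = {"A": 0.6, "B": 0.05}

corrected = {
    label: correct_prior(balanced_scores[label], true_priors[label])
    for label in balanced_scores
}

raw_pick = max(balanced_scores, key=balanced_scores.get)
corrected_pick = max(corrected, key=corrected.get)
print(raw_pick, corrected_pick, corrected)
```

With these numbers the uncorrected argmax picks the rare class B, while the corrected probabilities (about 0.78 for A and 0.17 for B) favor A, so the correction can change the predicted label and not just the confidence.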
whistle650 commented on Newswire: A large-scale structured database of a century of historical news   arxiv.org/abs/2406.09490... · Posted by u/h2odragon
zX41ZdbW · a year ago
But the scan quality is subpar. Example:

> For belter safekecping Russta’s $2¢4,000,000 collection of crown jewels, probably (he finesl array of gems ever assem- bled at one tle

https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPT...

whistle650 · a year ago
https://chatgpt.com/share/13f553a8-5cff-42a1-be95-4a9d33cd10...

May also be easy to correct a lot of it:

“For better safekeeping, Russia’s $24,000,000 collection of crown jewels, probably the finest array of gems ever assembled at one time,”

whistle650 commented on MLC-LLM: GPT/Llama on consumer-class GPUs and phones   github.com/mlc-ai/mlc-llm... · Posted by u/junrushao1994
whistle650 · 2 years ago
This is a great project, thank you. I've installed the TestFlight app. FYI, right now, in response to "Who was the president in 1973", it says "Gerald Ford", which is wrong.
whistle650 commented on Diagnosing cancer by profiling the immune system   github.com/jostmey/msm... · Posted by u/jostmey
marymkearney · 2 years ago
This is really cool. Thanks for sharing. It validates the perspective of the late great Pieter Hintjens on this topic. https://github.com/hintjens/confessions/blob/master/ch08.txt

"In my body right now there is a holy war going on, and has been raging for years. My immune system has been doing its damned best to kill these rogue cells. And the rogue cells, unaware that they're destroying their own host, have been fighting back.

"The odds are on the cancer, of course, which is why this family of diseases is a major killer. Our bodies have to keep winning, year after year. Any given cancer has to win only once, and it's Game Over. The only way to beat cancer, really, is to die from something else first.

"Everyone fights cancer, all our lives long. From birth, our immune systems are hunting down and killing rogue cells.... We are all cancer survivors, until we're not. "

whistle650 · 2 years ago
Wow, thank you for sharing that link. It was rational and humane and I found it very moving. I didn't know who he was, but I'm glad to know of him and his work now.
whistle650 commented on GPT Unicorn: A Daily Exploration of GPT-4's Image Generation Capabilities   adamkdean.co.uk/posts/gpt... · Posted by u/imdsm
simonw · 2 years ago
I filed an issue: https://github.com/adamkdean/gpt-unicorn/issues/2

"Running this project daily doesn't make sense if GPT-4 is not being constantly updated"

With a suggestion to run it monthly instead, and generate 16 images at a time, and backfill it for GPT3 and GPT3.5.

whistle650 · 2 years ago
In the talk he specifically mentions the very interesting fact that as they improved "alignment" it affected the unicorn output (negatively, if I remember correctly). So as long as "alignment" is changing, the output should change. I'm not sure, but ongoing RLHF, "system" prompts, etc. can and do change, while the underlying foundation model need not.
whistle650 commented on Auto-GPT Unmasked: The Hype and Hard Truths of Its Production Pitfalls   jina.ai/news/auto-gpt-unm... · Posted by u/artex_xh
whistle650 · 2 years ago
It is a strange article, speaking as if AutoGPT were not completely nascent, which it is. So the critiques aren't even really wrong. The most valuable observation is that vector DBs are overkill (does LangChain have a stake in Pinecone?)

u/whistle650

Karma: 49 · Cake day: May 16, 2018