bcherry commented on Mercury 2: Fast reasoning LLM powered by diffusion   inceptionlabs.ai/blog/int... · Posted by u/fittingopposite
volodia · 18 days ago
Co-founder / Chief Scientist at Inception here. If helpful, I’m happy to answer technical questions about Mercury 2 or diffusion LMs more broadly.
bcherry · 18 days ago
you mention voice ai in the announcement but I wonder how this works in practice. most voice AI systems are bound not by full response latency but just by time-to-first-non-reasoning-token (because once it heads to TTS, the output speed is capped at the speed of speech and even the slowest models are generating tokens faster than that once they start going).

what do ttft numbers look like for mercury 2? I can see how at least compared to other reasoning models it could improve things quite a bit but i'm wondering if it really makes reasoning viable in voice given it seems total latency is still in single digit seconds, not hundreds of milliseconds
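The premise behind this question can be sketched with some back-of-envelope arithmetic. All the numbers here are rough assumptions (typical English speech rate and tokenizer ratio), not figures from the announcement:

```python
# Back-of-envelope: why voice pipelines are bound by time-to-first-token
# rather than total generation time. Numbers are rough assumptions.
WORDS_PER_MIN_SPEECH = 150   # typical conversational speech rate
TOKENS_PER_WORD = 1.3        # rough average for English tokenizers

speech_tokens_per_sec = WORDS_PER_MIN_SPEECH / 60 * TOKENS_PER_WORD
print(f"TTS consumes ~{speech_tokens_per_sec:.2f} tokens/sec")  # ~3.25

# Even a slow model streaming at 30 tokens/sec outruns speech playback
# by roughly 9x, so once output starts the model never falls behind;
# the user-perceived delay is dominated by time-to-first-token.
model_tokens_per_sec = 30
print(f"Headroom: {model_tokens_per_sec / speech_tokens_per_sec:.1f}x")
```

Under these assumptions, any model above a few tokens per second saturates TTS, which is why TTFT (and any reasoning that delays it) is the number that matters.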

bcherry commented on The tech monoculture is finally breaking   jasonwillems.com/technolo... · Posted by u/at1as
bcherry · 2 months ago
This isn't really the author's point, but I think one effect of AI and the forthcoming robotics revolution will be the unrolling of a lot of consolidated supply chains for all sorts of products. It could usher in a renewed era of bespoke products.

For instance, when the cost of building a new (good) app goes to zero, it becomes economical to make a great app for a narrow niche, with a skeleton staff (maybe just one) and no VC money. And this can happen thousands of times over.

Robotics could open up bespoke local supply chains even beyond what's possible with a 3D printer today. For instance, if you had an actually dextrous humanoid robot "living" in your home, why wouldn't you have it just make all of your clothes? You could have any fabric, any style, exactly the right size. And only for the cost of materials (assuming you already own or lease the robot itself).

I do think the author is right in the big picture - the future will be more fun.

bcherry commented on A Fond Farewell   farmersalmanac.com/fond-f... · Posted by u/erhuve
shervinafshar · 4 months ago
Not to be confused with the Old Farmer's Almanac (est. 1792), and yet sad to see a 200-year-old periodical closing up shop.
bcherry · 4 months ago
wow thanks for leaving this comment - i now realize two things:

1. the farmer's almanac i thought of when i saw the title and even read the article is not going anywhere

2. i have never before heard of the farmer's almanac referred to in this notice

bcherry commented on PSF has withdrawn $1.5M proposal to US Government grant program   pyfound.blogspot.com/2025... · Posted by u/lumpa
actionfromafar · 4 months ago
Grants are terminated based on keyword matches.
bcherry · 4 months ago
they'd have to be extra careful with cpython, it's got a lot of include
bcherry commented on GPT-5   openai.com/gpt-5/... · Posted by u/rd
podgietaru · 7 months ago
I just don’t know that you’d name that 5.

The jump from 3 to 4 was huge. There was an expectation for similar outputs here.

Making it cheaper is a good goal - certainly - but they needed a huge marketing win too.

bcherry · 7 months ago
yeah i think they shot themselves in the foot a bit here by creating the o series. the truth is that GPT-5 _is_ a huge step forward, for the "GPT-x" models. The current GPT-x model was basically still 4o, with 4.1 available in some capacity. GPT-5 vs GPT-4o looks like a massive upgrade.

But it's only an incremental improvement over the existing o line. So people feel like the improvement from the current OpenAI SoTA isn't there to justify a whole bump. They probably should have just called o1 GPT-5 last year.

bcherry commented on Diffusion models explained simply   seangoedecke.com/diffusio... · Posted by u/onnnon
bcherry · 10 months ago
"The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material."

- Michelangelo

bcherry commented on Chat is a bad UI pattern for development tools   danieldelaney.net/chat/... · Posted by u/cryptophreak
bcherry · a year ago
Chat is a great UX _around_ development tools. Imagine having a pair programmer and never being allowed to speak to them. You could only communicate by taking over the keyboard and editing the code. You'd never get anything done.

Chat is an awesome powerup for any serious tool you already have, so long as the entity on the other side of the chat has the agency to actually manipulate the tool alongside you as well.

bcherry commented on SimpleQA   openai.com/index/introduc... · Posted by u/surprisetalk
chaxor · a year ago
Also importantly, they do have a 'not attempted' or 'do not know' type of response, though how it is used is not really well discussed in the article.

As it has been for decades now, the 'NaN' type of answer in NLP is important, adds great capability, and is often glossed over.

bcherry · a year ago
a little glossed over, but they do point out that the most important improvement o1 has over gpt-4o is not its "correct" score improving from 38% to 42% but its "not attempted" going from 1% to 9%. The improvement is even more stark for o1-mini vs gpt-4o-mini: 1% to 28%.

They don't really describe what "success" would look like, but it seems to me like the primary goal is to minimize "incorrect" rather than to maximize "correct". The mini models would get there by maximizing "not attempted", with the larger models having much higher "correct". Then both model sizes could hopefully reach 90%+ "correct" when given access to external lookup tools.
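The arithmetic behind that framing is simple: with three mutually exclusive outcome buckets, "incorrect" is whatever is left after "correct" and "not attempted". Using the percentages quoted in this thread (not re-measured):

```python
# Three-bucket grading: correct + not_attempted + incorrect = 100%.
# Figures are the ones quoted in the comment above.
models = {
    "gpt-4o": {"correct": 0.38, "not_attempted": 0.01},
    "o1":     {"correct": 0.42, "not_attempted": 0.09},
}
incorrect = {
    name: 1.0 - m["correct"] - m["not_attempted"]
    for name, m in models.items()
}
print(incorrect)  # o1 is wrong less often, largely by abstaining more
```

On these numbers, o1 cuts "incorrect" from 61% to 49%, and most of that cut comes from abstaining, not from answering more questions correctly.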

bcherry commented on Australia/Lord_Howe is the weirdest timezone   ssoready.com/blog/enginee... · Posted by u/noleary
ak217 · a year ago
At some point the correct solution is for engineers to collectively agree to refuse to model government-prescribed deviations from convention. Or, put more obliquely, provide more feedback to make it more obvious who is bearing the cost of the complexity of these requirements.

It's a social problem and it calls for a social solution.

I know, there's a lot of disagreement around where the point in question is, but it would serve us well if more engineers were more assertive about stating their opinion on where it is.

bcherry · a year ago
disagree - good products meet their users where they are and bury complexity under the hood. i can't imagine trying to use a calendar app (or any app really) that refuses to operate in any mode other than UTC.
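For a concrete look at the complexity being buried here: Lord Howe Island is the canonical oddball, with a UTC+10:30 standard offset and a half-hour (rather than one-hour) DST shift to UTC+11:00. A quick check with Python's stdlib `zoneinfo` (illustrative dates; requires a system tz database or the `tzdata` package):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

lh = ZoneInfo("Australia/Lord_Howe")

# Southern-hemisphere winter: standard time, UTC+10:30
winter_offset = datetime(2024, 7, 1, 12, 0, tzinfo=lh).utcoffset()

# Southern-hemisphere summer: DST in effect, UTC+11:00 --
# a half-hour DST shift instead of the usual full hour
summer_offset = datetime(2024, 1, 1, 12, 0, tzinfo=lh).utcoffset()

print(winter_offset, summer_offset)
```

The tz database already models this; an app that leans on it instead of hand-rolling offsets gets the weirdness for free, which is rather the point about burying complexity under the hood.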
bcherry commented on Probably pay attention to tokenizers   cybernetist.com/2024/10/2... · Posted by u/ingve
bcherry · a year ago
It's kind of interesting because I think most people implementing RAG aren't even thinking about tokenization at all. They're thinking about embeddings:

1. chunk the corpus of data (various strategies but they're all somewhat intuitive)

2. compute embedding for each chunk

3. generate search query/queries

4. compute embedding for each query

5. rank corpus chunks by distance to query (vector search)

6. construct return values (e.g chunk + surrounding context, or whole doc, etc)

So this article really gets at the importance of a hidden, relatively mundane-feeling operation that can have an outsized impact on the performance of the system. I do wish it had more concrete recommendations in the last section, and a code sample of a robust pipeline with normalization, fine-tuning, and evals.
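The six steps above can be sketched end to end. This is a toy illustration, not a production pipeline: the `embed` function here is a hashed bag-of-words stand-in for a real embedding model call, and the corpus/query strings are invented:

```python
import zlib

import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for an embedding model: hashed bag-of-words,
    unit-normalized. A real pipeline would call a model here."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[zlib.crc32(word.encode()) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# 1-2. chunk the corpus and compute an embedding per chunk
corpus = [
    "the cat sat on the mat",
    "gradient descent minimizes a loss function",
    "embeddings map text to vectors",
]
chunk_vecs = np.stack([embed(c) for c in corpus])

# 3-4. generate the search query and embed it the same way
query_vec = embed("how do embeddings represent text")

# 5. rank chunks by cosine similarity (vectors are unit-normalized,
#    so a dot product is the cosine)
scores = chunk_vecs @ query_vec
ranking = np.argsort(-scores)

# 6. construct the return value (here just the best chunk; a real
#    system might return the chunk plus surrounding context)
print(corpus[ranking[0]])
```

The tokenizer issues the article raises hide inside `embed`: a real model tokenizes the text before embedding it, so normalization mismatches between steps 2 and 4 degrade step 5 silently.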
