Look at those people shouting that this will be AGI, total disruption, etc. Seems Elon managed one thing: amassing the dumbest folks together. 99.99% MAGA, crypto, and almost Markov-chain-quality comments.
This is what put me off Claude Code. When I wanted to dig in, I tried to watch a few YouTube videos to get an expert's opinion on it, and 90% of the people who talk about it feel like former crypto shills who, judging from their channel history, seem never to have written a single line of code without AI in their lives.
I don’t see how this follows. Does AGI mean that it is free to operate and has no hardware / power constraints?
The fact that I see people being paid to dig a trench does not make me doubt the existence of trenching machines. It just means that the tool is not always the best choice for every job.
We have to wait and test it ourselves to see how far it gets in our daily tasks. If the improvement continues like it did in the past, that would be pretty far. Not quite a full researcher position but an average student assistant for sure.
Maybe this won't be the case. How long do you think it will be before a machine can outdo any human in any given domain? I personally think it will be after they are able to rewrite their own code. You?
Well, I asked ChatGPT if I could run Kimi K2 on a 5800X3D with 64 gigs of RAM and a 3090, and it said:
Yes, you absolutely can run Kimi-K2-Instruct on a PC with:
✅ CPU: AMD Ryzen 7 5800X3D
✅ GPU: NVIDIA RTX 3090 (24 GB VRAM)
✅ RAM: 64 GB system memory
This is more than sufficient for both:
Loading and running the full Kimi-K2-Instruct model in FP16 or INT8, and
Quantizing it with weight-only INT8 using Hugging Face Optimum + bitsandbytes.
Kimi K2 has a trillion parameters, so even an 8-bit quant would need roughly a terabyte of system RAM + VRAM.
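The back-of-the-envelope math behind that claim can be sketched like this (weights only; KV cache and activations add more on top, and the 1.8-bit figure matches the quantized build mentioned further down):

```python
# Rough weight-memory estimate for a 1-trillion-parameter model at
# various quantization levels. Weights only -- ignores KV cache and
# activation memory, which add tens of GiB more in practice.

def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

N = 1e12  # Kimi K2 is ~1T parameters

for bits in (16, 8, 4, 1.8):
    print(f"{bits:>4}-bit: {weights_gib(N, bits):,.0f} GiB")
```

At 8 bits that works out to about 930 GiB, so ChatGPT's claim that 64 GB of RAM plus a 3090 is "more than sufficient" for FP16 or INT8 is off by more than an order of magnitude; even the 1.8-bit build (~210 GiB) overflows 64 GB RAM + 24 GB VRAM and has to stream weights from disk.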
This is with the free ChatGPT that us peasants use. I don't have the means to run Grok 4 Heavy, DeepSeek, or Kimi K2 to ask them.
I can't wait to see what accidental wars will start when we put AI in the kill chain.
Bottom line: Your 5800X3D + 64 GB RAM + RTX 3090 will run Kimi K2’s 1.8‑bit build, but response times feel more like a leisurely typewriter than a snappy chatbot. If you want comfortable day‑to‑day use, plan either a RAM upgrade or a second (or bigger) GPU—or just hit the Moonshot API and save some waiting.
These cases are probably why OpenAI has stated GPT-4.1 is their last non-reasoning model, and GPT-5 will determine whether and how much to reason based on the query.
In related news, OpenAI and Google have announced that their latest non-public models have received Gold in the International Mathematics Olympiad: https://news.ycombinator.com/item?id=44614872
That said, the public models don't even get bronze.
Wow. That's an impressive result, though we definitely need some more details on how it was achieved.
What techniques were used? He references scaling up test-time compute, so I have to assume they threw a boatload of money at this. I've heard talk of running models in parallel and comparing results - if OpenAI ran this 10000 times in parallel and cherry-picked the best one, this is a lot less exciting.
If this is legit, then I really want to know what tools were used and how the model used them.
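Running a model many times and cherry-picking the best answer is usually called best-of-n sampling. A minimal sketch of the idea, where `generate` and `score` are stand-ins I've made up (in the IMO setting the scorer might be a verifier model or majority vote, none of which OpenAI has confirmed):

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Stand-in for a model call; each seed yields a different sample.
    rng = random.Random(seed)
    return f"candidate-{rng.randint(0, 9)}"

def score(candidate: str) -> float:
    # Stand-in verifier. A real setup would use a proof checker,
    # a reward model, or agreement across samples.
    return float(len(candidate))

def best_of_n(prompt: str, n: int) -> str:
    """Draw n samples in parallel (conceptually) and keep the top-scoring one."""
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=score)
```

The point of the commenter's skepticism: the cost of this scales linearly with n, so "gold with n = 10000" says much more about compute budget than about single-shot model capability.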
Sama clocked this way back. He has used this exact analogy: that new GPT models will feel like incremental new iPhone releases, compared to the first iPhone/GPT-3.
https://hitchhikers.fandom.com/wiki/Golgafrincham
(1) Given triangle ABC, by means of Euclidean construction find point D on line AB and point E on line BC so that the lengths |AD| = |DE| = |EC|.
(2) Given triangle ABC, by means of Euclidean construction inscribe a square so that each corner of the square is on a side of the triangle.
Come ON AGI, let's have some RESULTS that human general intelligence can do -- gee, I solved (1) in the 10th grade.
My threshold is when it can create a new Google
I rarely use 4o anymore for anything. I'd rather wait for o3 than quickly get a pile of rubbish.
[EDIT] Dupe of this: https://news.ycombinator.com/item?id=44614872
Indeed.
That's an interesting business arrangement. There must be some incentive for OpenAI in declaring this?