neilv · a year ago
> Will Intel share specific manufacturing dates and serial number ranges for the oxidized processors so mission-critical businesses can selectively rip and replace?

> Intel will continue working with its customers on Via Oxidation-related reports and ensure that they are fully supported in the exchange process.

Intel is refusing to disclose serial number ranges of the fundamentally defective processors?

Followup question: How do owners of that series of CPU, who suspect theirs is one of the defective units, exchange it for a non-defective CPU before it fails?

reisse · a year ago
> Intel is refusing to disclose serial number ranges of the fundamentally defective processors?

I bet they are all bad. Intel just hopes the failure rate is low enough to handle through RMAs instead of a recall.

> How do owners of that series of CPU, who suspect theirs is one of the defective units, exchange it for a non-defective CPU before it fails?

Another bet: class action suits.

Bluecobra · a year ago
Something similar happened a few years back with the Atom CPUs. The downside is that these were typically found soldered on expensive devices like Cisco routers, firewalls, etc. The company I was working at at the time had to RMA a ton of devices that could be faulty. I thought for sure that this would result in earnings hits and lawsuits, but none of that seemed to happen.

https://www.servethehome.com/intel-atom-c2000-series-bug-qui...

CoastalCoder · a year ago
Man, this really does seem like a replay of their FDIV mess (IIRC).

I was really hoping that Pat Gelsinger would restore some of Intel's respectability. This gives me doubt.

code_biologist · a year ago
I had high hopes for Pat Gelsinger too. I suspect most management expects the consequences of short-sighted corporate culture to manifest in products as maybe a 1% failure rate within 90 days of sale, not a 25%+ failure rate slowly manifesting over more than a year. The former is the cost of doing business, the latter is an existential problem. I suspect Gelsinger didn't know about it and lower management buried it thinking it was a <1% problem, and he's only found out about it as it's become clear what a massive problem it is.

All just speculation though.

isthatafact · a year ago
Gelsinger's 2021 promise of zettaflop systems by 2027 seemed so absurdly optimistic that, although there is still time, it is hard to trust anything he says.

However, I think people really want to give him a chance on his long term plans to make Intel competitive again.

jpatters · a year ago
Interesting. I bought a 13900 (non-K) in mid-April this year for a new server build. It ran fine for a couple of weeks and then started randomly crashing. Having never had a CPU go bad on me before and not having another one lying around to test with, it took me a long time to figure out what the issue was. Finally, by the end of May, I had ruled everything else out and RMA’d it. The system has been running fine ever since.

I assumed I had just got a bad unit. Now I’m wondering if this might have been the cause.

xyst · a year ago
Prediction: Intel is stalling the recall until after the earnings report to avoid tanking the stock.

Glad I switched to using AMD, although some RUMINT indicates quality assurance troubles in the 9000 series. They were supposed to push out the new product by the end of July, but it was delayed to mid-August.

harshreality · a year ago
Is there any RUMINT about specifically why it's delayed? The delay itself is not RUMINT, and it's obviously some QA issue, but beyond that?

The only rumors I've seen are generic guesses: something AMD wasn't screening for, maybe coincidence, or maybe detected after the Intel mess sent AMD's QA teams scurrying to make sure they don't have any similar issues.

It could even be a combined issue with 3rd-party motherboards with AMD's new chipsets, the combination of which they wouldn't have been able to thoroughly test much earlier.

Isn't RUMINT on the Intel problem that, even though it's nominally a chip problem, it may occur primarily due to motherboards not following guidance? For instance, if a spec says the chip and mobo should lower max voltage by 100mV under certain conditions, but the chip still sometimes requests the full original voltage under those conditions, and the mobo provides it, whose fault is that? Maybe not exclusively Intel's, depending on how the specs are documented (a classic should vs must issue).

It seems likely that small process sizes and pushing the limits of performance are going to cause more problems like this Intel one. Notice Intel didn't have this problem before they had to push their chips to compete against the high-end Ryzens.

dave7 · a year ago
> Isn't RUMINT on the Intel problem that, even though it's nominally a chip problem, it may occur primarily due to motherboards not following guidance?

That was merely Intel's first attempt at deflection / damage control. Kernel of truth to it, of course.

Numerlor · a year ago
There hasn't been anything official, but there were rumors that some reviewers got unexpectedly bad performance, which AMD narrowed down to bad packaging for the SoC die.

If it is the SoC, it doesn't have anything to do with pushing nodes: it's on a larger node, and the SoC itself should be a mature architecture because it's reused from Zen 4.

yread · a year ago
It's obviously some bullshit, as 1 month doesn't give enough time to change anything meaningful.

AnthonyMouse · a year ago
> Glad I switched to using AMD, although some RUMINT indicates quality assurance troubles in the 9000 series. They were supposed to push out the new product by the end of July, but it was delayed to mid-August.

I had a theory that the "quality issues" are a marketing ploy (see? we don't send customers bad products, unlike those other guys) combined with an excuse to delay the release date (and thus the review embargo date) until after the proposed Intel microcode release, since the updated microcode was expected to negatively impact Intel in the performance comparison.

ein0p · a year ago
AMD has its issues too, from time to time. I even replaced a processor through them once. As chip complexity increases, both companies are going to have these issues more often. The solution is to let other people be guinea pigs for a year.

abracadaniel · a year ago
The headline isn't clear, but the claim so far is that the microcode update will fix any CPUs that haven't begun exhibiting instability. Nothing can fix the ones that are already broken. For those that are somewhere in between, hope it fails within the warranty period, I guess.

mrtksn · a year ago
The way I understand it, they had a bug that runs the CPU at a higher voltage than the hardware can tolerate.

Those who pushed the limits physically damaged their CPU and these are now cooked. The microcode update will limit the voltage, which will result in degraded performance but will prevent damage under load.

code_biologist · a year ago
> Those who pushed the limits physically

You mean everyone who owns a motherboard from a manufacturer that was given unclear guidance on power delivery from Intel, but also encouraged to make sure their boards benchmarked competitively (by providing enough voltage for clock boosts)? That's pretty much every enthusiast.

hypercube33 · a year ago
GN was saying people saw the bugs on chips not pushed at all on workstation boards though.

Dunedan · a year ago
This article is just quoting the original source of this information, which would be a much better link: https://www.theverge.com/2024/7/26/24206529/intel-13th-14th-...

MaximilianEmel · a year ago
I've only ever bought Intel, mostly because I perceived them to be more stable and reliable. I think next time, I will give AMD a try.

mereo · a year ago
I've been building my systems since 1999 and I've been pragmatic, buying what was best at the time. I never had stability problems with either AMD or Intel.

Havoc · a year ago
Given that some reports put failure rates as high as 50% for some models/conditions, this may as well be a recall.

Arrath · a year ago
Would almost think a recall would be easier to handle than hundreds of thousands/millions of individual RMA/warranty claims.

mystified5016 · a year ago
They're hoping a large fraction of people won't bother with the RMA process and will just shut up and go away.

Unfortunately for Intel, "away" here means "to buy an AMD processor"

cyanydeez · a year ago
Recall likely affects shareholders so, you know, waste more time and money to ensure your corporate profits.

qingcharles · a year ago
How would a recall work, given that half of these chips are soldered onto mobos, not socketed?