Love reading these highly detailed analyses. Short version: Zhaoxin's currently competitive with 2010/2011-era AMD and Intel, with some asterisks around RAM speed.
There is to my mind a sort of race to get up to "fast enough to host H100 competitor AI hardware" with non-US IP that makes sense to engage in. In those terms, it looks like they're maybe 2 revs away -- I'm not sure what process node the KX7000 is on, but there's some architectural work to finish up. That said, this is interesting. I assume the chips will continue to improve from Zhaoxin, unless they lose their core team.
That’s pretty easily verified by a reasonably competent chip engineer with a microscope. If it's 16nm and they need to stay onshore only, they could be a little closer in terms of time than I speculated. SMIC at 7nm is reportedly stable; not sure about yields though.
I think if I were in charge I’d probably do a final architecture spin at 16, and then shrink that one to 7 or 5 if I could get it.
If the goal is to make a CPU that is "fast enough to host H100 competitor AI hardware", then why bother with x64? Huawei could have just produced a powerful ARM chip to go with their new AI processor. After all, Nvidia GB200 also uses Arm-based CPUs (the Grace in GB).
This review is an object lesson about why there is so much more to shipping a decent processor than making a CPU core with reasonable performance (and decent is being polite given that we are talking about Bulldozer-class single-threaded perf, which most folks were beyond thrilled to abandon when Zen arrived eight years ago.)
The behavior of the memory controller is wild to see in this day and age. You really don't want to see latency that high in general, but especially not for a client processor. I'd really like to see how it behaves with a reasonably powerful GPU in a CPU-bound gaming workload relative to the competition (to simulate what one of these might see in an internet café setting, for instance).
Power efficiency also seems truly dismal according to PCWatch: https://pc.watch.impress.co.jp/docs/column/hothot/1626253.ht... . In Cinebench MT, it's consuming about the same power as a Ryzen 5 5600G while delivering about 1/3 the performance, and the idle power is much higher than the Core i3-8100/R5 5600G to boot. That's not a huge issue for desktops, but it would not make a good foundation for a mobile system.
Overall an improvement versus past Zhaoxin efforts but people shouldn't kid themselves about the quality of the overall package here. There is a long way to go.
It does clock ramp from 800 MHz idle to 3.2 GHz under load, with 900, 1000, 1100, 1300, 1500, 1800, 2200, and 2700 MHz steps in between until it hits 3.2 GHz after 71.6 ms. The article was getting long enough, so I just left it at: it reaches 3.2 GHz and stays there, even though the spec sheet says it should go higher.
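For anyone who wants to eyeball the shape of that ramp, here is a rough sketch that only uses the step list and the 71.6 ms figure quoted above. The names are made up for illustration, and the per-transition time assumes evenly spaced transitions, which is not something that was measured or stated.

```python
# Frequency steps in MHz, idle to load, as listed in the comment above.
steps_mhz = [800, 900, 1000, 1100, 1300, 1500, 1800, 2200, 2700, 3200]
ramp_ms = 71.6  # time to reach 3.2 GHz, per the comment above


def step_increases(steps):
    """Return the MHz jump between each pair of consecutive steps."""
    return [later - earlier for earlier, later in zip(steps, steps[1:])]


increases = step_increases(steps_mhz)

# ASSUMPTION: transitions are evenly spaced in time. The comment only gives
# the total ramp time, so this is an average, not a measured per-step figure.
avg_ms_per_transition = ramp_ms / len(increases)

print("MHz jump per transition:", increases)
print("transitions:", len(increases))
print(f"average time per transition: {avg_ms_per_transition:.2f} ms")
```

The jumps get larger toward the top of the range, so the ramp starts cautiously and takes bigger steps as it approaches 3.2 GHz.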
I remoted into the system for testing (Cheese/George had it), and he said it took 3-4 cold reboots for it to come up, and suspected memory wasn't training correctly. So I did all the testing without ever rebooting the system, because it might not come back up if I tried.
Memory controllers are the biggest bottleneck (ha) to performant systems these days. The cores themselves are fine, but the memory controllers are slow and buggy.
I wonder if Zhaoxin's VIA heritage is helping them or holding them back - because of the patents, they were the only ones allowed to try, but since x86_64 and SSE2 are both now more than 20 years old, most of the patents don't matter any more (and AVX is not far from the cutoff).
The breakaway ARM China or SpacemiT or Loongson could drop in an x86_64 frontend and might get better results.
I think it should. Written Chinese has no concept of a word: there are no spaces, and each character is a unit. Pinyin exists to mark the pronunciation of each character, so it would be Lu Jia Zui, meaning 3 characters. Pinyin is not English. LuJiaZui would be easier to pronounce if you don't know Pinyin. There is no standard way to write it though; maybe Lu-Jia-Zui, Lu'Jia'Zui, or Lu'jia'zui. The most confusing version, and the one most likely to lead to a wrong pronunciation, is Lujiazui.
What's the deal with the municipal government being a partner in this project? Is that structure common in China? Is it just them giving VIA tax breaks and things, or are they more involved than that?
Seems like it depends on the price point. These chips might be slow by modern standards, but if they're cheap enough then it doesn't really matter for a lot of the potential applications. I'm typing this post on a chip that is roughly in that performance bracket (an i5-3570K) and it only rarely feels like the bottleneck. And this is my gaming machine.
Yeah, it is pretty common. Governments invest in corporations in key areas to provide funding, tax breaks, regulatory aid, and a bunch of other benefits, and sometimes sell their chunk of shares after a few years.
One early example is Chongqing government with Huang Qifan as mayor back in the 2010s.
Do governments allow some of their employees to be highly compensated relative to others? Would someone with real expertise in chip development work for the government at what the government is willing to pay? I think the answer is no.
The Chinese government has definitely "bought back" some top talent from the US. It's probably a small number of people.
I'm not sure why local governments would get involved although in general China has had a problem with too much investment and not enough places for it to go. It's not impossible that there are essentially local sovereign wealth funds.
This initiative seems to be a private company propped up by government funds rather than direct government employment. Think Lockheed Martin not DARPA.
This is interesting! Does anyone know how reliant China is on chips from Intel and AMD in the non-AI space (so regular consumer and server loads)? I’m wondering how it was 10 and 5 years ago, how it is now, and what we predict for the next couple of years. Surely if they’re not mostly using their own chips they will be very soon, right?
They push local chips for independence, but unless the west embargoes them, I don't think you will see a major leap forward.
Huawei is a whole different beast though. They have everything from the chip design up, and by now also an operating system that arguably has both a better frontend framework and a better kernel than the Linux alternatives. When we talk about Chinese AI chips being slow we specifically talk about classic desktop chips.
Also! For normal desktop work a 2011 Intel chip is plenty fast. A lot of critical systems like train booking systems are keyboard-focused ancient UI systems, and they seem fine.
Ignoring fabs for now, which are a different set of issues.
They have JVs with ARM (ARM China), AMD (Zen 1), and IMG (PowerVR and MIPS), along with investment in RISC-V. Alibaba and Huawei are both investing in RISC-V as well. Considering they don't sell CPUs, I won't be surprised if China one day gives away RISC-V CPU designs for free.
Surprisingly, the ARM China issue is still somewhat unresolved, and ARM now has a separate subsidiary inside China.
Interestingly, the chip is rated to run at DDR4-3200 or DDR5, so it's strange C&C got half that.
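To put a number on what running at half speed costs, here is a back-of-the-envelope sketch of theoretical peak bandwidth. It assumes "half that" means DDR4-1600 and a standard 64-bit (8-byte) channel. The function name and the single-channel default are just for illustration, and real-world bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gbs(mt_per_s, bus_bytes=8, channels=1):
    """Theoretical peak in GB/s: transfers/s * bytes per transfer * channels.

    mt_per_s is the data rate in megatransfers per second (3200 for DDR4-3200).
    bus_bytes=8 assumes a standard 64-bit DDR4 channel.
    """
    return mt_per_s * bus_bytes * channels / 1000


rated = peak_bandwidth_gbs(3200)     # DDR4-3200, one channel
# ASSUMPTION: "half that" is read as DDR4-1600; no exact setting is given here.
observed = peak_bandwidth_gbs(1600)

print(f"DDR4-3200 per channel: {rated:.1f} GB/s")
print(f"DDR4-1600 per channel: {observed:.1f} GB/s")
print(f"peak bandwidth lost: {rated - observed:.1f} GB/s per channel")
```

Pass `channels=2` for a dual-channel setup; the ratio between the two speeds stays the same either way.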
The power issues are likely due to clocking behavior that is prehistoric by modern standards (a single P-state, to my understanding)!
Basically an attempt to bootstrap an industry, brute-force style.
Already done: https://www.cnx-software.com/2021/10/20/alibaba-open-source-... https://github.com/XUANTIE-RV
Windows 11 is dog slow on corporate hardware. Linux, even with bloated KDE or Gnome, is much faster.