It would be interesting to see a graph of "performance per watt".
Mobile phones often have background tasks that don't need much CPU power. The A53 seems very suitable for this, so it would be nice to have some idea of how much power a phone saves by running those tasks on an A53 instead of a high-performance core.
Geekerwan on YouTube tests the performance and efficiency of different cores at different frequencies. One good example is where the A55 and A510 (successors to the A53) are graphed, at around 15:40: https://youtu.be/s0ukXDnWlTY (honestly, the whole video is pretty informative)
It really depends on the workload. Modern thoughts on the matter trend towards a design called "race to sleep" where even your efficiency cores are still pretty beefy relative to an A53. That model is focused on quickly turning the core on, running through all of the work, and going back to sleep, rather than staying on longer on a much weaker core. Doing this effectively requires OS support to coalesce timer deadlines to bunch up as much work as possible on each wakeup cycle.
But with software support the model is very effective, which is why most e-cores these days are relatively beefy OoOE cores that can leave the A53 in the dust, whether that's Icestorm in the M1, Goldmont on Intel, or A57s in big.LITTLE ARM SoCs.
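To make the timer-coalescing point concrete: on Linux, one knob for batching wakeups is per-thread timer slack. This is just a minimal sketch of that one mechanism, not something the comment above prescribes:

```c
/* "Race to sleep" works best when nearby timer expirations are
   coalesced into a single wakeup. On Linux, widening a thread's
   timer slack tells the kernel how far it may defer this thread's
   timers so they can be batched with others. */
#include <sys/prctl.h>

int widen_timer_slack(void) {
    /* 50 ms of slack, in nanoseconds; the exact figure is
       just an illustrative assumption. */
    return prctl(PR_SET_TIMERSLACK, 50UL * 1000 * 1000, 0, 0, 0);
}
```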
Pretty wild to me that the Cortex-A53 is a decade old and barely modified in its newer forms. It's hard to imagine a microarchitecture remaining so unchanged across so many years.
By comparison, Intel has been making small but decent revs to its Atom cores. And the new E-core in the N100 is monstrously faster, yet still small. A potential Core M moment for Intel: a great small chip that expands.
>It's so hard to imagine so many years & a microarchitecture remaining so unchanging.
It's a shame, because it was the best design from ARM; they're now focusing on the Cortex-A7x and Cortex-X, which aren't anywhere near as power-efficient [0].
Meanwhile, their revised Cortex-A57 has been surpassed in performance/power/area by several RISC-V microarchitectures, such as SiFive's U74 [1], used in the VisionFive2 and Star64, or even the open-source XuanTie C910 [2][3].

0. https://youtu.be/s0ukXDnWlTY?t=790
1. https://www.sifive.com/cores/u74
2. https://xrvm.com/cpu-details?id=4056743610438262784
3. https://github.com/T-head-Semi/openc910
I still use original Raspberry Pi 2s (A53) in my server clusters; they are the lowest-power devices that can saturate (good) SD cards on random writes/reads and still have performance left over to serve computations.
The A72 in the Raspberry Pi 4 is really the pinnacle of low-power CPU performance if you count cost and compatibility (Linux <-> GPU). Around 3 of them can saturate symmetric 1 Gb/s with advanced computation and still have cycles left over.
You can buy neither; we'll see if the 4 ever comes back. The only thing I'm 100% sure of is that the 5 will have some drawback compared to both the 2 and the 4.
I can concur on the A55 being a peak; the 5 W TDP, 22 nm RK3566 in my RG353M is mind-boggling! HL1 at 60 FPS and soon HL2 at 30 FPS.
But RISC-V is not progressing with Linux; GPU drivers and integration are the main problem. Historically the Chinese boards never got any attention, and unfortunately you cannot depend on attention happening this time either.
Partly because companies across the board(ers) are now hiding kernel configs again, and "board support packages" are slow to be mainlined, if they ever are.
It suggests its applications are fundamentally not power- or performance-sensitive, and care only about cost. It's the lagging edge of microarchitecture, where any improvement whatsoever is uninteresting because it would cost more than zero to develop.
> "The Performance P670 and P470 RISC-V cores are designed to take on ARM’s Cortex A53 and A55 cores, Drew Barbier, senior director of product management at SiFive tells eeNews Europe."
A compare-and-contrast article would make for good reading.
What I see there is that SiFive is, at the very least, five years behind the mid range of ARM CPUs, not only in performance, but also regarding the toolchain.
At a similar cost, what's the real advantage of migrating the current catalog of wearables or IoT products to RISC-V? There's a proven and tested platform, widely used in the industry, and the alternative is still trying to catch up.
It's easier to understand if you look at an analysis of a much older, less dense die. Ken Shirriff is a superb source of these.
There are some visual clues. First, the chip pins are labeled in the spec so you can guess they’ll be close to relevant units, and also try to trace their connections throughout the die.
Second, units like memory have an obvious regular structure because they are made from many identical micro-units.
Third, if you see, for example, 16 identical adjacent units, you can guess it's something that deals with data 16 bits at a time. That narrows it down.
There are numerous clues like those.
You could also use tricks like a thermal camera: what part gets hot when you perform certain operations?
One of my professional regrets: when I worked at Intel in 1997, I had 30x42 plots of chip dies hanging on the wall in our lab... I wish I had taken some of them and framed them; they were beautiful.
+1 on Ken Shirriff's blog, and his work with @curiousmarc on YouTube. Those gentlemen are national treasures whose work on restoring, documenting, and appreciating vintage computing and Apollo-era technology has been second to none in its breadth and depth.
I also personally really value their work—for anyone with intermediate to advanced knowledge of electronics engineering and computers, they are an invaluable source of educational entertainment as traditional mainstream media simply doesn’t cater to such niche audiences.
SoCs that have only A53 cores are terribly slow. I recently played with a Motorola Moto G22 phone with a MediaTek Helio G37. The phone has a nice design, 5 cameras, and enough RAM and storage (4 GB/64 GB), but the UI is laggy and slow; installing and launching apps and rendering web pages takes a lot of time.
This core is an ideal replacement in low-power platforms like SMB routers that used to run on MIPS (MT7621). I think Qualcomm and MediaTek are using these extensively in router SoCs that were previously MIPS-based. These cores are probably less 'application' cores and more low-power helper cores; for example, QC pairs them with network accelerators. Anything gigabit and beyond still requires bigger ARM cores or Intel, but below gigabit these are not bad.
Not trying to justify phone manufacturers not putting in the effort to optimize their software, but one way around the slow UI is to go into Developer Settings and set the UI animation scales to 0x (e.g. `adb shell settings put global window_animation_scale 0`). It's a setting I always enabled on my Android phones back when I used them.
It's sad that Android doesn't automatically fall back to a simpler GUI which takes less time to render. Even Windows XP (2000, 98?) got this right (with manual settings).
Even an A53 is a supercomputer when it comes to graphics, compared to CPUs of yore.
A serious improvement on these budget phones is turning animations off completely: it removes most of the stuttering, and GPU usage doesn't spike just from pulling up the keyboard.
> Android doesn't automatically fall back to a simpler GUI
How much simpler can it be, given that everything seems to already be flat and borderless? As your last sentence alludes to, Windows and other desktop OSs worked perfectly fine with far more complex UIs (including windows) on far less powerful hardware. Mobile UIs seem to be quite primitive in comparison.
In other words, this is entirely a software problem.
A visually simpler GUI (such as Luna vs. "classic" on Windows XP) isn't necessarily less resource-intensive. Implementing an actual different, less resource-intensive rendering path could help, but would double the development effort.
Nope... I'm using a tiny SBC, a Radxa Zero with an A53 CPU, running Manjaro Linux, as an ultra-low-power daily driver, and it is perfectly usable for light browsing, programming, or productivity.
It boots Linux in 7 seconds, and the Xfce desktop is pretty snappy.
The kernel is 6.1 and RAM is only 4 GB.
It opens Lazarus almost instantly, and FPC compiles ARM binaries very fast. Amazing little machine...
Agreed. I'm perfectly happy to have a couple of A53s in my phone for background tasks. Four feels a bit overkill but okay, maybe it makes the big.LITTLE design work better.
But I've always been disappointed by devices that are all A53s.
And when I see devices that have eight A53s and nothing else, I have to assume that they are just trying to trick people into thinking it's a more powerful device than it actually is.
>I have to assume that they are just trying to trick people into thinking it's a more powerful device than it actually is
Why would you think that people who actually look up and care about the hardware are unable to read the first sentence on Wikipedia and understand what it is? Do you really believe that customers of $100 budget phones are tricked into expecting powerful performance?
I would not call the Redmi Note 5 (SD626) I've been using for the last 4 years "terribly slow"; instead I call it "perfectly usable". This whole "laggy UI" thing people complain about is beyond me; the UI is GPU-rendered and keeps up with most of what I throw at it. I don't expect a low-power device I charge every third day to perform like a mains-connected system.
> I would not call the Redmi Note 5... "terribly slow"; instead I call it "perfectly usable".
The software you use plays a rather large role in how the hardware performs. Some people here like to live on the OEM-designed happy path, where things tend to just work. That means using Google Apps for everything, an expectation that the latest video streaming social platforms will open quickly and not stutter, and scrolling the Google Play Store or Google Maps will be a fluid experience.
Others may use simpler apps, or expect less of their phones. I'm in the latter category, and I suspect you are as well. While the BlackBerry KeyOne I use daily was panned by some six months after release in 2017 for being too slow, I instead killed off nearly everything else that would run in the background - including and specifically any Google frameworks and apps.
Some software companies have made a point of taking any hardware gains for granted. Most people have new phones with fast processors, so some companies will push devs to take shortcuts. I'm quietly indignant about that, though that rant is rather tangential to your original question about how some have such different experiences from yours.
You may not expect desktop-class performance, though others do. Display scrolling on a mobile handset is an indicator of quality that separates cheap devices from those that one might actually want to use to get work (or play) done.
They are meant to be used in a big.LITTLE configuration. So the A53 cores should be active in the low-power mode, and more powerful cores should be active in high-power mode.
I'll bet that's due to slow storage and not the CPU(s); I've done a few handsets and tablets, and write performance was a large part of whether they were laggy or not. It's quite obvious when the storage is full and the flash controller spends a lot of time doing read-modify-write ops and stalling writes.
Yep, this was also the case with my old phone. Opening apps took a while, but everything was more fluid afterwards, which clearly indicated that storage played a part in the device's slowness. Though the 1.5 GB of RAM and the quad-core Cortex-A7 still made the device pretty slow.
Don't think so - CPU and GPU are far more important for the speed and fluidity of UI than flash write speed.
Yes, if the storage is full it can kill both the performance and stability of Android, but devices with slow SoC are slow even with plenty of free space.
In regard to the A55 and the A510, can anyone explain the design goals of these? Do they refine the A53 as a "small" CPU? Or are they larger more featureful CPUs?
The main purpose of the Cortex-A55 and Cortex-A510 is to implement additional instructions over those of the Cortex-A53: Armv8.2-A and Armv9.0-A, respectively.
This is necessary to make them ISA-compatible with the big cores and medium-size cores with which they are intended to be paired.
Besides the main goal of implementing improved ISAs, they take advantage of the fact that the cost of transistors has fallen a lot since the Cortex-A53's time. They add various microarchitectural enhancements that deliver decently higher performance at the same clock frequency, while keeping the area and power-consumption ratios between the small cores (like the A510) and the medium-size cores (like the A710) similar to what they have been since the first big.LITTLE ARM pairings (Cortex-A15 with Cortex-A7).
ARM has always avoided publishing precise numbers for the design goals of the little cores, but it seems they are usually designed to use about 25% of the area of the medium-size cores and to have a power consumption of around 0.5 W per core.
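For a concrete sense of what separates these little-core generations at the ISA level, here's a minimal aarch64 Linux sketch (my own example, not from the comments above) that checks the relevant features at runtime:

```c
/* Check for ISA features that distinguish the little-core generations:
   LSE atomics (Armv8.1+, so present on the A55 but not the A53) and
   SVE (Armv9 cores like the A510 pair it with SVE2). aarch64-only. */
#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>   /* HWCAP_ATOMICS, HWCAP_SVE on aarch64 */

int main(void) {
    unsigned long caps = getauxval(AT_HWCAP);
    printf("LSE atomics: %s\n", (caps & HWCAP_ATOMICS) ? "yes" : "no");
    printf("SVE:         %s\n", (caps & HWCAP_SVE) ? "yes" : "no");
    return 0;
}
```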
In my impression, each later one is supposed to be a successor to the previous, within about the same chip area. The A55 is basically a refined A53 with support for DynamIQ. The point of the A510 was support for Armv9 with SVE2, but it is also wider, because people expect faster processors. To amortize the cost of the larger back end, it dropped 32-bit support, and there's an option to have a cluster of two share the same FP/SIMD/SVE unit and L2 cache.
The A53 is a fantastic workhorse for all kinds of embedded workloads. I was worried bloat would creep into later models while the tried-and-true A53 moved toward obsolescence. From what you're saying, it seems like they are trying not to get carried away with it.

News: <https://www.anandtech.com/show/18871/arm-unveils-armv92-mobi...>
Discussion: <https://news.ycombinator.com/item?id=36109916>
"Efficient cores for low power tasks and performance cores for demanding applications" is a catchphrase I've seen hundreds of times but I've never once seen someone actually demonstrate it or test it, or even really explain how my phone decides which is which. Does WhatsApp run on an efficiency core most of the time but swap to a performance core when it's converting a video to send?
https://eclecticlight.co/ has multiple articles characterising the M1 (and M2) and how the macOS scheduler uses them.
I'm sure Android's scheduler does things differently, but it's at least an idea of the sort of things which can happen.
For Macs (and I assume iOS), the basics are that background processes get scheduled exclusively on E cores, and higher-priority processes get scheduled preferentially on P cores but may be scheduled on E cores if the P cores are at full occupancy.
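As a minimal sketch of what that looks like from the API side (using Apple's GCD C API; note that E-core placement is the scheduler's choice, not an explicit guarantee of the API):

```c
/* On macOS, work submitted at background QoS is what the scheduler
   steers toward the E cores; higher QoS classes prefer the P cores.
   Compile with clang on macOS (blocks are enabled by default). */
#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void) {
    dispatch_queue_t bg = dispatch_get_global_queue(QOS_CLASS_BACKGROUND, 0);
    dispatch_async(bg, ^{
        puts("likely running on an efficiency core");
    });
    dispatch_main(); /* parks the main thread; this sketch never exits */
}
```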
https://www.kernel.org/doc/html/latest/scheduler/sched-capac...

Android does set niceness for processes. See the THREAD_PRIORITY_* constants:

https://developer.android.com/reference/android/os/Process

And it uses cgroups too. Process.THREAD_GROUP_* has some Android ones, but different vendors sometimes write their own to try to be clever and increase performance.

https://source.android.com/docs/core/perf/cgroups

It's also worth bearing in mind that there's been a lot of work put into that scheduler over the years, so it will make better decisions about what to run where when the cores aren't all the same.

From the Android thread-scheduling docs [1]: "Generally, when the game is in the foreground, persistent threads such as the game thread and render thread should run on the high-performance large cores, whereas other process and worker threads may be scheduled on smaller cores."

There's also a Wikipedia article [2] which talks a little about scheduling. I imagine Android probably has more specific context it can use as hints to its scheduler about where a thread should be run.

[1] https://developer.android.com/agi/sys-trace/threads-scheduli...
[2] https://en.wikipedia.org/wiki/ARM_big.LITTLE#Scheduling
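To show what "run the render thread on the big cores" can look like in practice, here's a minimal Linux sketch; which CPU indices belong to the big cores is SoC-specific, so 6 and 7 below are purely an assumption for a typical 4+4 big.LITTLE layout:

```c
/* Pin the calling thread (e.g. a game's render thread) to the big
   cores. CPUs 6 and 7 are an assumed layout, not a universal fact;
   real code would read the topology from sysfs first. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

int pin_to_big_cores(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(6, &set);
    CPU_SET(7, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```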
It is very popular only because almost all companies that are neither Chinese nor smartphone-oriented have failed to introduce any products with less obsolete ARM cores, for many, many years.
NXP has only recently begun to introduce products with the Cortex-A55, but these should always be preferred for any new designs over the legacy Cortex-A53 products, because the Armv8.2-A ISA implemented by the Cortex-A55 corrects some serious omissions in the ISA the Cortex-A53 implements, e.g. the lack of atomic read-modify-write memory accesses.
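To illustrate the atomics point (a sketch; the codegen described reflects my understanding of the compilers, not anything from the comment above), the same C11 code compiles very differently for the two cores:

```c
/* A plain C11 atomic increment. Targeting Armv8.0-A (Cortex-A53) the
   compiler must emit an LDXR/STXR retry loop; targeting Armv8.1-A or
   later (e.g. -march=armv8.2-a for the Cortex-A55) it can emit a
   single LDADD instruction from the LSE extension. */
#include <stdatomic.h>

void bump(atomic_int *counter) {
    atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}
```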
The people who still choose Cortex-A53 for any new projects are typically clueless about the software implications of their choice.
Unfortunately, there are only 3 companies that offer CPUs for automotive and embedded applications with non-obsolete ARM cores: NVIDIA, Qualcomm and MediaTek. All 3 demand an arm and a leg for their CPUs, so whenever the performance of a Cortex-A55 is not enough it is much cheaper to use Intel Atom CPUs than to use more recent ARM cores, e.g. Cortex-A78.
This is a little harsh; upgrade cycles in these fields are very long. I'm working on a project right now where we are upgrading from the i.MX6 (quad A9) to the i.MX8 (quad A53).