> Through some digging, I found that when a desktop enters S3 sleep, the system cuts power to PCIe GPUs
I am not sure how correct this assumption is. S3 is supposed to cut power to everything but RAM, but for example Gigabyte Aorus motherboards are notorious for an NVMe SSD sleep bug that randomly prevents the system from properly sleeping or waking.
This is fixed by adding the following udev rule:
    # Generic PCIe fix for sleep bugs by preventing wakeup from any PCIe port
    ACTION=="add", SUBSYSTEM=="pci", DRIVER=="pcieport", ATTR{power/wakeup}="disabled"
or more targeted:
    # Gigabyte sleep fix by preventing wakeup from the problematic PCIe port; the IDs depend on the motherboard model
    ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x8086", ATTR{device}=="0x43bc", ATTR{power/wakeup}="disabled"
You can find any glitched PCIe wakeup device with:
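1. `cat /proc/acpi/wakeup` (you'll have to trial-and-error your way through the wakeup devices if it isn't immediately clear)
2. `cat /sys/class/pci_bus/*/*/yourWakeupDevicePci/uevent | grep PCI_ID`
3. prepend "0x" to the vendor and device halves of the ID (sysfs reports these attributes in lowercase hex, so PCI_ID=8086:43BC becomes 0x8086 and 0x43bc)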
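Steps 2 and 3 can be scripted. This is only a sketch: `pci_id_to_udev_match` is a made-up helper name, and it assumes the uevent file contains a `PCI_ID=VVVV:DDDD` line as described above.

```shell
# pci_id_to_udev_match UEVENT_FILE
# Prints udev match keys built from the PCI_ID line of a sysfs uevent file.
# Hypothetical helper for illustration, not part of udev.
pci_id_to_udev_match() {
    local uevent_file="$1" id vendor device
    # The uevent file contains a line such as: PCI_ID=8086:43BC
    id=$(sed -n 's/^PCI_ID=//p' "$uevent_file" | head -n 1)
    [ -n "$id" ] || return 1
    vendor=${id%%:*}
    device=${id##*:}
    # sysfs shows vendor/device in lowercase hex with a 0x prefix
    printf 'ATTR{vendor}=="0x%s", ATTR{device}=="0x%s"\n' \
        "$(printf '%s' "$vendor" | tr 'A-F' 'a-f')" \
        "$(printf '%s' "$device" | tr 'A-F' 'a-f')"
}
```

Call it with the uevent path from step 2 and paste the result into the targeted rule.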
You also have the option of:
    udevadm info --attribute-walk /dev/whatever
but for that you need to know some basic identifier of your glitchy device.
Or if you want to shell-script it (less reliable than letting udev do it for you, and it needs to run from a systemd service file or some other automation):
    # Gigabyte sleep fix, port depends on mobo model
    /bin/bash -c 'if grep RP05 /proc/acpi/wakeup | grep -q enabled; then echo RP05 > /proc/acpi/wakeup; fi'
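Writing a device name to /proc/acpi/wakeup toggles its state, which is why the one-liner checks for "enabled" first. Here is the same check split into two functions, as a sketch. The function names are made up, and the parsing assumes the usual column layout (device, S-state, status with an optional leading `*`).

```shell
# wakeup_status FILE DEVICE
# Prints "enabled" or "disabled" for DEVICE from a /proc/acpi/wakeup-style table.
wakeup_status() {
    awk -v dev="$2" '$1 == dev { sub(/^\*/, "", $3); print $3; exit }' "$1"
}

# disable_wakeup FILE DEVICE
# Writes DEVICE to FILE only when it is currently enabled, so running it
# twice does not toggle the device back on.
disable_wakeup() {
    if [ "$(wakeup_status "$1" "$2")" = "enabled" ]; then
        printf '%s\n' "$2" > "$1"
    fi
}
```

On a real system that would be `disable_wakeup /proc/acpi/wakeup RP05`, run as root.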
Yes I really hate this (and other) Linux sleep issues.
My Logitech Bolt receiver also wakes several of my Linux computers instantly. I don't know why it doesn't do that on Windows, and I haven't tried doing a USB capture (I don't know what equipment I'd need to try it out; a logic analyzer? Glasgow?). In the meantime I've added a rule to block that:
KERNELS=="0000:00:01.1" sounds like an interesting way to do it, since you can target separate functions of the PCI device (in this case: domain 0, bus 0, slot 1, function 1).
As somebody with an Aorus motherboard who has probably burned a few kWh on this issue, I was really excited to try these solutions - no luck. Thank you anyway!
Did you try the general fix? And reload udev rules?
You also have to make sure it applies after the default rules.
You can check if the rule applies once you have everything set up by doing a `udevadm` attribute walk of your SSD device (not a partition), and then following it all the way up the device tree until you see your specific device port (targeted fix) or the PCIe port driver (general fix). Then check if "power/wakeup" is set to "disabled". If it is set to disabled, something else is keeping your device awake on sleep.
For that you can check /proc/acpi/wakeup, and there's also a specific systemd invocation (that I forgot) that shows whether your device slept, how long it slept, and how much battery was drained. If your device woke up or failed to resume, it'll give you a reason, to the best of its ability.
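The same walk can be done directly in sysfs instead of reading `udevadm` output. This is a rough sketch: `walk_wakeup` is a made-up name, and `nvme0n1` in the usage line is only an example device.

```shell
# walk_wakeup START [STOP]
# Prints "<device> <state>" for START and every parent directory that has a
# power/wakeup attribute, stopping when STOP (default /sys/devices) is reached.
walk_wakeup() {
    local dir="$1" stop="${2:-/sys/devices}"
    while [ "$dir" != "$stop" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -r "$dir/power/wakeup" ]; then
            printf '%s %s\n' "${dir##*/}" "$(cat "$dir/power/wakeup")"
        fi
        dir=$(dirname "$dir")
    done
}
```

Usage would be something like `walk_wakeup "$(readlink -f /sys/block/nvme0n1/device)"`. Any line that still says "enabled" is a candidate for the udev rule.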
Wow, thanks for this tip! I've been dealing with suspend issues with an X570 Aorus Master as well.
Putting `echo GPP0 >> /proc/acpi/wakeup` into a systemd unit that runs at boot solved the issue for me... except the first sleep after a boot would always wake back up immediately.
I applied your udev rule and that issue seems to be resolved as well!
> I am not sure how correct this assumption is. S3 is supposed to cut power to everything but RAM, but for example Gigabyte Aorus motherboards are notorious for an NVMe SSD sleep bug that randomly prevents the system from properly sleeping or waking.
You would hope that you could probe the hardware to see if it really is in sleep or not, or that re-waking the hardware would not cause issue if it never went to sleep.
Also I would expect that you could send a sleep command to the PCIe device, then try to sleep the bus itself. Then to wake, you would bring back the bus and then wake the device.
Any further insight you might have on these Aorus wakeup issues? In particular, it seems the wakeup in my case is coming from `.../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:45/wakeup/wakeup6` which does not really mean anything to me.
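One way to make a path like that more readable, as a sketch: the ACPI device directory above `wakeup/wakeupN` usually has a `path` attribute (the ACPI namespace name, which is what /proc/acpi/wakeup lists) and a `physical_node` symlink to the real PCI device. The function name is made up, and whether both attributes exist on a given kernel and device is an assumption.

```shell
# describe_wakeup_source WAKEUP_DIR
# WAKEUP_DIR is a directory such as .../device:45/wakeup/wakeup6.
# Prints the ACPI path and the physical device behind it, when exposed.
describe_wakeup_source() {
    local ws="$1" acpi_dev
    # Strip the trailing /wakeup/wakeupN to get the ACPI device directory
    acpi_dev=$(dirname "$(dirname "$ws")")
    if [ -r "$acpi_dev/path" ]; then
        printf 'acpi_path=%s\n' "$(cat "$acpi_dev/path")"
    fi
    if [ -e "$acpi_dev/physical_node" ]; then
        printf 'physical_node=%s\n' "$(basename "$(readlink -f "$acpi_dev/physical_node")")"
    fi
    return 0
}
```

The last component of `acpi_path` is the name to look for in /proc/acpi/wakeup, and `physical_node` is the PCI address you would match in a udev rule.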
I had this issue and this MB for years; I eventually solved it by physically removing a crappy USB-C PCIe card I'd bought because my case didn't have any USB-C ports.
(Additionally I also previously disabled PCI wakeup buses and haven't touched it again since it's working)
Author of memreserver (one of the mentioned userspace workarounds) here. I debugged this a few years back; the only public comment I can quickly find is [1]. I also remember some mailing list discussions, but it basically came down to the issue that Linux didn't have staggered suspend hooks that reliably ran before disks and parts of the memory subsystem were frozen. Apparently this is now possible. Sadly the Freedesktop Gitlab doesn't seem indexable, so this knowledge seems to have gotten lost.
This is amazing work! If folks have ever wondered why suspend is so difficult to get working on linux and why debugging it is equally difficult, this is a single datapoint with lots of information about all the things that can go wrong. Even now I have a thinkpad P1G4 where the fans won't turn off automatically unless I turn them off before going into suspend. Recently I also started having crackling issues with my bluetooth headphones after resuming from suspend and had to disable node suspension there also (https://wiki.archlinux.org/title/PipeWire#Noticeable_audio_d...).
Remarkable that it's 2025 and laptop sleep/suspend still doesn't work right on linux. I think the first time I encountered this was probably 15 years ago now?
Power control is the kind of stuff that benefits from very tight integration, and PCs just don’t have that.
Firmware is seen by most vendors as a pure cost to minimize, so you get a fragmented market full of subcontractors delivering the bare minimum that is considered “working”. Manufacturers also know most people aren’t going to use a big part of the functions they’re supposed to provide to OSes, and no one is really checking them, so it’s very common for devices to have only partial support for things they supposedly do.
This space is problematic enough that you could reliably segfault 2017-2019 Intel MacBooks by closing the lid before unplugging HID peripherals, preventing suspend (and cooking it in your bag on the commute home).
It also plagues Windows on custom PC builds, even when there are vendor drivers. Not every component plays nicely with suspend states, ASPM, C-states, load line calibration, etc. And while often the capability exists natively to address issues (in BIOS, Linux, etc), how many people know how to start looking?
After my (closed) gaming laptop started making annoying Windows noises earlier today, I'm led to believe that it doesn't work properly on Windows either.
It seems like it's basically hardware whack-a-mole at this point. The only reason Apple does it reasonably well is they control more of the stack and they support less hardware. The only reason Windows does it better than Linux is they have more eyes on it.
It doesn't work right on Windows either to be fair. With a mixed laptop fleet at work, we've just disabled sleep/hibernate company wide because it causes way too many problems.
If you have a modern machine with S0 sleep, which is "modern standby" it's very much solved. What it does is it pauses all userspace processes, disables all cores but one and keeps it running on the lowest frequency. The system stays "on" but all devices go in power-saving state which is good enough for days.
So it's not really a problem unless you really wanna do deeper sleeps.
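To see which of the two a machine is actually using, the kernel lists the available suspend modes in /sys/power/mem_sleep and puts the selected one in brackets (`s2idle` is the S0 idle variant, `deep` is S3). A small sketch, with a made-up function name:

```shell
# active_mem_sleep [FILE]
# Prints the bracketed (currently selected) mode from a mem_sleep-style file.
# FILE defaults to /sys/power/mem_sleep.
active_mem_sleep() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "${1:-/sys/power/mem_sleep}"
}
```

If `deep` is not listed in that file at all, the firmware does not offer S3 to the kernel.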
Recent Windows laptops have even more issues. My wife literally never suspends her Windows laptop for this reason. Meanwhile my Intel/Nvidia laptop running Debian works flawlessly (albeit with Nouveau drivers, gave up on the proprietary ones for reasons unrelated to suspend).
It can be really hit-or-miss, and it can be really hard to debug errors like in the post.
A lot of workarounds that are suggested for various issues are also not really viable. Some of the workarounds involve turning off different power-saving modes; however, the point of enabling sleep is often to increase the amount of usable time between charges, and turning off these power-saving modes can often dramatically shorten battery life.
But getting sleep to work (even S0ix!) is not impossible.
I have a bunch of handheld AMD 7840U and AMD 8840U devices that I have installed Arch Linux on: GPD Win Max 2, GPD Win Mini, GPD Win 4, Minisforum V3, OneXPlayer X1 Ryzen. These devices were not designed with Linux support in mind. I would be very surprised if the companies that made them ever tested them with Linux. Yet with just a small amount of work (generally fiddling with `/proc/acpi/wakeup` and `/sys/devices/*/*/*/power/wakeup` to disable sources of spurious wakeups,) I have gotten essentially flawless S0ix support (… on all but the newest OneXPlayer X1 Ryzen.)
(In general, out-of-the-box stock Linux kernel support on these devices is fantastic. Touchscreens work, pen input works, wifi and Bluetooth work well. The only gap I've seen is fingerprint reader support.)
I suspect that given how small these manufacturers are (and how small their production batches must be,) there's much less extreme-customization and tight-integration of components. This is visibly evident in the form-factors of these devices, which are many millimeters thicker than they might otherwise be. (Of course, these devices are primarily advertised to a gaming audience who are eager to avoid the thermal-throttling that happens with ultra-thin devices like Surface Pro…) I partially suspect that the lack of extreme-customization, the lack of tight-integration, and the smaller production batches mean that the manufacturers make much more conservative choices in components. Maybe this explains the exceptional Linux support?
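For the `/sys/devices/*/*/*/power/wakeup` fiddling mentioned above, it helps to first list which devices are currently allowed to wake the system. A sketch using `find` rather than a fixed glob depth; the function name is made up.

```shell
# list_enabled_wakeups [ROOT]
# Prints every device directory under ROOT (default /sys/devices) whose
# power/wakeup attribute currently reads "enabled".
list_enabled_wakeups() {
    local root="${1:-/sys/devices}"
    find "$root" -path '*/power/wakeup' -type f 2>/dev/null | sort |
    while IFS= read -r f; do
        if [ "$(cat "$f" 2>/dev/null)" = "enabled" ]; then
            printf '%s\n' "${f%/power/wakeup}"
        fi
    done
}
```

Writing `disabled` into a listed device's `power/wakeup` file (as root) turns that source off until the next boot.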
To add: for the end user, the way to easily get working suspend is to buy known-good compatible hardware.
It's been solid on every business ThinkPad for a long, long time for me, and it consistently seems that people on Windows with the same models have more sleep problems.
My sincere personal thanks for this. My main laptop is a Ryzen-based ThinkPad running Linux that I suspend and hibernate regularly, and I sporadically encounter this issue. Looking forward to 6.14!
Why was dm->cached_state storing -12 instead of a pointer? Most likely this happened because earlier during suspend, dm_suspend() assigned dm.cached_state = drm_atomic_helper_suspend(adev_to_drm(adev)). The callee drm_atomic_helper_suspend() could return either a valid pointer, or ERR_PTR(err) which encoded errors as negative pointers. But the caller function assigned the return value directly to a pointer which gets dereferenced upon resume, instead of testing the return value for an error.
One more point for rust in the kernel. Just can't happen if you're required to handle a Result type.
But of course defaults matter and the kernel’s rich history of not modernizing coding practices is going to work against improvements in C land. Ironically, it’s that same resistance that frustrates the Rust devs so much because they're resistant to even cleaning up their own subsystems or putting down markers documenting how the subsystems are supposed to work.
Maybe https://github.com/llvm/llvm-project/issues/74205 would help once it trickles down into the kernel, but I suspect that people are still going to choose to do this manual overloading of the pointer instead of using types for safety.
Your work will help me on a Framework AMD laptop with the GPU extension and dual boot Linux/Windows. May I donate to you or to your favorite charity? My contact info is in my profile.
I used to think that naming things, cache invalidation, and off-by-one errors were the 2 biggest problems in CS, but then I learned about the "sleep/wake" problem and realized it's NP-complete.
With how much trouble I had with trying (and failing) to make my brand new Dell laptop sleep properly and not the "Modern Standby" crap, plus my desktop randomly breaking GPU hardware acceleration in browser after waking up, I would say it's around O(n^4) now. Or maybe even O(n!).
Memory management and specifically OOM conditions remain an unbelievably painful nightmare on Linux. It's not like I run into these issues constantly, but I've definitely tried to debug issues like these (unsuccessfully). Ultimately if I OOM a machine I usually wind up installing more RAM, which is wasteful/expensive, but it's pretty clear that handling OOM conditions gracefully is going to be a hard problem for Linux to solve into the future.
This is really great work and will serve as a reference point for debugging similar issues in the future. Pretty happy about systemd's debug-shell feature, I had no idea that existed. I don't think my X670E Steel Legend board has a serial header anywhere on it, though. How do modern built-in serial ports work, anyway? Are they attached off of the chipset PCIe lanes?
Something that's also very useful when trying to dive into the Linux kernel is that there's a bunch of great talks discussing Linux kernel subsystems from conferences like FOSDEM and Linux Plumber's Conference which you can usually find recordings of online. For example, there's this one for TTM, the memory subsystem that most of the desktop GPU DRM drivers use:
I’ve had good luck containing ooms with cgroups. I’m not sure if there is a state of the art for handling oom conditions beyond what Linux does. If anyone knows and can recommend some reading I would appreciate it.
- Overcommit. Linux will "overcommit" memory: allocations will succeed when there's no memory, and then hang when the page is actually mapped if no physical pages are available (to my understanding.) Windows NT doesn't do this. Not sure exactly how macOS/XNU handles it.
- The OOM killer. Because allocations don't fail, to actually recover from an OOM situation the kernel will enumerate processes and try to kill ones that are using a lot of memory, by scoring them using heuristics. The big problem? If there isn't a single process hogging the memory, this approach is likely to work very poorly. As an example, consider a highly parallel task like make -j32. An individual C++ compiler invocation is unlikely to use more than a gigabyte or two of memory, so it's more likely that things like Electron apps will get caught first. The thrashing of memory combined with the high CPU consumption of compilers that are not getting killed will grind the machine to a near-complete halt. If you are lucky, then it will finally pick a compiler to kill, and set off a chain reaction that ends your make invocation.
There are solutions... Indeed, you can use quotas with cgroups. There's tools like systemd-oomd that try to provide better userspace OOM killing using cgroups. You can disable overcommit, but some software will not function very well like this as they like to allocate a ton of pages ahead of time and potentially use them later. Overcommit fundamentally improves the ability to efficiently utilize all available memory. Ultimately I think overcommit is probably a bad idea... but it is hard to come up with a zero-compromises solution that keeps optimal memory/CPU utilization but avoids pathological OOM conditions by design.
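As an example of the cgroup quota approach, a build can be wrapped in a transient systemd scope with a hard memory cap. This is a sketch that assumes cgroup v2 and a systemd user session with the memory controller available. `run_capped` and `DRY_RUN` are made-up names; `MemoryMax=` is the systemd resource-control property.

```shell
# run_capped LIMIT COMMAND...
# Runs COMMAND in a transient scope whose memory use is capped at LIMIT.
# Set DRY_RUN=1 to print the systemd-run command line instead of running it.
run_capped() {
    local limit="$1"
    shift
    ${DRY_RUN:+echo} systemd-run --user --scope -p "MemoryMax=$limit" "$@"
}
```

With `run_capped 8G make -j32`, the compilers hit the cap inside their own cgroup before the rest of the desktop starts thrashing.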
> Memory management and specifically OOM conditions remain an unbelievably painful nightmare on Linux.
Yes. It's horrendous to put it mildly. Linux does not handle OOM conditions properly.
I know I can set up a few guardrails with cgroups. I know I can also install earlyoom. I know I can increase swap or use zram. In the end these are all fundamentally just nasty hacks that might spare one once in a while. They do not fix how these conditions are handled. Please do not offer these as solutions.
I've seen LUKS volumes mount themselves read-only because the kernel couldn't allocate memory in dm_crypt, for the love of god just kill something in userspace. The current state is utterly unacceptable and I'm tired of all the excuses.
With zstd you can turn 8GB of RAM into 20GB of 'RAM' without much issue, or 16GB into 40GB. Hell, if you're feeling adventurous (and Android does this, so it's very stable) you can overcommit your memory past 100%.
I remember there being some strange interaction with the wakeup behaviour being toggled otherwise. But this could be due to me being on NixOS.
hope that info helps.
[1] https://gitlab.freedesktop.org/drm/amd/-/issues/2125#note_17...
I have been buying System76 for about twenty years now, and this has not been an issue for me either.
Thanks
https://www.youtube.com/watch?v=MG7_tUNKSt0
Thanks for the video about TTM, I'll watch it when I have a chance.