Let's get Hibernation working

For me, the last rough edge on the Reform is the lack of hibernation. With some people able to suspend and others not, hibernation seems like natural low-hanging fruit that would help everyone get the most out of their Reform, especially when away from mains.

Should we do a bounty? I would be happy to contribute to something like that. Of course if someone is more familiar with this and can point me in the right direction, I’m not above trying to figure this out myself.

If you set something up I’d be happy to contribute.

Honestly, I find hibernate way more useful than sleep/suspend so I put some effort into getting mine to hibernate last night.

As a test, I added an extra (unencrypted) swap partition on /dev/mmcblk1p1, just to get encryption and LVM out of the picture, and set that up as the resume target in the kernel cmdline.

One thing we have to fix is that systemd flat out refuses to even try. If you run systemctl hibernate, it says: Failed to hibernate system via logind: Sleep verb "hibernate" not supported

So I tried to do it via sysfs like it says in the Linux kernel docs: Debugging hibernation and suspend — The Linux Kernel documentation

Doing echo disk | sudo tee /sys/power/state leaves the machine unresponsive, powered on with a black screen. Maybe there’s something useful on the serial console, but I haven’t opened it up to hook one up and check.

Trying the various pm_test modes, it gets past freezer and devices, but platform, processors, or core will leave it with a blank screen needing a reset.
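
For anyone who wants to repeat the pm_test runs, here is a minimal sketch. The level names and their order come from the kernel's hibernation debugging doc; the function names and the SYSFS_POWER override are made up for illustration, and the override only exists so the helper can be dry-run against a scratch directory instead of the real /sys/power.

```shell
#!/bin/sh
# pm_test levels from shallowest to deepest, as listed in the kernel docs.
pm_test_levels() {
    echo "freezer devices platform processors core"
}

# Run one hibernation test cycle at the given level (needs root on real sysfs).
# SYSFS_POWER is a hypothetical override used for dry runs; it defaults to /sys/power.
run_pm_test() {
    level="$1"
    power_dir="${SYSFS_POWER:-/sys/power}"
    # Select how far the kernel should go before waiting and backing out.
    echo "$level" > "$power_dir/pm_test" || return 1
    # Start the hibernation sequence; with pm_test set it stops at that level.
    echo disk > "$power_dir/state"
}
```

On a real machine, something like `for l in $(pm_test_levels); do run_pm_test "$l"; done` as root walks the levels in order. Expect to need a reset at whichever level hangs, and remember to echo none into pm_test afterwards.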

I wonder how much we’re in uncharted waters with this. The vast majority of info about hibernation on Linux I’ve found is concerning PC things like ACPI/BIOS/UEFI issues, “Secure” Boot, grub, etc. which don’t apply to the Reform.

My understanding was that hibernation is more or less hardware agnostic, in that the kernel is the one that has to react to the hibernation status and not the BIOS. As long as the kernel can read a “flag” saying that it should boot using the hibernation swap space, the underlying hardware shouldn’t matter. At least as far as I understand things.

It is and it isn’t agnostic. The hardware doesn’t need to explicitly support it, but there are things that have to work with the hardware for it to be successful.

I finally got a serial console on my Reform during a hibernate test, and this is where it gets stuck. I wonder if that stuff about the eDP bridge is why it comes back to a black screen and stays stuck?

echo processors > /sys/power/pm_test
echo disk > /sys/power/state

[  129.899475] PM: hibernation: hibernation entry
[  130.013260] (NULL device *): firmware: direct-loading firmware regulatory.db
[  130.013260] (NULL device *): firmware: direct-loading firmware regulatory.db.p7s
[  130.039324] Filesystems sync: 0.009 seconds
[  130.043720] Freezing user space processes
[  130.050399] Freezing user space processes completed (elapsed 0.002 seconds)
[  130.057505] OOM killer disabled.
[  130.060905] PM: hibernation: Preallocating image memory
[  132.593501] PM: hibernation: Allocated 243710 pages for snapshot
[  132.599635] PM: hibernation: Allocated 974840 kbytes in 2.52 seconds (386.84 MB/s)
[  132.607300] Freezing remaining freezable tasks
[  132.613261] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  132.621626] wlp1s0: deauthenticating from XX:XX:XX:XX:XX:XX by local choice (Reason: 3=DEAUTH_LEAVING)
[  132.658961] DEBUG: ti_sn65dsi86_suspend skipped.
[  132.715519] DEBUG: ti_sn_bridge_atomic_disable skipped.
[  132.720790] DEBUG: ti_sn_bridge_atomic_post_disable skipped.
[  132.729329] Disabling non-boot CPUs ...
[  132.734969] psci: CPU1 killed (polled 0 ms)
[  132.742201] psci: CPU2 killed (polled 0 ms)
[  132.747573] psci: CPU3 killed (polled 0 ms)
[  132.752401] PM: hibernation: debug: Waiting for 5 seconds.
[  137.776024] Enabling non-boot CPUs ...
[  137.780450] Detected VIPT I-cache on CPU1
[  137.780488] GICv3: CPU1: found redistributor 1 region 0:0x00000000388a0000
[  137.780543] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[  137.781087] CPU1 is up
[  137.801480] Detected VIPT I-cache on CPU2
[  137.801506] GICv3: CPU2: found redistributor 2 region 0:0x00000000388c0000
[  137.801541] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[  137.801953] CPU2 is up
[  137.822350] Detected VIPT I-cache on CPU3
[  137.822375] GICv3: CPU3: found redistributor 3 region 0:0x00000000388e0000
[  137.822410] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[  137.822865] CPU3 is up

Or… it might be hantro_vpu and the WiFi’s fault.
The reform-standby script takes care of this for suspend. It seems to fix that for hibernate as well. I’m getting past that and it blows up with swapper crashing now. Interesting…

[  591.475655] nvme 0001:01:00.0: Unable to change power state from unknown to D0, device inaccessible
[  591.870335] irq 191: nobody cared (try booting with the "irqpoll" option)
[  591.877206] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        WC O       6.5.0-2-reform2-arm64 #1  Debian 6.5.6-1+reform20231010T093700Z1
[  591.889492] Hardware name: MNT Reform 2 HDMI (DT)
[  591.894242] Call trace:
[  591.896714]  dump_backtrace+0x9c/0x128
[  591.900509]  show_stack+0x20/0x38
[  591.903860]  dump_stack_lvl+0x48/0x60
[  591.907563]  dump_stack+0x18/0x28
[  591.910913]  __report_bad_irq+0x40/0x130
[  591.914881]  note_interrupt+0x318/0x370
[  591.918759]  handle_irq_event+0xe0/0x100
[  591.922725]  handle_fasteoi_irq+0xb8/0x220
[  591.926863]  generic_handle_domain_irq+0x34/0x58
[  591.931529]  gic_handle_irq+0x58/0x134
[  591.935317]  call_on_irq_stack+0x24/0x58
[  591.939281]  do_interrupt_handler+0x88/0x98
[  591.943509]  el1_interrupt+0x34/0x58
[  591.947124]  el1h_64_irq_handler+0x18/0x28
[  591.951264]  el1h_64_irq+0x64/0x68
[  591.954701]  default_idle_call+0x54/0x100
[  591.958754]  do_idle+0x214/0x278
[  591.962019]  cpu_startup_entry+0x3c/0x50
[  591.965985]  rest_init+0xd0/0xd8
[  591.969249]  arch_call_rest_init+0x18/0x20
[  591.973390]  start_kernel+0x558/0x6d0
[  591.977091]  __primary_switched+0xbc/0xd0
[  591.981143] handlers:
[  591.983439] [<00000000d61e6bc4>] pcie_pme_irq
[  591.987846] [<00000000bac4dcd5>] nvme_irq [nvme]
[  591.992522] Disabling IRQ #191

I really appreciate your efforts here, and I feel like you are on the right path for sure!

Yeah, too bad systemd is so inscrutable. It doesn’t tell you why when it tells you no.

root@reform:~# systemctl hibernate
Call to Hibernate failed: Not enough swap space for hibernation
root@reform:~# free -m
               total        used        free      shared  buff/cache   available
Mem:            3930         620        3056           0         402        3310
Swap:          29662           0       29662

Apparently 29662 < 3930 in systemd land. 🤷

Thank you systemd… it will do this if your swap device major:minor does not match what is in /sys/power/resume.

Worth noting, that gets set automatically if you have resume= in the kernel cmdline.
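
A quick way to check for that mismatch before blaming swap size, as a rough sketch. The function names are made up for illustration; lsblk's MAJ:MIN column and /sys/power/resume are real, and the optional second argument only exists so the comparison can be tried against a scratch file.

```shell
#!/bin/sh
# Print the major:minor of a block device, e.g. "179:1".
dev_majmin() {
    lsblk -dno MAJ:MIN "$1" | tr -d ' '
}

# Succeed if the given major:minor equals what the kernel has recorded as
# its resume device. $2 is an optional path override (hypothetical, for testing).
resume_matches() {
    want="$1"
    resume_file="${2:-/sys/power/resume}"
    have="$(tr -d ' \n' < "$resume_file")"
    [ "$want" = "$have" ]
}
```

For example, `resume_matches "$(dev_majmin /dev/mmcblk1p1)" && echo "resume target OK"`. If this fails, the "Not enough swap space" message is likely the mismatch and not an actual size problem.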

Where I am so far:

  1. Add a swap partition to the SD card and configure it in /etc/fstab
  2. Teach systemd that it is OK to try and hibernate by adding a file in /usr/lib/systemd/sleep.conf.d/
  3. Tell the kernel where to find the hibernate image by adding a file to /usr/share/flash-kernel/ubootenv.d/ to append resume= to bootargs and running flash-kernel
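
Roughly what those three steps could look like, as a sketch and not the exact files from my machine. The file names (hibernate.conf, resume) and the function name are made up; AllowHibernation=yes is a standard sleep.conf option, and the ubootenv.d line assumes flash-kernel includes those snippets as U-Boot script in the boot script. The script stages everything under a scratch directory so you can review it before copying anything to / and running flash-kernel.

```shell
#!/bin/sh
# Stage the hibernation config under a scratch root for review.
# $1: staging directory, $2: swap partition to resume from.
stage_hibernate_config() {
    stage="$1"
    resume_dev="$2"

    mkdir -p "$stage/etc" \
             "$stage/usr/lib/systemd/sleep.conf.d" \
             "$stage/usr/share/flash-kernel/ubootenv.d" || return 1

    # Step 1: fstab entry for the swap partition (append this line to /etc/fstab).
    printf '%s none swap sw 0 0\n' "$resume_dev" > "$stage/etc/fstab.hibernate-snippet"

    # Step 2: tell systemd that hibernation is allowed.
    # File name is hypothetical; the option name is from sleep.conf(5).
    printf '[Sleep]\nAllowHibernation=yes\n' \
        > "$stage/usr/lib/systemd/sleep.conf.d/hibernate.conf"

    # Step 3: append resume= to the kernel command line via the boot script.
    # Assumes ubootenv.d snippets are plain U-Boot commands; file name is hypothetical.
    printf 'setenv bootargs "${bootargs} resume=%s"\n' "$resume_dev" \
        > "$stage/usr/share/flash-kernel/ubootenv.d/resume"
}
```

Run it as `stage_hibernate_config /tmp/hib /dev/mmcblk1p1`, look over the files, copy them into place as root, then run flash-kernel so the boot script picks up the new bootargs.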

At this point, systemctl hibernate will work and it will write memory out to the swap partition and the machine will sort of halfway turn off. If you power it down and reboot, it will restore from the hibernate, load something like what you were doing before it hibernated, then hang while trying to initialize something.
The nvme and qoriq_thermal will both be very upset. rmmod qoriq_thermal before hibernation makes the thermal error spam go away, but it still hangs.

Hmmm, I wonder what the problem is. Any ideas, or any thoughts on a possible path forward? We seem close.

Close to what, I’m not sure.
Have you (or anyone else) tried it yet?

I have not tried it yet. I would like to, but I need time to digest it all and get it set up.

I think hibernation would be awesome for the Reform even if it took a long time to resume from it. I just like being able to shut down and save power, then, when ready, pick things up exactly how I had them set up.