cadence’s website.


Linux kernel 6.9 didn't boot until I turned on SVM

The problem

Fedora Kinoite (immutable, based on OSTree) didn't boot after an upgrade. I had to pick the previous OS from Grub to boot again.

Instead of booting straight through, it would show the Grub menu, and after I selected an entry I would get a black screen. Then my keyboard would turn off. The only thing I could do was press the power button briefly to turn off the computer (no need to hold down the button, which is unusual).

The solution

I enabled SVM Mode in the BIOS and it could boot.

I'm putting the solution ahead of the story in case you got here from an internet search. Try that. If it works, you're welcome!

I don't know what's up with this one. This was very annoying. My BIOS has like 20 submenus with 100 settings and they're all acronyms that I don't understand, or they're stuff like CPU microtimings and prefetching and voltage that will probably mess up my computer forever if I touch them. But SVM Mode was the one that worked.
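If an older deployment still boots, it's possible to check from Linux whether virtualisation is usable before digging through the BIOS menus. The following is a minimal sketch and was not part of the original debugging. `virt_status` is a made-up helper name, and it assumes an AMD CPU (the `svm` flag) and that `/dev/kvm` only shows up when the KVM module loaded properly. Depending on the kernel version, the `svm` flag may or may not still be listed when the firmware has the option turned off, so a missing `/dev/kvm` is probably the more telling sign.

```shell
#!/bin/sh
# Hypothetical helper: report AMD-V (SVM) support and KVM availability.
# Both paths are parameters so the function can be pointed at test files.
virt_status() {
    cpuinfo=${1:-/proc/cpuinfo}
    kvm_dev=${2:-/dev/kvm}

    # "svm" is the AMD virtualisation CPU flag (Intel calls theirs "vmx").
    if grep -qw svm "$cpuinfo" 2>/dev/null; then
        echo "cpu: svm flag present"
    else
        echo "cpu: svm flag absent"
    fi

    # /dev/kvm normally exists only when the KVM module loaded successfully.
    if [ -e "$kvm_dev" ]; then
        echo "kvm: device present"
    else
        echo "kvm: device missing"
    fi
}
```

Running `virt_status` with no arguments checks the real system.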


The story

I hadn't upgraded in like a month, so there were hundreds of package changes between the Fedora versions. I "manually bisected" by installing different Fedora versions and seeing if they booted or not. The version numbers are based on year-month-day, so I installed different days' versions. I did this using:

sudo rpm-ostree deploy 40.20240617.0

Eventually I found out that 40.20240616.0 booted and 40.20240617.0 did not boot, so it must have been due to package changes in the June 17th version.
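The manual bisecting could be made a little less tedious by letting the shell pick the next date to try. This is only a sketch under a few assumptions: `next_candidate` is a made-up helper, it needs GNU `date`, and it assumes every day has a build named `40.YYYYMMDD.0`, which may not hold for every date.

```shell
#!/bin/sh
# Hypothetical helper: pick the date halfway between a known-good and a
# known-bad date and print it as a 40.YYYYMMDD.0 version string.
# Requires GNU date (for -d). Dates are given as YYYY-MM-DD.
next_candidate() {
    # Convert both dates to whole days since the Unix epoch (UTC).
    good_day=$(( $(date -u -d "$1" +%s) / 86400 ))
    bad_day=$(( $(date -u -d "$2" +%s) / 86400 ))

    # Integer midpoint, rounded towards the good date.
    mid_day=$(( good_day + (bad_day - good_day) / 2 ))

    printf '40.%s.0\n' "$(date -u -d "@$(( mid_day * 86400 ))" +%Y%m%d)"
}
```

The result can be passed to `sudo rpm-ostree deploy`. If that version boots it becomes the new good date, otherwise the new bad date. Once the function prints the good date itself, the two dates are adjacent and the search is over.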

I was doing this on June 24th, which is one week after June 17th. Since the June 24th version still did not boot, I figured that whatever the problem was, it was likely to be niche or only affect specific systems. If this problem was making everybody's computers unbootable, it probably would have been resolved or reverted upstream already. A more niche problem is less likely to have been solved.

I checked the package changes between 40.20240616.0 and 40.20240617.0. The only package changed was the Linux kernel, which was upgraded from 6.8.11 to 6.9.4.

Red herrings

I'm probably not the only person with this problem, so I started searching the web to figure out if other people had problems with the June 17th version or with kernel 6.9.4. I found at least 4 problems reported with this version.

Atomic Desktops with Secure Boot not booting

Since the 39.20240617.0 and 40.20240617.0 updates for Atomic Desktops and the 40.20240617.0 update for IoT, systems with Secure Boot enabled may fail to boot if they have been installed before Fedora Linux 40.

Oh, that sounds like my problem! I'm using an Atomic Desktop that I installed before Fedora 40, and that's the exact version that doesn't boot for me!

I followed the instructions in some Fedora Discussion threads. These threads have since been summarised into this article on Fedora Magazine. The solution involves manually copying some files into the EFI partition. I dutifully did this and rebooted. However, there was no change in behaviour. I still saw the black screen like before.

I figured I was just bad at following instructions and had done it wrong. The other workaround is to turn off Secure Boot, so I went into the BIOS and found the option. It was already off. I guess this problem didn't apply to me in the first place. Oops.

Encrypted filesystem SELinux policy

Encrypted file system not mounted on boot? cryptsetup issues? Known issue in SELinux-policy

Oh, that sounds like my problem! My root file system is encrypted, so if there was a new issue where it wasn't getting decrypted on boot, my computer wouldn't be able to do anything at all. The theory here is that initramfs, loaded from unencrypted /boot, normally displays the decryption password prompt. A new Linux kernel version would have a new initramfs, and maybe the new one is broken and not displaying the password prompt. Sure, seems plausible.

This points to a Red Hat Bugzilla entry with a bunch of logs and suggested commands that I don't understand. As it seemed to be related to SELinux, I tried setting SELinux to permissive (i.e. not enforcing the potentially broken security policies) but that didn't change the behaviour at all.

Nvidia + Fedora 40 + Kernel 6.9 + LUKS

There's a Reddit post talking about how their computer didn't boot after upgrading kernel 6.8.11 to 6.9.4, and how this wasn't related to the SELinux issue. Oh, that sounds like my problem! I have all of those things! Except... that I don't use Nvidia. So maybe not my problem. Well, no harm in trying anyway.

I tried various video kernel parameters including removing quiet rhgb, adding nomodeset, adding rd.debug, but none of these changed the behaviour at all. It was still a black screen after picking the entry in Grub.
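To make that edit concrete, here is a small sketch of what it does to a kernel command line. `debug_cmdline` is a made-up helper and `root=UUID=abc ro` is a placeholder, not my real command line. It just drops the two flags that hide boot messages and appends the debugging ones.

```shell
#!/bin/sh
# Hypothetical helper: remove "quiet" and "rhgb" from a kernel command
# line (passed as one string) and append "nomodeset rd.debug".
debug_cmdline() {
    out=""
    for arg in $1; do
        case "$arg" in
            quiet|rhgb) ;;              # drop the flags that hide boot output
            *) out="$out $arg" ;;       # keep everything else in order
        esac
    done
    # ${out# } strips the single leading space added by the loop.
    printf '%s nomodeset rd.debug\n' "${out# }"
}

debug_cmdline 'root=UUID=abc ro quiet rhgb'
```

The printed line is what the parameters would look like after editing them by hand in the Grub entry.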

Kernel 6.9 amdgpu crash with multiple monitors

Oh, that sounds like my problem! I'm using amdgpu and kernel 6.9. The thread doesn't seem to have a definitive solution though, so even if this was my problem, I can't do anything about it...

Other changes I made that didn't help

  • Changing AMD GPUs from RX 570 to RX 7600 did not change anything
  • Changing FreeSync settings on my monitor did not change anything
  • Upgrading BIOS did not change anything

Installing kernel 6.8.11 over a newer Fedora release

I went to the Koji build server and downloaded the RPM package files for kernel, kernel-core, kernel-modules, kernel-modules-core, and kernel-modules-extra 6.8.11. I then updated to the latest Fedora Kinoite release and installed the older kernel over that before rebooting. I used these commands:

sudo rpm-ostree rebase fedora:fedora/40/x86_64/kinoite
sudo rpm-ostree update
sudo rpm-ostree override replace kernel*6.8.11-300.fc40.x86_64.rpm

Then I rebooted, and this booted up fine. OK, so it's DEFINITELY a kernel change that caused this problem, it wasn't another component of Fedora.
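The glob in the override command only does the right thing if all five RPM files are sitting in the current directory. As a sketch (both function names are made up for illustration), the expected file names can be listed and checked before running the override:

```shell
#!/bin/sh
# Hypothetical helper: print the five RPM file names expected for a
# given version-release.arch string, e.g. 6.8.11-300.fc40.x86_64.
kernel_rpms() {
    for pkg in kernel kernel-core kernel-modules kernel-modules-core kernel-modules-extra; do
        printf '%s-%s.rpm\n' "$pkg" "$1"
    done
}

# Hypothetical helper: print any expected file that is missing from
# directory $2 (defaults to the current directory).
missing_rpms() {
    kernel_rpms "$1" | while read -r f; do
        [ -e "${2:-.}/$f" ] || echo "$f"
    done
}
```

If `missing_rpms 6.8.11-300.fc40.x86_64` prints nothing, all five files are present and the glob should match them.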

Distro at fault?

Or was it Fedora? After all, when I search online for just kernel 6.9 not booting, most of the links point back to Fedora in some way. There's only one true way to test this. I could boot a different distro that has kernel 6.9 and see what happens. Ideally I don't want to install that distro on my computer, so I'd rather just boot a live USB. Since I can't do package upgrades on a live USB, I'll need to find a recent live USB image that already has kernel 6.9 built in. I checked DistroWatch and found that EndeavourOS got a new release on June 30th and includes kernel 6.9.6 - sounds great.

I created a live USB using Fedora Media Writer and booted from it. Black screen. OK, this is TOTALLY the kernel's fault, not the distro's.

But what can I do about it? While rpm-ostree rollback or the kernel 6.8.11 manual override keeps my computer usable, those probably aren't sustainable solutions.

Also, isn't it weird

Another computer

As mentioned before I had already happened to try a BIOS update and switching GPU, which didn't help. But clearly the new kernel is working for other people's computers. So I guess it's hardware specific?

Indeed, I plugged the USB into another computer and it booted just fine there.

My own computer

OK, so, what have we learned? What can I even try next? It's definitely an interaction between my computer hardware and changes in kernel 6.9. Well if it doesn't work on my computer, does it work in a virtual machine on my computer?

I download GNOME Boxes and select the same EndeavourOS ISO that I had written to the live USB. Boxes says that KVM isn't enabled. I think KVM is a BIOS setting relating to virtualisation? After looking up what that option is called in my BIOS, I find that it's called SVM Mode, tucked away in the Advanced CPU Frequency submenu. I turn on the option.

I don't know why, but before going back into my regular OS to try setting up the virtual machine again, I decide to try the live USB. Just in case, I guess. I plug it in. It boots.

What.

I go back to my main Fedora installation. I update and remove my kernel override.

sudo rpm-ostree update
sudo rpm-ostree override reset -a

I reboot. Selecting the new OSTree option in Grub, I hold my breath.

It boots.

» uname -a
Linux glimmer 6.9.8-200.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Jul  5 16:20:11 UTC 2024 x86_64 GNU/Linux

What? SVM is the virtualisation option. Did that really make the difference? I go back to BIOS and turn off SVM Mode. The new OSTree entry does not boot. I turn on SVM Mode. It boots again.

Okay... I have no idea why it's doing this. So when I tried the live USB on that other computer and it worked fine, I guess the other computer did have SVM Mode enabled already?

I go back to the other computer and dig through the ASUS-brand BIOS. SVM Mode: Disabled.

Then how did it...

You know what, I don't care. I will never know. I'm happy kernel 6.9 works on my computer. Please just never do that again.

— Cadence
