Nvidia drivers stop loading after any sort of system update

So I have this weird bug where my Nvidia drivers stop loading after system updates. I have Secure Boot set up, and the kargs are there multiple times, as you can see in this pastebin: https://paste.centos.org/view/079f3638 To get the Nvidia GPU working, after each update I have to run rpm-ostree kargs --append=rd.driver.blacklist=nouveau --append=modprobe.blacklist=nouveau --append=nvidia-drm.modeset=1 in a TTY, otherwise the Nvidia drivers do not load.
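Since the post mentions kargs showing up multiple times, here is a minimal sketch of one way to spot repeated arguments on a kernel command line. The function name duplicate_kargs is made up for illustration and is not part of rpm-ostree or Bazzite. It splits on spaces only, so an argument containing quoted spaces would be split incorrectly.

```shell
# List kernel arguments that appear more than once on a command line.
# $1: the command line to inspect; when omitted, /proc/cmdline is read instead.
# duplicate_kargs is a hypothetical helper name, not an existing tool.
duplicate_kargs() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    # Put one argument per line, then print each value that occurs at least twice.
    printf '%s\n' "$cmdline" | tr -s ' ' '\n' | sort | uniq -d
}
```

Calling duplicate_kargs with no argument inspects the currently booted kernel's command line. Output from rpm-ostree kargs could presumably be passed in as the argument to check the pending deployment instead.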
43 Replies
Middlle • 9mo ago
Image is bazzite-nvidia. Hardware specs: CPU: Intel Core i5-12400F, GPU: Nvidia GeForce RTX 3050, RAM: 16 GB DDR4, storage: 2× 512 GB SSD, 2× 1 TB SSD
State: idle
Deployments:
ā— ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:latest
Digest: sha256:9a00ad7604420727e671ef7daa3334c01328982671ca6c9a60b4d5ce2022ec82
Version: 38.20230930.0 (2023-09-30T23:01:36Z)
LayeredPackages: kmod-openrgb openrgb pam_krb5 samba-winbind samba-winbind-clients sddm-sugar-steamOS XIVLauncher-RB

ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:latest
Digest: sha256:9a00ad7604420727e671ef7daa3334c01328982671ca6c9a60b4d5ce2022ec82
Version: 38.20230930.0 (2023-09-30T23:01:36Z)
LayeredPackages: kmod-openrgb openrgb pam_krb5 samba-winbind samba-winbind-clients sddm-sugar-steamOS XIVLauncher-RB

ostree-unverified-image:oci:/var/ublue-os/image
Digest: sha256:8656a462718d16bbcef4c61c815ef57acc0ae3a9da491912b35bd88b23ff47ff
Version: 38.20230906.0 (2023-09-29T18:36:05Z)
Pinned: yes

rpm-ostree status output
Gerblesh • 9mo ago
Looks like you are on a several-week-old deployment, can you do rpm-ostree update? Oh nvm, I read the date wrong. Year-month-day. All good
Middlle • 9mo ago
so what could be the cause of this bug where the Nvidia drivers stop loading after an update?
Gerblesh • 9mo ago
I dunno. Are you appending them with rpm-ostree?
Middlle • 9mo ago
have to do it after each update
Gerblesh • 9mo ago
hmm
Middlle • 9mo ago
otherwise it doesn't load Nvidia drivers
Gerblesh • 9mo ago
kargs persist for me
Middlle • 9mo ago
yeah, I dunno why they don't on my end, probably a bug
Gerblesh • 9mo ago
might be something with rpm-ostree itself, i'd probably dig around the issues there
Middlle • 9mo ago
it just lost the kargs again 😩
[image attachment]
Middlle • 9mo ago
time to add them again
Middlle • 9mo ago
[image attachment]
Middlle • 9mo ago
and after doing that + a reboot it's back to working again
[image attachment]
Middlle • 9mo ago
but having to do it after basically any update is really annoying....
1/4 Life • 9mo ago
GitHub
feat: Always check kargs by KyleGospo · Pull Request #398 · ublue-o...
Based on user reports of nvidia kargs sometimes disappearing, let's just test them every boot and run less important stuff only on update
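The "test them every boot" idea from the PR description could be sketched roughly as below. This is not the actual bazzite-hardware-setup code, which is not shown in this thread. The helper name missing_kargs is hypothetical, and the karg list is the one the original poster has been appending by hand.

```shell
# Print every required karg that is absent from a kernel command line.
# $1: kernel command line; remaining arguments: the kargs that must be present.
# missing_kargs is a hypothetical helper, not part of Bazzite or rpm-ostree.
missing_kargs() {
    local cmdline=" $1 " karg
    shift
    for karg in "$@"; do
        case "$cmdline" in
            *" $karg "*) ;;                  # already present, nothing to do
            *) printf '%s\n' "$karg" ;;      # absent, report it
        esac
    done
}

# Possible boot-time use (commented out because it modifies the system):
# for karg in $(missing_kargs "$(cat /proc/cmdline)" \
#         rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1); do
#     rpm-ostree kargs --append-if-missing="$karg"
# done
```

Matching on whole space-delimited arguments avoids treating modprobe.blacklist=nouveau as present just because rd.driver.blacklist=nouveau happens to contain a similar substring. Using --append-if-missing instead of --append should also keep repeated runs from stacking duplicate entries.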
Middlle • 9mo ago
hopefully this PR helps :) gonna see when it lands in an update
1/4 Life • 9mo ago
should be landed now if you update
Middlle • 9mo ago
alright, will update tomorrow, thanks. Already got the PC off for today
EyeCantCU • 9mo ago
Awesome. Really hoping this addresses it. Albeit... this is very weird behavior. It's like the check we have isn't working yet again
Middlle • 9mo ago
just ran the update and yeah, seemingly this fixed it. The kargs now stay there, thanks
dnkmmr • 9mo ago
maybe install akmod-nvidia and it will work. it worked for me for some weird reason
dragon788 • 8mo ago
I'm seeing the kargs persist, but for some reason nvidia-fallback is running. Under kinoite-nvidia, nvidia-smi shows the GPU, but with bazzite-nvidia, nvidia-smi fails because nouveau loaded due to nvidia-fallback
EyeCantCU • 8mo ago
I'll take a closer look here in a moment. Bewildered because it's different for everyone across different systems.
I wouldn't suggest installing the akmod
dragon788 • 8mo ago
yeah, I tried and it fails due to some package conflicts, but I'm doing something crazy and disabling nvidia-fallback to see whether it eventually loads the nvidia modules. I see later in the boot process that it tries a few different times but never detects the Nvidia card because nouveau had already bogarted it.
crazy, nouveau actually loaded EARLIER in the dmesg than when the fallback service was trying to run.
I think I fixed it, or at least found a way for it to properly switch from the kernel-embedded nouveaudrmfb to Nvidia without getting stuck: appending nouveau.modeset=0 to the kargs let me open Konsole without getting the warning about running nouveau, and nvidia-smi works.
dragon788 • 8mo ago
all credit to this post, https://askubuntu.com/a/1256640
Ask Ubuntu
How do I disable the "Nouveau Kernel Driver"?
I'm trying to install proprietary nvidia graphics driver I downloaded from nvidia website. It will not install because it says that the "Nouveau kernel driver" needs to be disabled first. I opened
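For anyone comparing notes on whether nouveau or the Nvidia module ended up loaded, a quick check might look like the sketch below. The function name loaded_gpu_driver is made up for illustration. It only inspects a /proc/modules-style listing, and it reports nouveau first if both modules happen to be listed.

```shell
# Report which GPU kernel module is loaded, based on a /proc/modules-style listing.
# $1: path to the listing (defaults to /proc/modules).
# loaded_gpu_driver is a hypothetical helper name, not an existing tool.
loaded_gpu_driver() {
    local modules="${1:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    if grep -q '^nouveau ' "$modules"; then
        echo nouveau
    elif grep -q '^nvidia ' "$modules"; then
        echo nvidia
    else
        echo none
    fi
}
```

Run with no argument, it reads the live module list. A result of nouveau on a bazzite-nvidia install would match the fallback situation described above.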
lazyPower • 8mo ago
Interesting, I discovered this is the case with my bazzite-nvidia box as well 🤔 It's falling back and I have to modprobe nvidia-drm to get the resolution going. I'll follow up, but I did just do an rpm-ostree update post the 9/16 pause (unlucky sync date) and will try to get the system back in alignment with this thread's help. 🙇‍♂️
dragon788 • 8mo ago
so the really weird thing is that it isn't doing it when I installed with SecureBoot enabled, and after reinstalling with SecureBoot disabled it didn't fail to start nvidia again. Going to try deleting the enrollment keys and see if installing without SecureBoot causes failures until the SecureBoot stuff is set up.
mine wasn't allowing modprobe nvidia or modprobe nvidia-drm because nouveau had already claimed the device, but this fix has it working when it was broken. After multiple fresh installs I'm not seeing the issue again, which is weird.
oh fun, after a fresh reinstall with SecureBoot disabled and attempting to manually reset all the SecureBoot keys via the BIOS, it boots, but shortly after the Bazzite Portal launched following the first login, the screen blanked and only occasionally comes out of it to give me the unlock prompt, and unlocking just goes to a blank screen again.......
full poweroff and boot and it finally let me log in again, but the Bazzite Portal didn't launch because it thought it was done or had been "seen" even though I never got to select any options.
ugh, multiple reinstalls and I can't reproduce the problem. Going to try installing the supergfxswitcher, as that might be the only thing I added the very first time with Bazzite Portal that I hadn't done in these last couple of reinstalls.
t9999clint • 8mo ago
I have a 1070 and it simply refuses to use anything but nouveau. This is still the case after a fresh install of bazzite-nvidia (the latest branch and 38 both act the same). If I blacklist nouveau it works fine, but any sort of update requires me to blacklist it again. Something is clearing out the kargs I set with the rpm-ostree kargs --append command. Should I just script it to redo the kargs on every boot?
1/4 Life • 8mo ago
Should work that way now. What kargs are you applying?
t9999clint • 8mo ago
rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
1/4 Life • 8mo ago
Can you give me... systemctl status bazzite-hardware-setup
t9999clint • 8mo ago
Just a sec, gotta get discord up and running on that system.
[t9999clint@fedora t9999clint]$ systemctl status bazzite-hardware-setup.service
● bazzite-hardware-setup.service - Configure Bazzite for current hardware
     Loaded: loaded (/usr/lib/systemd/system/bazzite-hardware-setup.service; enabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (exited) since Sun 2023-10-15 14:48:45 MDT; 2h 42min ago
    Process: 1039 ExecStart=/usr/bin/bazzite-hardware-setup (code=exited, status=0/SUCCESS)
   Main PID: 1039 (code=exited, status=0/SUCCESS)
        CPU: 136ms

Oct 15 14:48:45 fedora systemd[1]: Starting bazzite-hardware-setup.service - Configure Bazzite for current hardware...
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Current kargs: rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 rd.luks.options=disca>
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Checking for needed karg changes (Nvidia)
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: No karg changes needed
Oct 15 14:48:45 fedora bazzite-hardware-setup[1039]: Hardware setup has already run. Exiting...
Oct 15 14:48:45 fedora systemd[1]: Finished bazzite-hardware-setup.service - Configure Bazzite for current hardware.
lines 1-15/15 (END)
1/4 Life • 8mo ago
kargs: rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
No karg changes needed
how about on a boot where the kargs reset?
t9999clint • 8mo ago
I'm copying a bunch of games over atm, when it's done I'll reboot a few more times and get the log from when it breaks
1/4 Life • 8mo ago
ty
t9999clint • 8mo ago
it did it again, but the kargs haven't been changed. The hardware service gave me the same response. Each time it does this it seems to lock up during reboot and I have to hard power it off. Maybe it's ignoring my kargs because it saw the driver crash or something, and isn't allowing the nvidia drivers to run again until I regenerate the initramfs or something
1/4 Life • 8mo ago
Might have a potential fix for this. I'll ping you on it
dragon788 • 8mo ago
@t9999clint check the fix I mentioned with nouveau.modeset=0 as an additional karg, that fixed it for me when I got into that hell where it falls back to nouveau even with everything blocked.
@t9999clint this one, and see the link on the next line for why it works
t9999clint • 8mo ago
When was this change made? Cause I haven't had it happen for a few days now...
1/4 Life • 8mo ago
we moved the nvidia stuff to initramfs, should be a lot more reliable. That was last week(ish)
t9999clint • 8mo ago
Okay, then that probably did fix it. It's still locking up on reboot sometimes, but that might just be a hardware issue. I'll keep troubleshooting it. Thanks for the hard work
dnkmmr • 8mo ago
where was it before? like what makes it more reliable?