UB
Universal Blue•4mo ago
Aru

Flaky Suspend on LGO + latest bazzite updates

flaky on latest bazzite image. currently being investigated, suspend works flawlessly on image 39-20240205
Solution:
looks like this might be resolved with 6.7.5, see #Legion Go Suspend issue for updates
138 Replies
Aru
Aru•4mo ago
@antheas logs from sudo hhd.contrib
antheas
antheas•4mo ago
So you press once and you get multiple presses? A press has a value of 1.
Aru
Aru•4mo ago
hrm, i wonder if it'll be different if i use hhd 1.3.3 this was with the built in hhd on bazzite
antheas
antheas•4mo ago
In this It's ok it's the same script Is that right? You press once You get multiple presses
Aru
Aru•4mo ago
this was with me pressing the button multiple times. you had said to capture it and go ham 😅
antheas
antheas•4mo ago
You get what I'm saying though? You press it once. Do you get a clean 1 and 0, with code 116?
Aru
Aru•4mo ago
single tap of power button:
Selected device `device /dev/input/event0, name "Power Button", phys "PNP0C0C/button/input0"`.
Capabilities
{('EV_SYN', 0): [('SYN_REPORT', 0), ('SYN_CONFIG', 1)], ('EV_KEY', 1): [('KEY_POWER', 116)]}
Attempting to grab device.

event at 1707837282.441364, code 116, type 01, val 01
event at 1707837282.441364, code 00, type 00, val 00
event at 1707837282.441372, code 116, type 01, val 00
event at 1707837282.441372, code 00, type 00, val 00
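A minimal sketch of how a capture like the one above could be checked for a "clean 1 and 0" automatically. It only assumes the line format shown in the pasted output; the helper names (`parse_events`, `power_values`, `is_clean_tap`) are made up for illustration and are not part of hhd.

```python
import re
from typing import NamedTuple

EV_KEY = 1       # key/button event type ("type 01" in the capture)
KEY_POWER = 116  # key code listed in the capabilities line of the capture


class Event(NamedTuple):
    time: float
    code: int
    type: int
    value: int


# Matches lines such as:
#   event at 1707837282.441364, code 116, type 01, val 01
LINE_RE = re.compile(
    r"event at (?P<time>\d+\.\d+), code (?P<code>\d+), "
    r"type (?P<type>\d+), val (?P<val>\d+)"
)


def parse_events(text: str) -> list[Event]:
    """Turn pasted capture text into a list of Event tuples."""
    return [
        Event(float(m["time"]), int(m["code"]), int(m["type"]), int(m["val"]))
        for m in LINE_RE.finditer(text)
    ]


def power_values(events):
    """Values of the KEY_POWER events only, skipping the SYN_REPORT lines."""
    return [e.value for e in events if e.type == EV_KEY and e.code == KEY_POWER]


def is_clean_tap(events):
    """A single tap should be exactly one press (1) followed by one release (0)."""
    return power_values(events) == [1, 0]


SAMPLE = """\
event at 1707837282.441364, code 116, type 01, val 01
event at 1707837282.441364, code 00, type 00, val 00
event at 1707837282.441372, code 116, type 01, val 00
event at 1707837282.441372, code 00, type 00, val 00
"""

events = parse_events(SAMPLE)
print(len(events), is_clean_tap(events))  # 4 True
```

A capture with 8 lines for one physical tap would make `is_clean_tap` return False, which is the double-press case being hunted here.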
antheas
antheas•4mo ago
Is that always what happens? Does every press look like this? Because that's how it should look
Aru
Aru•4mo ago
3 power button taps
event at 1707837342.061445, code 116, type 01, val 01
event at 1707837342.061445, code 00, type 00, val 00
event at 1707837342.061456, code 116, type 01, val 00
event at 1707837342.061456, code 00, type 00, val 00
event at 1707837344.276578, code 116, type 01, val 01
event at 1707837344.276578, code 00, type 00, val 00
event at 1707837344.276592, code 116, type 01, val 00
event at 1707837344.276592, code 00, type 00, val 00
event at 1707837351.463400, code 116, type 01, val 01
event at 1707837351.463400, code 00, type 00, val 00
event at 1707837351.463415, code 116, type 01, val 00
event at 1707837351.463415, code 00, type 00, val 00
looks like the same pattern every time:
event at 1707837342.061445, code 116, type 01, val 01
event at 1707837342.061445, code 00, type 00, val 00
event at 1707837342.061456, code 116, type 01, val 00
event at 1707837342.061456, code 00, type 00, val 00
antheas
antheas•4mo ago
Always? Hmm You get 4 lines every time Not sometimes 8 right?
Aru
Aru•4mo ago
looks like it, but let me try doing a lot of presses gimme one min
antheas
antheas•4mo ago
Also there is no other power button right? The lnx one
Aru
Aru•4mo ago
don't think so, but i can try it real quick
antheas
antheas•4mo ago
I will also check it out when I get home But maybe the double presses are a kernel issue Maybe there's a broken loop or sth
Aru
Aru•4mo ago
lnx didn't capture anything when i tapped the power button
antheas
antheas•4mo ago
If there are double press issues I can debounce There's lnx though? Maybe that's it
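For readers unfamiliar with the term, debouncing here would mean ignoring extra presses that arrive too soon after an accepted one. A rough sketch follows; this is not how hhd does it, and both the `debounce` helper and the 0.5 s window are assumptions picked for illustration.

```python
def debounce(press_times, window=0.5):
    """Keep a press only if it comes at least `window` seconds after
    the previously accepted press; anything closer is treated as bounce.

    `press_times` are key-down timestamps in seconds. The 0.5 s default
    is an arbitrary illustrative value, not a tuned one.
    """
    accepted = []
    for t in sorted(press_times):
        if not accepted or t - accepted[-1] >= window:
            accepted.append(t)
    return accepted


# Three deliberate taps, seconds apart (timestamps from the capture above):
taps = [1707837342.061445, 1707837344.276578, 1707837351.463400]
print(len(debounce(taps)))  # 3

# A hypothetical bouncing button: three events within 40 ms, then a real tap.
bouncy = [10.0, 10.02, 10.04, 12.0]
print(debounce(bouncy))  # [10.0, 12.0]
```

The trade-off is that any window long enough to hide bounce also swallows intentional fast double taps, so the value would need to be chosen per device.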
Aru
Aru•4mo ago
but i do think hhd 1.3.3 seems much more stable for suspend
antheas
antheas•4mo ago
I think I capture lnx by default now Maybe lnx goes wonky
Aru
Aru•4mo ago
i'll try different hhd versions and see if the problem becomes more noticeable
antheas
antheas•4mo ago
No, that's it: maybe I capture lnx now, not the pnp one. Open 2 terminals and capture both buttons. See what happens.
Aru
Aru•4mo ago
[image attachment, no description]
Aru
Aru•4mo ago
nothing on the lnx on button press
antheas
antheas•4mo ago
Lmao not that one I don't remember the name I'm on the bus I recall the go not having it though
Aru
Aru•4mo ago
lol no rush, for now i can continue investigating. think maybe it's due to the new beta bios then? though i get better suspend on hhd 1.3.3 even on the new beta bios
antheas
antheas•4mo ago
If it's not hhd, it's the button giving multiple presses. If you say it doesn't, it doesn't, and it's hhd. And I will look into it.
TecN01R
TecN01R•4mo ago
It works with the previous version of Bazzite
antheas
antheas•4mo ago
I'm trying to see if there are double presses in the button If not then it's hhd and I will hotfix today
Aru
Aru•4mo ago
if it's the command I gave you, that was an older bazzite image, not necessarily the previous version
antheas
antheas•4mo ago
On the bazzite that doesn't work, run the contrib command. See if that breaks anything.
TecN01R
TecN01R•4mo ago
I’m still on bios v39 I haven’t flashed the beta version so I think we can rule that out
antheas
antheas•4mo ago
Find the broken bazzite version. Run the contrib command. Validate there are no double presses. And in 2 hours I will test if hhd is broken after 1.3.3.
Aru
Aru•4mo ago
no rush, i'm going to test bazzite latest and incrementally bump up hhd versions and see if the problem starts happening more frequently at some point
antheas
antheas•4mo ago
I just need the contrib command, to see if there are double presses. Everything else I can take care of.
Aru
Aru•4mo ago
hrm, does local hhd install have its own contrib script somewhere?
antheas
antheas•4mo ago
If you want to spend time on hhd, work on the hhd ui. It does: hhd.contrib is an executable too, in the venv.
Aru
Aru•4mo ago
maybe unsurprisingly, still the same for every version of hhd after 1.3.3
event at 1707839349.722761, code 116, type 01, val 01
event at 1707839349.722761, code 00, type 00, val 00
event at 1707839349.722764, code 116, type 01, val 00
event at 1707839349.722764, code 00, type 00, val 00
and in hhd logs, whenever there's only one short press, suspend works no problem. it fails whenever there are 2 or more short presses right after each other. not sure what's triggering the multiple short presses. for now, time to dust off the old powerbuttond service as a temp workaround for myself
antheas
antheas•4mo ago
when does that happen? i'll fix it now. i did over 50 button presses and it always says executing short press once. i literally keep doing it, latest version on git. i commented out the sleep command. it's always a single press
Aru
Aru•4mo ago
yeah, i'm not sure what triggers the 2x short press in hhd
antheas
antheas•4mo ago
can u make it happen with this?
Aru
Aru•4mo ago
i'm just tapping the power button as usual, and sometimes it's logging 2x short press. i can't reliably reproduce it. it could very well be kernel related
antheas
antheas•4mo ago
i can work around it either way, but you have to reproduce it. if you get a double press here it's kernel related, otherwise it's hhd related
Aru
Aru•4mo ago
i have not been able to get it to double press via the contrib script
antheas
antheas•4mo ago
so
Aru
Aru•4mo ago
but let me try like 100x times lol
antheas
antheas•4mo ago
hhd does executing sleep 2x
Aru
Aru•4mo ago
just to make sure
antheas
antheas•4mo ago
but contrib does not. i can't, i keep pressing it and it's always one short press. maybe it's the beta bios lmao
Aru
Aru•4mo ago
tecnor mentioned he's seeing the problem on v29 bios, so it's not the beta bios. it really could only be kernel related then, since you're on 6.6 vs 6.7 on bazzite
antheas
antheas•4mo ago
but if it is kernel related you will see it on contrib
antheas
antheas•4mo ago
[image attachment, no description]
Aru
Aru•4mo ago
ok, let me try contrib 100x times. gimme 5 min
antheas
antheas•4mo ago
comment out and go ham on the buttons so it doesn't sleep every time. let me uncomment and actually sleep i guess. works perfect on steam too
Aru
Aru•4mo ago
tapped 50x times so far, every time it's been:
event at 1707842069.659840, code 116, type 01, val 01
event at 1707842069.659840, code 00, type 00, val 00
event at 1707842069.659851, code 116, type 01, val 00
event at 1707842069.659851, code 00, type 00, val 00
so most likely not kernel then. 100x times, every time the same: no double presses, always logged those 4 lines on every button press
antheas
antheas•4mo ago
ok, so comment out those lines in hhd and reproduce. also get me a log of hhd. i also have another suspicion
Aru
Aru•4mo ago
ok, so if i comment out the lines and tap power button, it also shows up once on every tap. so what triggers the 2x or 3x short presses in the hhd log? 🤔 https://discord.com/channels/1072614816579063828/1087140957096517672/1206661206820003931
oh i see, let me double check something. it could be that something is waking the device right after suspend, which would make sense
antheas
antheas•4mo ago
Those presses are too far apart Like 10s
Aru
Aru•4mo ago
ok yep, you're right
antheas
antheas•4mo ago
Almost like you press the button twice
Aru
Aru•4mo ago
so hhd is properly only executing short press once
antheas
antheas•4mo ago
So kernel problem Look at journalctl
Aru
Aru•4mo ago
yeah, so every time you see the 2x or 3x short presses, that's when the suspend fails, so i retry. for some reason i thought there'd be additional logs between the short presses, my bad 😦 but yeah, each of those is a suspend fail
antheas
antheas•4mo ago
🙂
TecN01R
TecN01R•4mo ago
Aru can you verify my observation that it seems to be worse under heavy load vs when nothing is running?
Aru
Aru•4mo ago
this makes more sense, so it's kernel related somehow
antheas
antheas•4mo ago
🙃 Journalctl
Aru
Aru•4mo ago
i mean, i'm seeing the issue at 13w tdp, doubt it's due to heavy load
Aru
Aru•4mo ago
journalctl output 11 mb txt
TecN01R
TecN01R•4mo ago
I only see it when running a game, otherwise it functions fine
Aru
Aru•4mo ago
i also see it when running a game
Feb 13 11:52:14 fedora systemd-sleep[6724]: Failed to put system to sleep. System resumed again: Cannot allocate memory
TecN01R
TecN01R•4mo ago
That looks like a smoking gun to me I see that in journalctl too
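Since the journal dump was around 11 MB, a small filter can make the failed suspends easier to spot. The sketch below assumes failures always look like the `systemd-sleep` line quoted above; the function name `find_sleep_failures` and the placeholder line in the sample are invented for the example.

```python
import re

# Matches the systemd-sleep failure line quoted in this thread and
# captures whatever explanation follows the first sentence.
SLEEP_FAIL_RE = re.compile(
    r"systemd-sleep\[\d+\]: Failed to put system to sleep\. (?P<detail>.*)$"
)


def find_sleep_failures(journal_text):
    """Return the explanation part of every failed-suspend line."""
    failures = []
    for line in journal_text.splitlines():
        m = SLEEP_FAIL_RE.search(line)
        if m:
            failures.append(m["detail"].strip())
    return failures


SAMPLE = """\
(placeholder for unrelated journal lines)
Feb 13 11:52:14 fedora systemd-sleep[6724]: Failed to put system to sleep. System resumed again: Cannot allocate memory
"""

print(find_sleep_failures(SAMPLE))
# ['System resumed again: Cannot allocate memory']
```

Run against a saved journal text file, counting the returned entries gives a rough number of failed suspend attempts to compare between kernel 6.6 and 6.7.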
Aru
Aru•4mo ago
🤔 so kernel 6.6 is the golden kernel for LGO. shame that gamescope patches aren't on the 6.6 image. @Kyle Gospo as a heads up, seems like kernel 6.7 might have flaky suspend on the LGO.
antheas
antheas•4mo ago
rip
Aru
Aru•4mo ago
time to roll back to 6.6 for now
antheas
antheas•4mo ago
imo get back to 6.6 only bad things from here literally 0 benefit to 6.7, leave it to testing for 3 weeks
Aru
Aru•4mo ago
i mean, downside is that you can't update bazzite. guess this is the downside of immutable: can't easily pick and choose the parts you need. anyways, sorry for the false alarm on hhd @antheas
antheas
antheas•4mo ago
guess so. oh, i thought this was fixed, rip. maybe it was 6.8, have no clue
Aru
Aru•4mo ago
¯\_(ツ)_/¯ i guess i could try to use rpm-ostree to layer on the latest patched gamescope on top of 6.6
antheas
antheas•4mo ago
not a bad idea
Aru
Aru•4mo ago
still not a viable long term solution though, for most users since eventually the 2024-02-05 image will be deleted
1/4 Life
1/4 Life•4mo ago
there a known patch for this? Can always look into adding it
Aru
Aru•4mo ago
is there a patch for this? no idea
antheas
antheas•4mo ago
yeah: using 6.6 and not worrying about this, letting someone else figure it out. but i guess we dont know who that would be
Aru
Aru•4mo ago
i wonder if increasing zram would help
antheas
antheas•4mo ago
is your kernel tainted
Aru
Aru•4mo ago
secure boot enabled right now
TecN01R
TecN01R•4mo ago
I had a swapfile and it didn’t seem to help. I just deleted it in case that was the cause, but I’d be curious about the zram.
Aru
Aru•4mo ago
but doubt that's related. bazzite no longer uses swapfiles, i think
antheas
antheas•4mo ago
is your kernel tainted?
TecN01R
TecN01R•4mo ago
I created it and mounted it myself
antheas
antheas•4mo ago
if it is ur fucked that crash was big
1/4 Life
1/4 Life•4mo ago
yeah, no swapfile on bazzite zram only
antheas
antheas•4mo ago
how does that magic work
1/4 Life
1/4 Life•4mo ago
Upstream fedora is the same
TecN01R
TecN01R•4mo ago
I created it and added it to fstab I had both
antheas
antheas•4mo ago
how do you hibernate then
Aru
Aru•4mo ago
cat /proc/sys/kernel/tainted
4097
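The number in `/proc/sys/kernel/tainted` is a bitmask, so 4097 (4096 + 1) means two flags are set. A small sketch of decoding it, using the flag meanings from the kernel's tainted-kernels documentation; the helper name `decode_taint` and the shortened descriptions are my own wording.

```python
# Bit positions and letters as listed in the kernel's tainted-kernels
# documentation; descriptions are abbreviated.
TAINT_FLAGS = {
    0: "P: proprietary module was loaded",
    1: "F: module was force loaded",
    2: "S: kernel running on an out-of-spec system",
    3: "R: module was force unloaded",
    4: "M: machine check exception occurred",
    5: "B: bad page referenced",
    6: "U: taint requested by userspace",
    7: "D: kernel died recently (OOPS or BUG)",
    8: "A: ACPI table overridden by user",
    9: "W: kernel issued a warning",
    10: "C: staging driver was loaded",
    11: "I: workaround for platform firmware bug applied",
    12: "O: externally built (out-of-tree) module was loaded",
    13: "E: unsigned module was loaded",
    14: "L: soft lockup occurred",
    15: "K: kernel has been live patched",
    16: "X: auxiliary taint, used by distros",
    17: "T: kernel built with the struct randomization plugin",
}


def decode_taint(value):
    """Return (bit, description) pairs for every known flag set in `value`."""
    return [
        (bit, desc)
        for bit, desc in sorted(TAINT_FLAGS.items())
        if (value >> bit) & 1
    ]


for bit, desc in decode_taint(4097):
    print(bit, desc)
# 0 P: proprietary module was loaded
# 12 O: externally built (out-of-tree) module was loaded
```

A value of 0 means the kernel is not tainted at all.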
TecN01R
TecN01R•4mo ago
I needed it for testing one game.
1/4 Life
1/4 Life•4mo ago
fedora doesn't hibernate
antheas
antheas•4mo ago
turns out mine is too? yeah my kernel got tainted bc I loaded acpi_call dkms variant lol and i didnt sign it
TecN01R
TecN01R•4mo ago
I wiped out my swapfile though and it didn’t make a difference fwiw
Aru
Aru•4mo ago
yeah, i think zram is used for swap on bazzite
antheas
antheas•4mo ago
important for handheld devices imo
TecN01R
TecN01R•4mo ago
I had both. I manually created the swapfile and added it to fstab. I had the priority lower than zram so that it was only used in emergencies. I was testing Star Citizen and it wouldn’t work without it; I needed the extra. Even increasing ZRAM size didn’t help. Besides the point here, it’s not the issue.
Aru
Aru•4mo ago
yeah, i'm assuming it's not related to the suspend issue
1/4 Life
1/4 Life•4mo ago
steam deck doesn't hibernate, just sleep beyond that, silverblue in general doesn't hibernate even on desktop so the chance we support it is basically zero
antheas
antheas•4mo ago
you need to give a couple of thoughts to it, bc a lot of the devices you claim to support dont sleep properly and they will start asking
Aru
Aru•4mo ago
hmmm so i did 3x zram and suspend seems better?
TecN01R
TecN01R•4mo ago
I’ll give it a go
Aru
Aru•4mo ago
oh derp nevermind
TecN01R
TecN01R•4mo ago
A workaround but not really a solution
Aru
Aru•4mo ago
forgot to reboot, i was on the old image. let me reboot real quick. nope, nevermind, false alarm. time to revert to the 6.6 kernel. i need to figure out how to install patched gamescope on 6.6, then the LGO will be fully set up
TecN01R
TecN01R•4mo ago
So with 6.6, no unified frame fix?
Aru
Aru•4mo ago
yep, pretty much the only missing piece from 6.6 is the gamescope fix
sudo rpm-ostree override replace --experimental --from repo=copr:copr.fedorainfracloud.org:kylegospo:bazzite gamescope-session-plus
thought this would work, but apparently not
1/4 Life
1/4 Life•4mo ago
iirc it's just gamescope-session, try that. but also, the patch is in gamescope, so you want kylegospo:bazzite-multilib gamescope. make sure the repo is enabled in /etc/yum.repos.d
TecN01R
TecN01R•4mo ago
Once you figure out the final command let me know so I can rollback and use it too
Aru
Aru•4mo ago
so to enable the repo, do I basically just need to remove the _ from the _copr_kylegospo-bazzite-multilib.repo file in /etc/yum.repos.d?
1/4 Life
1/4 Life•4mo ago
inside of it there's just gonna be an enabled=0 set it to 1
Aru
Aru•4mo ago
ah ok, that makes more sense
1/4 Life
1/4 Life•4mo ago
we do that because checking the repos is meaningless on an OCI so it's wasted time during update if they're enabled but I wanted them present for use cases such as yours right now
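As an illustration of the `enabled=0` to `enabled=1` change described above, here is a sketch that flips the flag in the text of a `.repo` file. It only transforms a string; reading and writing the real file under /etc/yum.repos.d would need root and is left out. The `enable_repo` helper and the sample section are made up, since the actual file contents are not shown in this thread.

```python
import re


def enable_repo(repo_text):
    """Return `repo_text` with every `enabled=0` line switched to `enabled=1`.

    Only whole lines are touched, so other keys and blank lines
    are left exactly as they were.
    """
    return re.sub(
        r"^enabled[ \t]*=[ \t]*0[ \t]*$",
        "enabled=1",
        repo_text,
        flags=re.MULTILINE,
    )


# Hypothetical stand-in for the contents of a disabled .repo file.
SAMPLE = """\
[example-repo]
name=Example repository
enabled=0

[another-example]
enabled=0
"""

print(enable_repo(SAMPLE))
```

Editing the file by hand with sudoedit, as done later in the thread, achieves the same result for a single repo.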
TecN01R
TecN01R•4mo ago
I'll just catch the explainer video on YouTube 😂
Aru
Aru•4mo ago
🤔
sudo rpm-ostree override replace --experimental --from repo=copr:copr.fedorainfracloud.org:kylegospo:bazzite-multilib gamescope
Inactive base replacements:
gamescope
Checking out tree 217f864... done
Resolving dependencies... done
error: No packages in transaction
everything I did so far:
sudoedit /etc/yum.repos.d/_copr_kylegospo-bazzite-multilib.repo
change enabled=0 to enabled=1
sudo rpm-ostree cleanup -m
sudo rpm-ostree refresh-md
rebooted. I must be overlooking something
1/4 Life
1/4 Life•4mo ago
sec
rpm-ostree override replace https://download.copr.fedorainfracloud.org/results/kylegospo/bazzite-multilib/fedora-39-x86_64/07007152-gamescope/gamescope-3.13.19-1.fc39.bazzite.0.0.git.2716.354b9dd4.x86_64.rpm
Try that
Aru
Aru•4mo ago
hrm, well i'll keep trying to figure it out. but so far, looks like this might be trickier than expected. i'll try manually downloading the rpm.
ok, did a sudo rpm-ostree reset, and it seemed to run successfully afterwards. rebooting now.
nope, let me try again with gamescope and gamescope-libs
sudo rpm-ostree override replace --experimental --from repo=copr:copr.fedorainfracloud.org:kylegospo:bazzite-multilib gamescope-libs gamescope
Inactive base replacements:
gamescope-libs
gamescope
Staging deployment... done
Freed: 91.2 MB (pkgcache branches: 2)
Run "systemctl reboot" to start a reboot
nope, no go. eh, i'll give up for now. i was able to get by fine without the unified slider before, so i'll just revert to using the old slider
TecN01R
TecN01R•4mo ago
Is 20240205 the last image before the kernel update?
Aru
Aru•4mo ago
yep, pretty much. you can also check specific kernel versions for images, i documented it here: https://github.com/aarron-lee/legion-go-tricks?tab=readme-ov-file#roll-back-to-bazzite-image-with-specific-linux-kernel
Tayler
Tayler•4mo ago
6.7 is messing with suspend on my alienware laptop, also. In my case, I believe it's this issue, might be the same for you: https://bugzilla.redhat.com/show_bug.cgi?id=2262577 They say a fix is coming in 6.7.5
Aru
Aru•4mo ago
Looking at that bug report, it seems different from what I'm seeing on the LGO. Might be unrelated, but we'll have to see with 6.7.5.
hrm, now that i know that steam temporarily caches old refresh rates, etc, i wonder if this had actually worked but steam was still showing stale data. guess i'll try again and find out.
one more followup to note about kernel 6.7: I've noticed that sometimes, while suspending, the device will hang and freeze up, and i'll need to force a reboot by holding the power button. still haven't seen the issue yet on kernel 6.6, so i've been sticking with the 02-05 image
antheas
antheas•4mo ago
Can replicate on 6.6 after 7 sleep attempts back to back. Or 10. A lot of the time steam gets confused and gets a black screen, but if I press the power button again it winks after a few sec. I was testing the flashing issue.
Aru
Aru•4mo ago
hrm, actually that'd make sense for why i see the issue on 6.7 but not 6.6 because on 6.7, i often need to press the power button multiple times for it to actually suspend
antheas
antheas•4mo ago
But they have to be back to back Probably
Aru
Aru•4mo ago
whereas on 6.6 it suspends first tap every time
antheas
antheas•4mo ago
If you could get some journalctl -n 0 logs when that happens, that would be great. It usually says why it failed. Not my first rodeo.
Aru
Aru•4mo ago
hrm, guess i'll have to run the latest bazzite image and replicate then. since i've just been cruising on 6.6
Solution
Aru
Aru•4mo ago
looks like this might be resolved with 6.7.5, see #Legion Go Suspend issue for updates