Out of memory crash
While doing normal text editor / web browser / Discord / Steam work, the hard drive light suddenly turned on and wouldn't ever turn off. The keyboard and mouse immediately became unresponsive, and not even ctrl-alt-f3 would work.
Glancing at logs after a hard reset, it appears unsurprisingly to be due to running out of memory.
Is there a way to adjust the oom killer to have a larger margin? If it kicked in at all here, it appears to have been way too late to avoid a hard reset.
11 Replies
Another system crash just happened, same symptoms but no hard drive light this time
Even with the SysRq key enabled I couldn't trigger anything, and both screens were frozen
still on Bazzite 42, nothing changed in the system
in terms of software that I know of
ujust showed no messages at all after the crash, and nothing weird before it
Like a total system freeze
Unfortunately the OOM daemon seems to be very unconfigurable. This was more for a work machine than a gaming machine but I found earlyoom to help a lot with this. I cover how to use it in this help thread https://discord.com/channels/1072614816579063828/1425838708485001356/1425854938587730075
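To make the "larger margin" idea concrete, here is a rough sketch of the kind of check a userspace tool like earlyoom performs: read /proc/meminfo and act once both available RAM and free swap drop under a threshold. This is not earlyoom's actual code. The function names and the 10% defaults are placeholders I picked for illustration, so check earlyoom's own documentation for its real defaults and flags.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kiB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def percent_free(info):
    """Return (available memory %, free swap %) from a parsed meminfo dict."""
    mem_pct = 100.0 * info["MemAvailable"] / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    # With no swap configured, count free swap as 0% so only RAM decides.
    swap_pct = 100.0 * info.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_pct, swap_pct


def below_margin(info, mem_min=10.0, swap_min=10.0):
    """True when both available RAM and free swap are at or under their thresholds.

    mem_min and swap_min are hypothetical defaults chosen for this sketch.
    """
    mem_pct, swap_pct = percent_free(info)
    return mem_pct <= mem_min and swap_pct <= swap_min


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `below_margin(read_meminfo(), mem_min=15.0)` would be one way to express a larger margin; raising the thresholds makes the check fire earlier, before the machine starts thrashing.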
Thanks for the link
If journalctl doesn't have any messages, am I out of luck identifying the cause of a previous system freeze?
It could also be SSD drive health issues. If the drive glitches, Linux can't write to the log file. Have you checked the SMART data for the drive? You can just look up "SMART" in the menu (if you're on KDE)
I will note that I had manually enabled sysrq using sysctl, and it didn't respond. I'll also note it isn't a case of waiting 2 minutes: in my previous freeze, where the hard drive light was constantly on and I saw out of memory messages, I waited like 40 minutes with no change.
(your thread is great, I'm just saying I'm afraid my problem might be slightly different)
I also saw someone in #bazzite mention they changed RAM sticks and their system was fine, yet memtest never reported errors on their previous stick. It's hard to say if that's a bug or a hardware problem.
don't have an SSD, brand new hard drive
All previous crashes have written to the drive so it seems unlikely but maybe a new lemon
Yup your problem might be different. Difficult to tell when there is no log
Thanks so much for your time.
I'm currently trying to see if my enabled sysrq magic keys are actually working
The only safe one to use in a running system seems to be the log level
This should enable control of console logging level
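For anyone checking which magic keys their setting actually allows: the kernel.sysrq value is 0 (off), 1 (everything), or a bitmask. Below is a small sketch that decodes it. The bit meanings follow the kernel's SysRq documentation as I understand it; the function and table names are my own.

```python
# Bit values for kernel.sysrq when it is greater than 1.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Return the list of SysRq function groups enabled by a kernel.sysrq value."""
    if value == 0:
        return []  # SysRq disabled entirely
    if value == 1:
        return list(SYSRQ_BITS.values())  # all functions enabled
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

Note that the 'r' (unraw) key falls under bit 4 and 'f' (oom-kill) under bit 64, so a value that only enables the log level group (2) would not let those two through from the keyboard.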
I wonder if it's not working because printscreen is triggering instead. I'm on a normal-ish split keyboard, no function modifier available
I haven't tried the sysrq in Bazzite. Last system I used them in was PopOS from a few years back
I got it to work! But only once, I see the printk log level changed from 3 to 5
But I can't get it to happen again. If I use sysrq or alt-sysrq, printscreen triggers KDE instead
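The log level change is visible in /proc/sys/kernel/printk, whose first field is the current console log level. A minimal sketch for reading it, with the field names taken from the kernel documentation and the function names being my own:

```python
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Map the four whitespace-separated numbers in kernel.printk to names."""
    return dict(zip(PRINTK_FIELDS, (int(x) for x in text.split())))


def read_printk(path="/proc/sys/kernel/printk"):
    """Read the live values from procfs (Linux only)."""
    with open(path) as f:
        return parse_printk(f.read())
```

Comparing `read_printk()["console_loglevel"]` before and after pressing the key combination is a harmless way to tell whether the SysRq press reached the kernel or was swallowed by the desktop's screenshot shortcut.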
I have seen Bazzite become unresponsive while it's doing the last part of its update (the last step especially is quite disk heavy). Did you perhaps encounter the issue around 20-30 minutes after starting up the machine? I think that's when the auto updates trigger. If you run rpm-ostree status, are you on the latest version?
auto updates are paused
wanted to avoid issues due to nVidia on 43 right after release, wanted control over when the update happened
I haven't tried manually updating yet to see if the issues are resolved on this hardware
Solution for SysRq: Once it's enabled, I have to do this sequence
* Press left alt or right alt
* Press SysRq
* While holding both, press 3 (for example, to change loglevel)
* While holding 3, release SysRq
* Release all remaining keys
It only seems to work if SysRq is released while the magic key (e.g. 3) is still held down
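If the key sequence is fiddly on a given keyboard, the same commands can be sent from a root shell by writing a single character to /proc/sysrq-trigger, which as far as I know is not limited by the kernel.sysrq mask. This only helps while the system is still responsive, so it is a way to confirm the handlers work, not a rescue during a freeze. A minimal sketch; the function name and the path parameter are my own:

```python
def send_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the trigger file.

    Needs root on a real system. Some keys are destructive
    (for example 'b' reboots immediately), so a digit that only
    changes the console log level is the safest one to try.
    """
    if len(key) != 1:
        raise ValueError("SysRq command must be exactly one character")
    with open(trigger_path, "w") as f:
        f.write(key)
```

For example, `send_sysrq("3")` run as root should have the same effect as the Alt + SysRq + 3 sequence above.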
Next time I see a freeze, I plan to attempt the 'r' command to gain raw keyboard control, and if that doesn't work for a console, attempt 'f' to see if it triggers OOM
The freeze happened, but SysRq was unresponsive along with everything else. System Monitor showed 50% memory used, and there was nothing in the system log, so I doubt just running out of memory caused it this time.