• 0 Posts
  • 15 Comments
Joined 1 year ago
Cake day: June 6th, 2025

  • LRU inversion: the problem with not caring about it is that it’s not a visible problem until it very suddenly is. Your system will not gradually degrade but very suddenly and unpredictably hit a wall that it cannot get itself over.

    All this talk just confirms my feelings that there is a general lack of understanding of actual modern workloads.

    RAM (with or without zram) doesn’t fill up and then stay full forever in real workloads. Not only is that unrealistic at the “opened apps”/“running processes” level, it’s also unrealistic at the heap allocation level within tasks within processes. This is even more pronounced with code written in modern languages like Rust and in some styles of C++. Modern heap allocators batch and cache (primarily to help with performance), but A LOT of memory still gets allocated and deallocated all the time, even from the kernel’s PoV.

    LRU itself is an imperfect approximation, not a goal. In the setup described in my other comment (fast SSD swap storage that is only used sparingly most of the time), so-called LRU inversion gets auto-cancelled relatively quickly: free space in RAM (+zram) becomes available all the time, some “LRU-hot” pages in SSD swap turn out to be actually cold, and those are the only ones that stay there.

    This is why I would imagine that a lot of fake scenarios, and the “benchmarks” based on them, may fail to replicate the practical reality of many (overall system) use-cases.


    More tangentially, the oversized concern for file caching pages also suggests that specific use-cases were in mind, as if everyone is running DB-centric workloads or something.


  • This is not a good thing btw. Any unused anonymous page takes up space that could instead be used for file-backed pages that make your system faster.

    Can you expand here? I think my attempt at brevity in this part wasn’t helpful.

    Swap is not tiered storage!

    I meant tiered with priorities only, yes.

    Cool tech but it’s dead and was quite niche even when it was alive.

    We are not talking about the original purpose of Optane as supported on Windows. It’s just a (perhaps somewhat outdated) example of a storage device that is “smaller but faster than your average SSD storage”, which is very much not dead tech.

    Not a thing you actually want to use for swap

    Depends on the use-case. But yes, this can also be used as the fastest disk tier/priority of normal swap devices, which is why I mentioned both.

    This makes no sense at all unless you are extremely space-constrained on the NVMe and absolutely must not OOM – even if progress stalls to an absolute crawl.

    Why would you want to see killed processes when you go back to your workstation, in the 1/10000th scenario where something runs amok pushing memory usage to unexpected high levels? When you can simply investigate the reason behind the rare occurrence, then move all the pages off the slowest devices immediately with swapoff?
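For the "investigate" part, a minimal sketch of checking which swap devices actually hold pages. It assumes the usual `/proc/swaps` column layout (Filename, Type, Size, Used, Priority, sizes in KiB); the function name is made up for illustration.

```shell
# Prints "device used_KiB priority" for every swap device that currently
# holds pages. Reads /proc/swaps by default, or a file in the same format.
used_swap_devices() {
  # NR > 1 skips the header line; $4 is the "Used" column.
  awk 'NR > 1 && $4 > 0 { print $1, $4, $5 }' "${1:-/proc/swaps}"
}
```

Running `used_swap_devices` with no arguments on a live system shows whether the slow tier has been touched at all.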



  • Alright, I will only reply to you, since you raised a fair question.

    First of all, I must admit that I thought what was linked was an earlier, similar piece of writing, but the general theme is still the same.

    The problem with the writing is that it focuses on use-cases like Android and some servers, but doesn’t take into account other use-cases. It also seems to come with the assumption that setup is done by the distributor only, or if it’s done by the user, it’s a configure-and-forget situation.

    What he describes is:

    • Limited RAM space
    • Swap will always/often happen (outside of (z)ram)
    • Single tier of non-RAM swap
    • Non-RAM swap is significantly slower
    • OOM can be preferable over (outside of RAM) swapping
    • Swapped out pages stay where they are until they are required by their process (important).

    Now let’s look at a possible modern workstation setup:

    • Large RAM size
    • Swap is rarely hit, especially if set up with zram.
    • Multiple swap tiers beyond zram/zswap
      • Intel Optane disk used as a super-fast zram write-back device, or a high-priority swap
      • Fast NVMe disk used as a second tier swap disk
      • Large HDD swap partition used as a third tier swap disk
    • The biggest consideration is avoiding worst case latency, i.e. hitting HDD swap.
    • Killing processes MUST be avoided, unless exceptional circumstances are hit where the kernel’s OOM would kick in anyway. This holds true even when HDD swap starts getting used.
    • When unusual loads are observed, swapped pages can be moved around by the user (or a tool), by turning swap devices off and on. This is how you can empty the HDD swap partition for example.
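The last point in the list above can be sketched roughly as follows. The device path and priority are placeholders, and `drain_swap` is a made-up helper name. `swapoff` makes the kernel pull every page off that device (back into RAM, or onward to the remaining swap devices if memory is tight), and `swapon` then re-adds the device empty.

```shell
# Hypothetical HDD swap partition and priority; adjust to your setup.
HDD_SWAP=/dev/sda2
HDD_PRIO=10

# Empties a swap device by turning it off and on again.
# Prints the commands by default; set RUN=1 to execute them (needs root).
drain_swap() {
  for cmd in "swapoff $1" "swapon -p $2 $1"; do
    if [ "${RUN:-0}" = 1 ]; then
      $cmd || return 1
    else
      echo "$cmd"
    fi
  done
}

drain_swap "$HDD_SWAP" "$HDD_PRIO"
```

Note that `swapoff` can take a while, or fail, if there is not enough room elsewhere for the pages it has to move.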

    This last point in particular should make it clear why his “imagination” was rather limited in his LRU inversion section.





  • Why do you think 32GiB is special compared to 16GiB?
    And wtf is EasyOOM?

    You maximize the usefulness of zram by actually increasing swappiness, and giving zram devices high priority, e.g.

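    # Make the kernel more willing to swap out anonymous pages.
    sysctl vm.swappiness=100
    
    # Assumes the zram devices already exist and were formatted with mkswap.
    # 32767 is the highest swap priority.
    for i in {1..8}; do
      swapon /dev/zram${i} -p 32767
    done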
    

    Then you enable other swap devices with lower priority.
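A rough sketch of that second step, assuming a hypothetical NVMe swap partition and a hypothetical HDD swap partition. The paths, the function name, and the priority values are placeholders; the only thing that matters is that each priority is lower than the zram one and that the faster device gets the higher number.

```shell
# Hypothetical device paths; replace them with your own swap partitions.
NVME_SWAP=/dev/nvme0n1p3
HDD_SWAP=/dev/sda2

# Enables the slower tiers with descending priorities (100, then 50).
# Prints the commands by default; set RUN=1 to execute them (needs root).
enable_lower_tiers() {
  prio=100
  for dev in "$NVME_SWAP" "$HDD_SWAP"; do
    if [ "${RUN:-0}" = 1 ]; then
      swapon -p "$prio" "$dev"
    else
      echo "swapon -p $prio $dev"
    fi
    prio=$((prio - 50))
  done
}

enable_lower_tiers
```

The kernel fills higher-priority swap devices first, so the HDD only gets touched once zram and the NVMe partition are full.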

    This is the way regardless of how much RAM you have. I mean, it may be pointless if you never ever exceed, let’s say 10/32GiB (including caching). But it still wouldn’t be harmful in any way.







  • You are in a thread where a user is having a problem because of the push for flatpaks, and because of some distros like Fedora crippling their packages and providing objectively worse alternatives on purpose (because they don’t want to risk RH IBM getting sued). If the user was using some sane community distro like Arch, the user would have never come to realize that such unnecessary issues even exist.

    As for flatpak hate specifically, see my ramblings here.