[kwlug-disc] balancing CPU and memory for highly variable memory needs

Chris Frey cdfrey at foursquare.net
Sun Oct 20 01:16:51 EDT 2024


The Problem:
------------

Compilers these days use a lot of memory.  Compiling straight C isn't bad
at all, except in extreme corner cases, but compiling C++ and Rust
can easily take you into multi-gig scenarios.

For example, I've recently taken to compiling my own copy of Firefox.
I need to run this on a system with plenty of RAM, since a single
compiler process working on a single source file can use up to 10G,
depending on the code, and some files regularly need 5 to 8G.

Of course, this memory usage is very temporary.  If you have the memory,
it will be done compiling in about a minute or three, leaving all that
free RAM for the next compiler process.

My "big" system has 22G of RAM, and 8 threads on the CPU.  In theory
I should be able to run "make -j8" or even "make -j10" and be happy.
But I will only be happy if the build just happens to free that 10G
of RAM in time for a competing process to use it.

I have not always been lucky. :-)  Recently I was trying to get
something done with a few virtual machines running too, and the only
safe way to finish the compile without massive swapping was -j1.  Ouch.

That lack of RAM wastes a lot of CPU, even though many of the other
compile jobs only need 1 or 2G and could have kept running.


The Solution:
-------------

It occurred to me that I could create a special library of
malloc-related functions, loaded with LD_PRELOAD, which would
communicate with a central server to manage overall memory.  Each
request to allocate memory would send a message to the server with the
process ID and the size needed, and wait until permission was granted
before the allocation returned.
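
Something like this rough sketch in C is the shape of what I mean.
The socket path and the one-line "pid size" protocol are placeholders
I made up for the sketch, no such server exists yet, and re-entrancy
and error handling are glossed over.  I also cheat and only gate the
large requests, since stopping at every tiny malloc a compiler makes
would be far too slow.

    /* memgate.c -- LD_PRELOAD shim that asks a (hypothetical) memory
     * broker for permission before letting large allocations proceed.
     *
     * Build:  gcc -shared -fPIC -o memgate.so memgate.c -ldl
     * Use:    LD_PRELOAD=$PWD/memgate.so make -j8
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    static void *(*real_malloc)(size_t) = NULL;

    /* Ask the broker for permission to allocate 'size' bytes, blocking
     * until it replies.  The socket path is an invented placeholder.
     * Any failure (no server running, etc.) falls through, so a
     * missing broker never hangs the build. */
    static void ask_broker(size_t size)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return;

        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, "/tmp/membroker.sock",
                sizeof(addr.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            char msg[64], reply[8];
            int len = snprintf(msg, sizeof(msg), "%d %zu\n",
                               (int)getpid(), size);
            if (write(fd, msg, len) == len)
                (void)read(fd, reply, sizeof(reply)); /* wait for "OK" */
        }
        close(fd);
    }

    void *malloc(size_t size)
    {
        if (!real_malloc)
            real_malloc = dlsym(RTLD_NEXT, "malloc");

        /* Only bother the broker for big requests. */
        if (size >= 64UL * 1024 * 1024)
            ask_broker(size);

        return real_malloc(size);
    }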

This would allow for some interesting testing of memory-management
algorithms, and even some fun visuals as I watched memory requests and
usage across all hooked processes.  The central server could even
allow for human intervention, with limits adjusted on the fly.

The obvious first algorithm would be to reserve enough for the maximum
known compiler need, allow only one process at a time to climb the 10G
ladder, and let all the remaining processes that only need 1 or 2G to
finish keep going.  With 15G available, that might safely allow a
"make -j6" without overflowing RAM or hitting swap too hard.
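
In the same spirit, here is a rough cut of the broker's decision rule.
The names and numbers (budget, reserve, big_threshold) are
placeholders taken from the example above, not tuned values, and the
bookkeeping for releasing memory when a process frees it or exits is
left out; this is only the grant side.

    #define GiB (1024ULL * 1024 * 1024)

    static unsigned long long budget        = 15 * GiB; /* total to hand out     */
    static unsigned long long reserve       = 10 * GiB; /* worst single spike    */
    static unsigned long long big_threshold =  4 * GiB; /* above this is "big"   */
    static unsigned long long small_in_use  = 0;        /* granted to small jobs */
    static int big_job_pid = 0;                         /* one climber at a time */

    /* Return 1 to grant the request now, 0 to make the caller wait. */
    static int may_allocate(int pid, unsigned long long size)
    {
        if (size >= big_threshold) {
            /* Only one process at a time gets to climb the 10G ladder,
             * and it draws from the reserve, not the shared pool. */
            if (big_job_pid != 0 && big_job_pid != pid)
                return 0;
            big_job_pid = pid;
            return 1;
        }

        /* Everyone else shares what is left once the reserve is
         * set aside. */
        if (small_in_use + size > budget - reserve)
            return 0;

        small_in_use += size;
        return 1;
    }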


The Question:
-------------

Has anyone ever seen anything like this in the wild?  Has this been done
before?  Has it been done better than I've described it here?

I sure hope so... otherwise, it's yet another idea on my overflowing
todo list. :-)

Thanks,
- Chris



