j_g
November 19th, 2007, 12:45 PM
This thread is about why Linux has an "OOM killer" and what programmers can do to help get rid of it. If someone thinks it's about something else, he's wrong.
I'm going to try to explain this for the layman. That's what I like to do. If anyone thinks that the explanation sounds too simplistic, or omits lots of technical details, so be it. I just want it to be understandable to people who may not ordinarily be able to follow a discussion like this.
Linux has what some people have dubbed "an OOM Killer". The "OOM" stands for "Out Of Memory". What is Linux's "OOM Killer"? It's the part of the Linux operating system that springs into action when there's a really bad problem with some app trying to allocate/use memory. The bad problem is that there isn't enough free memory to satisfy how much memory the app now wants to allocate/use. And all the swap space on your hard drive has also already been filled up with previously swapped-out stuff. So, the operating system can't even free up some RAM by moving stuff out of RAM and into the swap space (often loosely called "virtual memory") on your hard drive.
In short, your system is completely out of free RAM, as well as free swap space. It's all used up... but now some app wants more RAM.
So, the Linux operating system (OOM Killer) selects some software currently running on your system, and abruptly terminates it, hoping that this will free up some RAM. There are a bunch of "rules" that the OOM Killer uses to pick which software is going to get killed. But it doesn't ask you, the end user, which software you would like killed. (Nor does it tell you that the system is low on memory and give you a chance to try some things yourself to free up RAM. For example, maybe you'd close some unneeded windows, choose which software programs to terminate, disconnect USB devices to see if that frees up RAM, stop services/daemons to see if that frees up RAM, etc.) No, Linux makes the choice for you, and that's that. Maybe it terminates a running copy of OpenOffice with a currently unsaved document you've been working on for the past two hours. Maybe. You never know. Obviously, it's a bad thing that the OS never gives you an opportunity to try to recover some RAM yourself, but instead terminates some software and makes the choice of which software for you. (P.S. Windows doesn't do this. Some other Unix derivatives, such as Solaris, don't either. I don't think the BSDs do. Not sure. Oh, and Mac OS is like Windows: it warns you about low memory and lets you take action.)
Why in hell does Linux do this? Well, we need to learn about some other aspects of Linux.
There is a function named fork(). It allows a program to essentially make another running copy of itself, except the new copy doesn't start back at main() -- it resumes right after the call to fork(), where the program typically branches into some bit of code that only the second copy is meant to run. Sometimes, programmers make that second bit of code do something trivial, which shouldn't require a lot of memory, nor copies of all the global variables/data in the program. Nevertheless, fork() promises the second copy its own private copy of all those global variables/data, just in case the second bit of code uses them. Linux doesn't know whether the code will or won't. But fork() is designed so that the bit of code may do that... so it must be supported. So, if the program has, for example:
char MyBuffer[60000];
... this means the second running copy of the program has access to its own 60000 byte copy of MyBuffer, even if that bit of code never even needs it at all.
That could be an awful lot of wasted RAM if the bit of code doesn't actually use it. So Linux doesn't really allocate actual RAM. It sort of says "I won't make actual RAM copies of that global stuff. But if that bit of code tries to access MyBuffer, the CPU's MMU will tell me about it, and I'll actually allocate RAM for the second copy right then and there. The app doesn't even need to know it happened. All the app has to do is try to write to MyBuffer, and bang, there's a new copy of it actually in RAM."
Ok, so Linux has delivered an IOU to the app for 60000 bytes right there.
Now there may be other forked apps as well, to which Linux has also issued outstanding IOUs for RAM.
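To make that concrete, here's a minimal sketch of the kind of fork() usage being described. It's not taken from any real program; the 60000-byte MyBuffer and the trivial child task are just for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

char MyBuffer[60000];   /* global data that fork() must (logically) give the second copy */

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");        /* fork itself can fail; check it */
        return EXIT_FAILURE;
    }
    if (pid == 0) {
        /* Second copy: a trivial bit of code.  It never touches MyBuffer,
           so with copy-on-write the kernel never has to find 60000 bytes of
           real RAM for this copy -- but it has promised (an IOU) that it could. */
        printf("child: doing something trivial\n");
        _exit(EXIT_SUCCESS);
    }
    /* Original copy: the moment either copy writes to MyBuffer, that page's
       IOU gets cashed in and real RAM has to be found for it. */
    MyBuffer[0] = 1;
    wait(NULL);
    return EXIT_SUCCESS;
}

The child here does nothing with MyBuffer, so its "copy" may never cost a single byte of real RAM -- which is exactly why the kernel is tempted to hand out the IOU so freely.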
Now, let's talk about what malloc() does. Let's say you do:
char *ptr;
ptr = malloc(60000);
Have you allocated 60000 bytes? Yes and no. You think you have. The operating system may tell you that you have (i.e., malloc returns something other than 0). But what Linux has done is write out another IOU for 60000 bytes. It hasn't actually allocated the RAM yet. Linux is going to wait until you actually write something to the buffer before it allocates RAM. So as soon as you do this:
ptr[0] = 0;
... then the CPU's MMU kicks Linux in the butt and says "Someone just cashed in your IOU. You told him he had 60000 bytes, and now he wants (some of) it. Give it to him". And so Linux looks for some free RAM to fulfill its IOU. It finds some, gives it to the MMU, and then your above instruction works, and you're now writing to actual, real RAM. (Well, that assumes everything goes well, which it may not on Linux.)
So, every time some app calls malloc(), Linux writes out an IOU, to be cashed in at any time by the app.
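Here's a small, self-contained sketch of that IOU being cashed in. The 100 MB size and the 4096-byte page stride are just assumptions for illustration; on a default Linux setup the malloc almost always "succeeds" regardless of how much RAM is actually free.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t size = 100 * 1024 * 1024;   /* 100 MB -- an arbitrary size for illustration */
    size_t i;
    char *ptr;

    /* Ask Linux for an IOU.  With the default overcommit behavior this will
       almost always return non-zero, whether or not there's RAM/swap to back it. */
    ptr = malloc(size);
    if (ptr == NULL) {                 /* it CAN still return 0 -- always check */
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    printf("malloc returned %p, but little real RAM is committed yet\n", (void *)ptr);

    /* Cash the IOU in: touching one byte per page (assuming 4096-byte pages)
       forces the kernel to find a real page for each one.  If it can't, the
       failure shows up here -- possibly as the OOM Killer -- not at malloc(). */
    for (i = 0; i < size; i += 4096)
        ptr[i] = 1;

    printf("all pages touched -- now the RAM is really in use\n");
    free(ptr);
    return EXIT_SUCCESS;
}

Watch the process in top while it runs and you'll see its resident size stay tiny after the malloc and only balloon during the touching loop.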
Now, this would be ok if Linux made sure that all its IOUs are covered by the amount of available RAM in the system, and the amount of available swap space. But it doesn't. Why?
Because there are too many Linux programmers who call fork() just to run a trivial bit of code. And as we've seen from "real experts, with known names and projects" who say things like "spending much time thinking about handling malloc() returning NULL is generally a waste of it", obviously there are lots of Linux programmers who take a lax attitude toward memory allocation. For example, some of them actually believe that malloc will never return a 0 on Linux, so they don't even check for that. Too many Linux programs ask malloc to write them out an IOU for lots and lots and lots more memory than they'll ever use (because, after all, it isn't allocated until you use it, so the hell with being reasonable with your request), and they leak memory all over because they're totally careless and nonchalant in their error handling.
So Linux makes the assumption that all Linux software is badly written when it comes to memory use. Linux assumes that a program will fork() to run a trivial bit of code that doesn't access all its global vars/data, that a Linux program will ask for more memory than it will ever actually use, and that it will do crummy error handling, such as not bothering to check whether malloc returns a 0 (so Linux will usually, though not always, return something other than 0 even when it really shouldn't). With these assumptions, Linux therefore gives out more IOUs than it has the capacity to fulfill. Linux hopes that not all those IOUs will be called in. This is called "over-committing". In other words, Linux makes more promises than it can keep (because the IOUs are for more than the available free RAM and available swap space).
So what happens if apps suddenly do start calling in enough of those IOUs that Linux finally realizes, "OMG! I've already given out all the available free RAM. And I've also already filled up the swap space, by writing stuff out in order to free up more RAM. What can I do? I already promised the app some RAM, and now that app is writing to that RAM at this very moment"? That's when the OOM Killer kicks in. Linux picks out some running software to abruptly terminate, and grabs back its RAM. "Ack!", says Bill the Cat, as he coughs up a furball.
So what can you, as a programmer, do to help this situation?
1) Do proper error checking. Check for a 0 (NULL) return from malloc. Don't leak memory. (See the sketch after this list.)
2) Don't ask for ridiculous amounts of RAM that you're not necessarily going to use, just because malloc allows you to ask for a lot of RAM without necessarily allocating it all at once. Ask for RAM in sensible increments.
3) Don't fork() to run trivial code, especially in a program with lots of resources that the secondary bit of code doesn't need. Try to use more versatile techniques such as pthreads (see the sketch after this list).
4) Tell kernel writers we need functions like Win32's VirtualAlloc, so the OS can inform us of a failure without ugly signal handling. Give apps more direct, simple control over the fulfilling of those IOUs, so we can better recover from a failure to fulfill, and the OOM Killer won't have to kick in.
5) Ask kernel and GUI developers to try to work together to come up with a system like Windows, where an end user is notified of a low memory condition and given the option to manually attempt to free up RAM in a way he prefers.
6) Get the word out to other developers about doing the above.
7) If you think that moderators are allowing discussion to be stifled/side-tracked, let them know.
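To make items 1 and 3 concrete, here's a minimal sketch. The buffer size and the do_trivial_work() function are made up for illustration, and it needs to be built with pthreads (something like gcc -pthread). The idea: check every malloc return and fail gracefully, only ask for what you'll actually use, and use a thread instead of fork() when all you want is a small side task, so the kernel doesn't have to hand out IOUs for a copy-on-write duplicate of the whole process.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>

/* A made-up trivial side task -- the kind of thing people often fork() for.
   A thread shares the process's memory, so no copy-on-write IOUs are needed. */
static void *do_trivial_work(void *arg)
{
    printf("worker: %s\n", (const char *)arg);
    return NULL;
}

int main(void)
{
    pthread_t worker;
    char *buf;

    /* Item 1: check every malloc return, only ask for what you'll use, don't leak. */
    buf = malloc(60000);
    if (buf == NULL) {
        fprintf(stderr, "out of memory: could not allocate 60000 bytes\n");
        return EXIT_FAILURE;           /* fail gracefully instead of crashing later */
    }
    memset(buf, 0, 60000);             /* we actually use what we asked for */

    /* Item 3: a pthread instead of fork() for a trivial bit of code. */
    if (pthread_create(&worker, NULL, do_trivial_work, "doing something trivial") != 0) {
        fprintf(stderr, "could not create worker thread\n");
        free(buf);
        return EXIT_FAILURE;
    }
    pthread_join(worker, NULL);

    free(buf);                         /* no leaks */
    return EXIT_SUCCESS;
}

None of this makes over-commit go away by itself, but if enough programs behave this way, the kernel has far less reason to assume the worst about all of us.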