Wednesday, May 29, 2013

The futility of catching std::bad_alloc

After debating the merits of testing for NULL after calling the operator new (see this post), the first reaction is to start coding like this:

   try {
      char* p = new char[256];
      ...many lines of code
   }
   catch( std::bad_alloc& b ) {
      // error handling
   }

For most applications out there, this is just clutter. It makes the code more difficult to read without providing any benefit that justifies the added lines of code (disclaimer: I understand that for some applications careful memory management is important - on the other hand, these types of applications will likely not use the global operator new).

It will also make the code bigger. Everything inside the catch block will generate object code, but will rarely, if ever, be executed. 

And, if it is executed, what exactly are we supposed to do inside the catch block?

It will rarely, if ever, be executed

Modern operating systems will allocate lots of virtual memory to a process before memory allocation fails. This is a test program that asks for 4 GB of memory (the size of the physical memory on the machine where I tested it).
Side note: The strcpy in the code is just to use the variable p for something. Without it, the compiler may optimize away the (unused) variable, which skips the memory allocation altogether.
   #include <new>
   #include <iostream>
   #include <string.h>

   using namespace std;

   int main(int argc, const char* argv[]) {
      for( int i = 0; i < 4000; i++ ) {
         try {
            cout << "Allocated " << i+1 << " MB so far" << endl;
            char* p = new char[1024 * 1024];
            strcpy( p, "test" );
         } catch (std::bad_alloc& b) {
            cout << "Finally failed" << endl;
            break;
         }
      }

      cout << "Press enter to exit";
      cin.ignore();
   }


And here is the result: the operating system (OS X 10.10 in this case) gladly gave 4 GB of memory to the program without ever throwing std::bad_alloc:

   ...
   Allocated 3998 MB so far
   Allocated 3999 MB so far
   Allocated 4000 MB so far

   Press enter to exit

And the operating system reports most of the memory as virtual memory.
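
One way to see that split programmatically on Linux is to read /proc/self/status and compare VmSize (address space) with VmRSS (resident memory). A small sketch - the helper below is mine, not part of the test program, and the field names are Linux-specific:

   #include <fstream>
   #include <iostream>
   #include <string>

   // Print the virtual (VmSize) and resident (VmRSS) sizes of this process.
   // Linux-specific: both fields come from /proc/self/status.
   void print_memory_usage() {
      std::ifstream status("/proc/self/status");
      std::string line;
      while (std::getline(status, line)) {
         if (line.compare(0, 7, "VmSize:") == 0 ||
             line.compare(0, 6, "VmRSS:") == 0) {
            std::cout << line << std::endl;
         }
      }
   }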



Once a process gets to that point, paging will be a serious problem. The process will run slowly, to the point of being useless (and will very likely affect other processes on that machine).

The exact behavior varies by operating system and the configuration of the machine. This is what happens on Ubuntu running under VirtualBox, configured with 2 GB of base memory:

   ...
   Allocated 3053 MB so far
   Allocated 3054 MB so far
   Allocated 3055 MB so far
   Finally failed
   Press enter to exit

Although in this case the operating system did eventually return an error, the point is the same: it will give gobs of memory to the process before it fails.
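
A side effect is that reproducing the failure for testing can take a long time. One way to get std::bad_alloc quickly, on Linux at least, is to cap the address space of the process before allocating. A sketch - the 256 MB limit is arbitrary:

   #include <sys/resource.h>
   #include <new>
   #include <iostream>

   int main() {
      // Cap the address space at 256 MB so operator new fails early (Linux)
      rlimit limit;
      limit.rlim_cur = 256 * 1024 * 1024;
      limit.rlim_max = 256 * 1024 * 1024;
      setrlimit(RLIMIT_AS, &limit);

      try {
         for (;;) {
            char* p = new char[1024 * 1024];
            p[0] = 'x';   // touch the block so it is not optimized away
         }
      } catch (std::bad_alloc&) {
         std::cout << "Got std::bad_alloc after hitting the limit" << std::endl;
      }
   }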

What exactly are we supposed to do inside the catch block?

If we do eventually get a std::bad_alloc, what exactly can or should we do inside the catch block?

The first answer is usually to log something and return an error:

   catch( std::bad_alloc& b ) {
      ...log error
      return ERR_FAILED_TO_ALLOCATE_MEMORY;
   }

There are a few problems with that:


  • If logging an error requires memory (e.g. allocating buffers for the logging system), it will also fail and throw std::bad_alloc.
  • What exactly is the caller function supposed to do with ERR_FAILED_TO_ALLOCATE_MEMORY? We are just passing the buck up the chain.


It is hard to find anything useful or guaranteed to work to put inside the catch block.

In most cases, the better alternative is to let the program die and restart it in a clean state. Anything else will most likely result in a program that limps along, paging most of the time, instead of doing useful work.
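
If we accept that failing fast is the answer, about the only safe thing a catch block can do is report the error without allocating anything and then terminate. A minimal sketch, assuming a POSIX write() is available (iostream or a logging framework may itself need memory):

   #include <new>
   #include <cstddef>
   #include <cstdlib>
   #include <unistd.h>

   char* allocate_or_die(std::size_t size) {
      try {
         return new char[size];
      } catch (std::bad_alloc&) {
         // write() does not allocate, unlike most logging code
         static const char msg[] = "out of memory - aborting\n";
         write(STDERR_FILENO, msg, sizeof(msg) - 1);
         // Die here and let a supervisor restart the process in a clean state
         abort();
      }
   }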

What about a new-handler function?

This is the point where a well-informed C++ programmer will ask if we can do anything useful by installing a new-handler.

But that again raises the question: what is "useful" under these circumstances? The new-handler can release some memory so that the failed call to operator new can retry and succeed.

There are again problems with that:

  • Where does the memory come from? It is probably from a stash the program saved earlier on (a sketch of this pattern follows the list). The stash cannot be all that big, or it will just make the problem happen sooner.
  • Does it (really) solve the problem? If we have a real leak, releasing a stash of memory we saved earlier will just let the program run a little longer and then fail yet again.
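
For reference, this is roughly what the release-a-stash pattern looks like; when the handler frees memory and returns, operator new retries the allocation. A sketch only - the 10 MB stash size is arbitrary:

   #include <new>
   #include <cstdlib>

   // Emergency stash, released on the first failure so the pending new can retry
   static char* stash = new char[10 * 1024 * 1024];

   void release_stash() {
      if (stash) {
         delete[] stash;      // operator new retries after this handler returns
         stash = 0;
      } else {
         std::abort();        // nothing left to release - give up instead of looping
      }
   }

   int main() {
      std::set_new_handler(release_stash);
      // ... allocate as usual; the first failure is absorbed by the stash
   }
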
If there is anything useful to be done in a new-handler, it is to clearly document the problem we hit so we can more easily debug it later.

This usually means two things:
  • Log a clear error message. Logging the message itself may require some memory, so we need to save a small amount of memory and release it later to ensure our logging functions work.
  • Generate a core dump file for systems that support it. This one can be tricky because the core dump will be very large (it has the entire memory footprint for the process, which by now must be large, since we just ran out of memory).
Here is a sample program that installs a new-handler, reserves a bit of memory up front to ensure the log works, and generates a core dump by calling abort():

   #include <new>
   #include <iostream>
   #include <string.h>
   #include <stdlib.h>

   using namespace std;

   static char* p = new char[1024*10];

   void new_failed() {
      // Make sure we have some memory to work with in this function
      delete[] p;
      // Log a clear message about the problem we hit
      cout << "Ran out of memory" << endl;
      // Will generate a core dump in some systems - useful to debug later
      // ...but be careful because it will likely be a huge file because of
      // the amount of memory allocated so far
      abort();
   }

   int main(int argc, const char* argv[]) {
      set_new_handler(new_failed);

      for( int i = 0; i < 4000; i++ ) {
         try {
            cout << "Allocated " << i+1 << " MB so far" << endl;
            char* p = new char[1024 * 1024];
            strcpy( p, "test" );
         } catch (std::bad_alloc& b) {
            cout << "Finally failed" << endl;
            break;
         }
      }

      cout << "Press enter to exit";
      cin.ignore();
   }


This is what happens when it fails:

   ...
   Allocated 3053 MB so far
   Allocated 3054 MB so far
   Allocated 3055 MB so far
   Ran out of memory

   Aborted (core dumped)


Tools for memory leak detection

A process that runs out of memory has a memory leak in 99.999% of the cases.

There are many great tools out there to find memory leaks. My favorite is valgrind (side note: here is how to pronounce the name of that tool).

If you are developing on a Mac with Xcode, run your project with Product > Profile and choose the Leaks option under Memory.
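
If you want something small to try these tools on, here is a deliberately leaking program; running it under valgrind --leak-check=full will point at the allocation inside leak(). The program is a throwaway sketch:

   #include <string.h>

   void leak() {
      // Allocated but never deleted - exactly what leak detectors report
      char* p = new char[1024];
      strcpy( p, "leaked" );
   }

   int main() {
      for( int i = 0; i < 100; i++ ) {
         leak();
      }
   }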

4 comments:

  1. FYI your example code is invalid, as you can't strcat to memory returned by operator new. If that works then it's a coincidence. On most platforms that will crash. I suggest strcpy instead.

    Replies
    1. Would you mind giving a reference to where it says that strcat can't be used with memory allocated with the operator new, but strcpy works?

      To the best of my knowledge, the global operator new behaves like malloc() for primitive data types (the char array in this case). With that in mind, all strxxx functions should work fine.

      Or do you mean that the memory wasn't initialized, therefore strcat can't be used? That's a problem in this code. I changed it to strcpy to avoid the uninitialized memory error.

      If that's what you meant to point out, thanks for finding and reporting it.

  2. This comment has been removed by the author.

  3. Some details that do not undermine your message:
    - new, malloc, etc do not allocate memory, they allocate address space nowadays. Memory is allocated as-needed. So you're only using at most 4K per allocation (one page, and that's if the compiler didn't drop the strcpy), and the limit you're hitting is a combination of address space size and user limits. E.g. the ubuntu is probably in 32-bits mode, which technically gives you a 3G limit due to the way the memory is laid out. While osx is in 64 bits mode, where the limit is insane.
    - memory being allocated as-needed, there's no mechanism to tell the application when memory actually runs out. It gets killed hard without recourse by the appropriately named "Out-Of-Memory Killer"
    - given how the (lack of) limit works, getting such an exception means different things than 20 years ago. It means that either you're in 32 bits and you're actually out of space (but that's getting rare thankfully), or more probably the requested size is wrong. Which often means a value coming from corrupted external data (either intentionally or not) or broken computations involving small negative values turning unsigned. Both very interesting cases, but not out-of-memory problems.

    OG.
