Improving on .NET Memory Management for Large Objects - CodeProject


As documented elsewhere, .NET memory management consists of two heaps: the Small Object Heap (SOH) for most allocations, and the Large Object Heap (LOH) for objects of 85,000 bytes (roughly 85 KB) or larger.

The SOH is garbage collected and compacted in clever ways I won't go into here.

The LOH, on the other hand, is garbage collected, but not compacted. If you're working with large objects, then listen up! Since the LOH is not compacted, you'll find that either:

  1. you're running an x86 program and you get OutOfMemory exceptions, or
  2. you're running x64, and memory usage becomes unacceptably high. Welcome to the world of a fragmented heap, just like the good old C++ days.

So where does this leave us? Let's code our way out of this jam!

If you work with streams much, you know that MemoryStream uses an internal byte array and wraps it in Stream clothing. Memory shenanigans made easy! However, if a MemoryStream's buffer gets big, that buffer ends up on the LOH, and you're in trouble. So let's improve on things.

Our first attempt was to create a Stream-like class that used a MemoryStream up to 64 KB, then switched to a FileStream with a temp file after that. Sadly, disk I/O killed the throughput of our application. And just letting MemoryStreams grow and grow caused unacceptably high memory usage.

So let's make a new Stream-derived class, and instead of one internal byte array, let's go with a list of byte arrays, none large enough to end up on the LOH. Simple enough. And let's keep a global ConcurrentQueue of these little byte arrays for our own buffer recycling scheme.
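The chunk-list idea can be sketched like this (a simplified, hypothetical version; the attached BcMemoryStream implements the full Stream contract on top of it):

```csharp
using System;
using System.Collections.Generic;

// Sketch of the chunk-list idea behind BcMemoryStream. Each chunk is a
// small array that stays safely below the LOH threshold, so no single
// allocation ever lands on the Large Object Heap.
class ChunkedBuffer
{
    const int ChunkSize = 8 * 1024; // 8 KB chunks, well under 85,000 bytes

    readonly List<byte[]> chunks = new List<byte[]>();
    long length;

    public long Length => length;

    public void Write(byte[] src, int offset, int count)
    {
        while (count > 0)
        {
            int chunkIndex = (int)(length / ChunkSize);
            int chunkOffset = (int)(length % ChunkSize);

            // Grow the chunk list on demand; growth never copies old data.
            while (chunks.Count <= chunkIndex)
                chunks.Add(new byte[ChunkSize]);

            int toCopy = Math.Min(count, ChunkSize - chunkOffset);
            Buffer.BlockCopy(src, offset, chunks[chunkIndex], chunkOffset, toCopy);

            offset += toCopy;
            count -= toCopy;
            length += toCopy;
        }
    }

    public int Read(long position, byte[] dest, int offset, int count)
    {
        int read = 0;
        while (count > 0 && position < length)
        {
            int chunkIndex = (int)(position / ChunkSize);
            int chunkOffset = (int)(position % ChunkSize);
            int toCopy = (int)Math.Min(Math.Min(count, ChunkSize - chunkOffset),
                                       length - position);
            Buffer.BlockCopy(chunks[chunkIndex], chunkOffset, dest, offset, toCopy);
            position += toCopy; offset += toCopy; count -= toCopy; read += toCopy;
        }
        return read;
    }
}
```

Unlike a MemoryStream, growing past capacity never reallocates and copies one big array; it just adds another small chunk.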

MemoryStreams are really useful for their dual role of stream and buffer so you can do things like...

string str = Encoding.UTF8.GetString(memStream.GetBuffer(), 0, (int)memStream.Length);

So let's also work with MemoryStreams, and let's keep another ConcurrentQueue of these streams for our own recycling scheme. When a stream comes back too big to recycle, we chop it down before enqueuing it. As long as a stream stays under 8X the little buffer size, we let it ride, so the folks requesting streams get something with a Capacity between 1X and 8X the little buffer size. If a stream's buffer has ended up on the LOH, chopping it down at recycle time pulls it back to the SOH.
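A minimal sketch of this stream-recycling scheme (the names, sizes, and queue limit here are illustrative, not the attached library's API):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Sketch of the MemoryStream recycling scheme: a global queue of reusable
// streams, with oversized streams chopped down at free time.
static class StreamPool
{
    const int BufferSize = 8 * 1024;          // the "little buffer" size
    const int MaxStreamSize = 8 * BufferSize; // 64 KB cap, below the LOH threshold
    const int MaxQueueLength = 1000;

    static readonly ConcurrentQueue<MemoryStream> streams =
        new ConcurrentQueue<MemoryStream>();

    public static MemoryStream Alloc()
    {
        if (streams.TryDequeue(out MemoryStream stream))
        {
            stream.Position = 0;
            stream.SetLength(0);
            return stream;
        }
        return new MemoryStream(BufferSize);
    }

    public static void Free(MemoryStream stream)
    {
        if (streams.Count >= MaxQueueLength)
            return; // queue is full; let normal GC have it

        // Chop an oversized stream down: swap in a fresh small stream so
        // the next user gets an SOH-backed buffer, and let the GC take
        // the LOH-resident one.
        if (stream.Capacity > MaxStreamSize)
            stream = new MemoryStream(BufferSize);

        streams.Enqueue(stream);
    }
}
```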

Finally, for x86 apps, you should have a timer periodically run garbage collection with LOH compaction. The CompactOnce setting only applies to the next blocking garbage collection, so pair it with a collect:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();

We run this once a minute for one of our x86 applications. It only takes a few milliseconds to run, and, in conjunction with buffer and stream recycling, memory usage stays manageable.
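Wiring that up might look like this (the class name and interval here are ours, not part of the library; requires .NET 4.5.1 or later for LargeObjectHeapCompactionMode):

```csharp
using System;
using System.Runtime;
using System.Threading;

// Once-a-minute LOH compaction. CompactOnce is a one-shot setting that
// applies to the next blocking gen-2 collection, so the callback forces
// one; the runtime then resets the mode back to Default.
class LohCompactor
{
    readonly Timer timer; // held in a field so the timer isn't collected

    public LohCompactor()
    {
        timer = new Timer(_ =>
        {
            GCSettings.LargeObjectHeapCompactionMode =
                GCLargeObjectHeapCompactionMode.CompactOnce;
            GC.Collect();
        }, null, TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1));
    }
}
```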

Check out the attached class library for the Stream-derived class, BcMemoryStream, and the overall memory manager, BcMemoryMgr. There's a third class, an IDisposable class called BcMemoryStreamUse, which manages recycling of MemoryStreams.

You just tell BcMemoryMgr how big you want the little buffers to be and how many buffers and MemoryStreams to recycle. A buffer size of 8 KB is good because 8 X 8 KB -> 64 KB max size for MemoryStreams, which is under the 85,000-byte LOH threshold. You have to make peace with the max number of objects to keep in the recycling queues. For an x86 program with 8 KB buffers and heavy MemoryStream use, you might not want to allow a worst case of 8 X 8 KB X 10,000 -> 640 MB to get locked up in this system. With a maximum queue length of 1,000, you're only committing to a max of 64 MB, which seems a low price to pay for buffer and stream recycling. To be clear, the memory is not pre-allocated; the max count is just how large the recycling ConcurrentQueues can get before buffers and streams are let go for normal garbage collection.

Let's look at each class in detail.

Let's start with BcMemoryMgr. It's a small static class. It has ConcurrentQueues for recycling buffers and streams, the buffer size, and the max queue length.

Looking at the member functions, you Init with the buffer size and max queue length, and it calls a SelfTest routine that tests the class library. If the tests don't pass, the code doesn't run...poor man's unit testing. Note that you can specify a zero max queue length, in which case you'll get no recycling, just normal garbage collection. There are buffer functions AllocBuffer and FreeBuffer ... anybody remember malloc and free? There are stream functions AllocStream and FreeStream. FreeStream chops the stream down if it's too big before enqueuing it for reuse.
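A stripped-down sketch of what the AllocBuffer/FreeBuffer pair boils down to (a hypothetical minimal version; the attached BcMemoryMgr adds SelfTest and the stream functions):

```csharp
using System.Collections.Concurrent;

// Minimal buffer pool: Init sets the sizes, AllocBuffer/FreeBuffer are the
// malloc/free of the scheme, backed by a lock-free ConcurrentQueue.
static class BufferPool
{
    static int bufferSize;
    static int maxQueueLength;
    static readonly ConcurrentQueue<byte[]> buffers = new ConcurrentQueue<byte[]>();

    public static void Init(int size, int maxCount)
    {
        bufferSize = size;
        maxQueueLength = maxCount; // zero disables recycling entirely
    }

    public static byte[] AllocBuffer()
    {
        return buffers.TryDequeue(out byte[] buffer) ? buffer : new byte[bufferSize];
    }

    public static void FreeBuffer(byte[] buffer)
    {
        // Only recycle correctly-sized buffers, and only up to the cap;
        // everything else is dropped for normal garbage collection.
        if (buffer != null && buffer.Length == bufferSize && buffers.Count < maxQueueLength)
            buffers.Enqueue(buffer);
    }
}
```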

BcMemoryStream is Stream-derived and implements pretty much the same interface as MemoryStream. One notable exception is that you cannot set the Capacity property. Instead, there is a Reset function you can call to free all buffers in the class, returning the Capacity to zero. The fun code is in Read and Write; Buffer.BlockCopy came in handy. There are also extension routines for working with strings; these were handy when writing BcMemoryMgr's SelfTest routine.
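The string helpers could look roughly like this (hypothetical names and signatures, written here against plain Stream so the sketch stands alone):

```csharp
using System.IO;
using System.Text;

// Hypothetical string round-trip extensions in the spirit of the ones
// mentioned above: write a string as UTF-8, read the whole stream back.
static class StreamStringExtensions
{
    public static void WriteString(this Stream stream, string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        stream.Write(bytes, 0, bytes.Length);
    }

    public static string ReadToString(this Stream stream)
    {
        stream.Position = 0;
        // leaveOpen: true so reading doesn't dispose the caller's stream
        using (var reader = new StreamReader(stream, Encoding.UTF8, false, 1024, true))
            return reader.ReadToEnd();
    }
}
```

Helpers like these make a self-test easy: write a known string, read it back, compare.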

BcMemoryStreamUse is a small IDisposable class that uses BcMemoryMgr to allocate a MemoryStream in its constructor and free it in its Dispose method. Stream recycling is fun and easy!
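The shape of the idea, with a tiny inline pool standing in for BcMemoryMgr (class and member names here are hypothetical):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// An IDisposable that checks a MemoryStream out of a recycling queue in
// its constructor and returns it in Dispose, so a using block can't leak
// a stream out of the pool, even when an exception is thrown.
class RecycledStreamUse : IDisposable
{
    static readonly ConcurrentQueue<MemoryStream> pool =
        new ConcurrentQueue<MemoryStream>();

    public MemoryStream Stream { get; private set; }

    public RecycledStreamUse()
    {
        if (!pool.TryDequeue(out MemoryStream stream))
            stream = new MemoryStream();
        stream.SetLength(0);
        Stream = stream;
    }

    public void Dispose()
    {
        if (Stream != null)
        {
            pool.Enqueue(Stream); // back to the pool for the next caller
            Stream = null;
        }
    }
}
```

Typical usage: `using (var use = new RecycledStreamUse()) { use.Stream.WriteByte(1); }` and the stream goes back to the pool automatically.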

BcMemoryBufferUser is similar to BcMemoryStreamUse, just for byte arrays. Buffer recycling is fun and easy!

BcMemoryTest puts BcMemoryStream and MemoryStream to the test, hashing all files in a directory and its subdirectories. For each file, it starts with a FileStream, then CopyTo's into either a MemoryStream or a BcMemoryStream, then does the hashing. In our tests, we use a leafy and varied test directory with lots of built binaries. The total run time with BcMemoryStream was about 25% faster than with MemoryStream. So not only did we solve the LOH problem, we got a better end result. Hooray!
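The shape of the benchmark is simple (a hypothetical stand-in; swap the MemoryStream for a BcMemoryStream to compare the two):

```csharp
using System.IO;
using System.Security.Cryptography;

// Hash every file under a root directory: FileStream -> CopyTo an
// in-memory stream -> hash. This is the hot path where buffer growth
// (and LOH allocations) would show up as time and memory.
static class DirectoryHasher
{
    public static byte[] HashFile(string path)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var mem = new MemoryStream()) // or a BcMemoryStream
        using (var sha = SHA256.Create())
        {
            file.CopyTo(mem);
            mem.Position = 0;
            return sha.ComputeHash(mem);
        }
    }

    public static void HashAll(string root)
    {
        foreach (string path in Directory.EnumerateFiles(
                     root, "*", SearchOption.AllDirectories))
            HashFile(path);
    }
}
```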

Finally, there's a little config file addition that is absolutely necessary for server applications:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>

This gives you server garbage collection: a heap per logical processor, dedicated GC threads, etc. A real lifesaver.

Hope this helps!