
Why Does Your WebAssembly Linear Memory Still Hit an OOM Even When the Heap Is Half-Empty?
An investigation into why the lack of a moving garbage collector in WebAssembly leads to fatal heap fragmentation and how to audit your allocator's performance for long-running modules.
Imagine you’re running a heavy image-processing task in the browser via a WebAssembly module. You’ve allocated 512MB of linear memory. According to your internal telemetry, your live objects occupy only about 200MB. Yet the next time your code calls malloc (or its equivalent in Rust or Zig), the module throws a "RuntimeError: memory access out of bounds" or a memory.grow call fails.
You have 312MB of "free" space, but your application just died.
This isn't a bug in the browser's engine. It is the fundamental reality of WebAssembly’s linear memory model. In the world of JavaScript, we are spoiled by moving garbage collectors that compact the heap, sliding objects around to close the gaps. In WebAssembly, memory is a rigid, contiguous array of bytes. Once a block is allocated at index 0x1000, it stays at 0x1000 until it is explicitly freed—and even then, that specific "hole" in memory might be unusable for your next request.
The Contiguity Trap
WebAssembly memory is managed through two primary instructions: memory.size and memory.grow. When your allocator (like dlmalloc or wee_alloc) runs out of "free" space in the current heap, it asks the host environment for more pages (64KB chunks).
The problem is that malloc (or Box::new in Rust) must return a contiguous block of memory. If you ask for 10MB, the allocator must find a 10MB gap. If your heap looks like a piece of Swiss cheese—full of 1MB objects separated by 512KB gaps—you can have 100MB of total free space and still fail to satisfy a 2MB allocation request.
This is External Fragmentation.
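To make this concrete, here is a toy first-fit search over a free list of (offset, size) holes. This is a deliberately naive sketch, not a real allocator (real allocators also bin by size and coalesce), but it reproduces the exact failure described above: 200 scattered 512KB holes add up to 100MB of "free" space, yet no single hole can satisfy a contiguous 2MB request.

```rust
// Toy first-fit search over a free list of (offset, size) holes.
// Deliberately naive: a real allocator also bins by size and coalesces.
const KIB: usize = 1024;

// Return the offset of the first hole large enough for `want` bytes.
fn first_fit(holes: &[(usize, usize)], want: usize) -> Option<usize> {
    holes.iter().find(|&&(_, size)| size >= want).map(|&(off, _)| off)
}

// Build the Swiss-cheese layout from the text: 200 holes of 512KB,
// each separated by roughly 1MB of live objects.
fn swiss_cheese() -> Vec<(usize, usize)> {
    (0..200).map(|i| (i * 1536 * KIB, 512 * KIB)).collect()
}
```

Summing the holes in `swiss_cheese()` gives 100MB of total free space, yet `first_fit(&swiss_cheese(), 2 * 1024 * KIB)` returns `None`: every hole is too small for 2MB.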
In a traditional environment like a JVM or a modern JS engine (V8, SpiderMonkey), the Garbage Collector (GC) performs "compaction." It pauses execution, moves live objects next to each other, and updates all pointers to point to the new locations. WebAssembly linear memory cannot do this. The pointers used inside a Wasm module are just integer offsets into the linear memory. If the runtime moved an object from offset 100 to offset 50, it would have no way of knowing which integers in your program’s locals or stack are actually pointers that need updating.
Visualizing the Fragmentation Death Spiral
Let’s look at a simple Rust example that mimics a long-running process—something like a document editor or a game that constantly creates and destroys buffers.
```rust
// A simplified example of an allocation pattern
// that creates "Swiss Cheese" memory.
#[no_mangle]
pub extern "C" fn produce_fragmentation() {
    let mut fragments = Vec::new();
    for _ in 0..1000 {
        // 1. Allocate a "long-lived" small anchor
        let anchor = Box::new([0u8; 1024]); // 1KB

        // 2. Allocate a "short-lived" large temporary buffer
        {
            let _temp = Box::new([0u8; 1024 * 100]); // 100KB
            // _temp is dropped here, creating a 100KB hole
        }

        // 3. Keep the anchor alive
        fragments.push(anchor);
    }
    // At this point, we have 1000 small 1KB anchors
    // scattered across the heap, each separated by
    // a 100KB hole that was once occupied by _temp.
}
```

In this scenario, we have roughly 1MB of "live" data (the anchors). However, that data is spread across a range of ~101MB. If we now try to allocate a contiguous 500KB buffer, a naive allocator might fail to find a big enough hole, even though we have roughly 100MB of free space!
The allocator might then call memory.grow to get fresh space at the end of the heap. If this happens repeatedly in a long-running module, the linear memory grows toward the maximum limit (at most 4GB for 32-bit Wasm, though browsers may cap it lower), eventually hitting an OOM even when the actual data density is incredibly low.
The Choice of Allocator: A Double-Edged Sword
In the Wasm ecosystem, developers often choose allocators based on binary size. For example, wee_alloc was long the darling of the Rust/Wasm community because it adds only about 1KB to your .wasm file.
But wee_alloc is a "stop-gap" allocator, and the project is now officially unmaintained. It’s designed for environments where you do a few allocations and then the whole module is torn down. It is notoriously bad at reclaiming memory: it doesn't use "binning" or "coalescing" strategies (merging adjacent free blocks) as effectively as more robust allocators.
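Coalescing itself is simple to sketch: walk a free list sorted by offset and merge any hole that begins exactly where the previous one ends. The toy version below is illustrative only (it is not wee_alloc's or dlmalloc's actual code), but it shows why the strategy matters: two adjacent 50KB holes become one 100KB hole that can satisfy a larger request.

```rust
// Toy coalescing pass over an offset-sorted free list of
// (offset, size) holes: adjacent holes (one ends exactly where the
// next begins) merge into a single larger hole.
fn coalesce(holes: &[(usize, usize)]) -> Vec<(usize, usize)> {
    let mut out: Vec<(usize, usize)> = Vec::new();
    for &(off, size) in holes {
        match out.last_mut() {
            // The previous hole ends exactly at `off`: merge.
            Some((prev_off, prev_size)) if *prev_off + *prev_size == off => {
                *prev_size += size;
            }
            _ => out.push((off, size)),
        }
    }
    out
}
```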
If you are building a long-lived application—like a Figma-style editor or a video renderer—using wee_alloc is essentially a slow-motion OOM suicide pact.
Instead, you should use dlmalloc (the default for the wasm32-unknown-unknown target in Rust) or even mimalloc. These are heavier in terms of binary size but are significantly more "hygienic" regarding how they manage free lists.
Auditing Your Memory Layout
How do you know if fragmentation is what's killing you? You need to look inside the "black box" of linear memory. Since the browser’s DevTools generally show you the total size of the WebAssembly.Memory object and not the internal state of the C/Rust allocator, you have to instrument it yourself.
Here is a conceptual way to audit this in Rust by tapping into the allocator’s stats (if supported) or by creating a tracking wrapper. Since we can't easily peek into dlmalloc's internal bins without custom C code, a common trick is to use a global counter for "Requested vs. Reserved" memory.
```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

struct TrackingAllocator;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = System.alloc(layout);
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::SeqCst);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
        ALLOCATED.fetch_sub(layout.size(), Ordering::SeqCst);
    }
}

#[global_allocator]
static A: TrackingAllocator = TrackingAllocator;

#[no_mangle]
pub extern "C" fn get_live_data_size() -> usize {
    ALLOCATED.load(Ordering::SeqCst)
}
```

On the JavaScript side, you can compare this "Live Data" to the actual buffer size:
```javascript
const wasmMemory = instance.exports.memory;
const liveData = instance.exports.get_live_data_size();
const totalHeap = wasmMemory.buffer.byteLength;

console.log(`Utilization: ${(liveData / totalHeap * 100).toFixed(2)}%`);
console.log(`Wasted (Fragmented): ${totalHeap - liveData} bytes`);
```

If your utilization is 10% and you just hit an OOM, you have a fragmentation problem.
Strategies to Fight Fragmentation
If you've identified that fragmentation is the culprit, you can't just "turn on GC." You have to change how you think about data lifetimes.
1. Arena Allocation (Regional Allocation)
If you have a set of objects that are all created for a specific task (e.g., loading a frame of a video) and will all be destroyed at the same time, don't use the global heap for each one. Use an Arena.
An arena allocates a large block of memory upfront and then hands out pieces of it. When the task is done, the *entire arena* is wiped clean at once. This avoids creating holes in the global heap.
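A minimal bump-pointer arena is only a few lines. This hand-rolled sketch (not the typed-arena crate, which works at the typed-object level rather than raw bytes) hands out slices from one preallocated buffer and reclaims everything with a single reset:

```rust
// Minimal bump arena over a preallocated byte buffer: allocation is a
// pointer bump, and `reset` reclaims every allocation at once,
// leaving no holes behind in the global heap.
struct BumpArena {
    buf: Vec<u8>,
    used: usize,
}

impl BumpArena {
    fn with_capacity(cap: usize) -> Self {
        BumpArena { buf: vec![0; cap], used: 0 }
    }

    // Hand out `size` bytes from the top of the arena, or None if full.
    fn alloc(&mut self, size: usize) -> Option<&mut [u8]> {
        if self.used + size > self.buf.len() {
            return None;
        }
        let start = self.used;
        self.used += size;
        Some(&mut self.buf[start..start + size])
    }

    // Wipe the whole task's allocations in O(1).
    fn reset(&mut self) {
        self.used = 0;
    }
}
```

(A production arena would also handle alignment; this sketch skips it for brevity.)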
```rust
// Example using the 'typed-arena' crate logic
let arena = Arena::new();
for _ in 0..10000 {
    // These are allocated fast and close together
    arena.alloc(MyStruct { ... });
}
// Everything is reclaimed in one go when 'arena' drops
```

2. Object Pooling
For objects of the same size that are frequently created and destroyed (like particles in a simulation or nodes in a graph), use an object pool. Instead of freeing the memory back to the system allocator, you move the object to a "free list" within your app. The next time you need an object, you pull from the pool. Because the sizes are uniform, you never create mismatched holes.
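A sketch of the idea (the Particle and Pool names are illustrative, not a real crate): released objects go onto an in-app free list instead of back to the allocator, so the heap footprint stays flat and no mismatched holes appear.

```rust
// Hypothetical fixed-size object pool: "freeing" an object keeps its
// allocation alive on a free list, ready to be handed out again.
struct Particle {
    pos: [f32; 2],
    vel: [f32; 2],
}

struct Pool {
    free: Vec<Box<Particle>>, // recycled, uniformly-sized objects
}

impl Pool {
    fn new() -> Self {
        Pool { free: Vec::new() }
    }

    // Reuse a recycled object if one exists; only touch the
    // system allocator when the pool is empty.
    fn acquire(&mut self) -> Box<Particle> {
        self.free.pop().unwrap_or_else(|| {
            Box::new(Particle { pos: [0.0; 2], vel: [0.0; 2] })
        })
    }

    // Return the allocation to the pool intact instead of freeing it.
    fn release(&mut self, p: Box<Particle>) {
        self.free.push(p);
    }
}
```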
3. Separate "Hot" and "Cold" Memories
If your language/runtime supports multiple memories (a newer Wasm feature), you can theoretically segregate data. However, in standard Wasm, a more practical approach is to keep large, long-lived buffers (like the main state of your app) at the "bottom" of the heap and try to keep temporary "churn" at the "top."
4. The "Reset" Pattern
In some high-performance Wasm apps, the simplest solution to fragmentation is to periodically "reboot" the Wasm instance. If you can serialize the state, destroy the instance, create a new one, and deserialize, you effectively perform a manual compaction. It’s a nuclear option, but for some complex web apps, it's the only way to clear a 2GB mess of fragmented holes.
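The whole pattern hinges on a cheap serialize/restore round trip. A minimal sketch with a hand-rolled little-endian encoding (a real app would more likely reach for serde or a binary format; AppState and its fields are illustrative):

```rust
// Minimal serialize/restore round trip behind the "reset" pattern:
// the host reads these bytes out, tears down the instance, and feeds
// them back into a fresh instance with a compacted, empty heap.
#[derive(Debug, PartialEq)]
struct AppState {
    cursor: u32,
    zoom: f32,
}

impl AppState {
    fn serialize(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(8);
        out.extend_from_slice(&self.cursor.to_le_bytes());
        out.extend_from_slice(&self.zoom.to_le_bytes());
        out
    }

    fn deserialize(bytes: &[u8]) -> AppState {
        AppState {
            cursor: u32::from_le_bytes(bytes[0..4].try_into().unwrap()),
            zoom: f32::from_le_bytes(bytes[4..8].try_into().unwrap()),
        }
    }
}
```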
The "Discard" Proposal
There is light at the end of the tunnel. The memory.discard proposal (part of the broader Memory Control proposal) allows a Wasm module to signal to the host that certain pages are no longer needed. This doesn't help with fragmentation *within* the module's logical address space, but it tells the browser's OS-level memory manager that it can reclaim the physical RAM backing those pages.
However, even with memory.discard, your malloc will still see that address range as "occupied" or "dirty" until the allocator's internal logic decides otherwise.
Closing the Gap
WebAssembly is often sold as "C logic at near-native speeds," but we often forget that native C apps on Linux or Windows have the benefit of virtual memory managers and much larger address spaces to hide the sins of fragmentation. In the constrained, 32-bit-heavy world of current WebAssembly, memory management is a first-class architectural concern.
Next time your Wasm module crashes with half its heap empty, stop looking for a memory leak. Start looking for the holes you're leaving behind. Choose a robust allocator, embrace arenas for temporary data, and always keep an eye on the gap between what you're using and what you've reserved.


