Don T. was tasked with adding support for processing transactions against new reference data to his company's flagship product. After some thought, Don decided that he would create a DAO to load the new reference data from the database, and store it in another sub-cache in the singleton master-cache of the application. Further, he would grab a handle to that sub-cache and pass it around to anything that needed it. The actual processing would be done by a remote web service. There would be approximately 100 megabytes of new reference data. He wrote and tested the code. QA tested and blessed the code. It was deployed.

Within an hour, the application died with an OutOfMemoryError.

Whoops! Don forgot to adjust the configured heap size for the application to account for the extra data. Simple enough. Forms were filed. Approvals were received. The configuration was changed from 4 to 5 gigabytes in production and the application was restarted.

Several hours later, the application died with an OutOfMemoryError.

Now what?

While Don was walking through his code, the configuration was adjusted to provide a 16 gigabyte heap to have some breathing room while the problem was being researched.

Two days later, the application died with an OutOfMemoryError, and puked a 20 gigabyte heap-profile file.

Hmm...

Don fired up JProbe, waited an eternity while it parsed and loaded the heap profile, and discovered that the new reference data occupied about 100 megabytes, as expected. Unfortunately, he also discovered that the cached web service responses had consumed 12.2 gigabytes.

class RefDataDAO {
    public Map<Long,NewReferenceData> getAllRefData() {
        Map<Long,NewReferenceData> map = new HashMap<Long,NewReferenceData>();
        // Load stuff from database
        return map;
    }
}
class NewReferenceData implements Serializable {
    private long id;
    public void setId(long id) {
        this.id = id;
    }
    public long getId() {
        return id;
    }
    // ...
}
class WebServiceResult implements Serializable {
    private long id;
    private Map<Long,NewReferenceData> localRefDataCacheHandle;
    // ...
    public WebServiceResult(Map<Long,NewReferenceData> cache, long id  /* , ... */  ) {
        this.id = id;
        localRefDataCacheHandle = cache;
    }

    public long getId() {
        return id;
    }

    // ...
}
class Cache {
    private static Cache instance = new Cache(); // singleton
        
    public  static Cache getInstance() {
        return instance;
    }

    private Cache() {
        try {
            masterCache.put("NewReferenceData", new RefDataDAO().getAllRefData());
            masterCache.put("WebServiceResult", new HashMap<Long,WebServiceResult>());
        } catch (Exception e) {
            // log error
            System.exit(1);
        }
    }

    // key=sub-cache-name, value=sub-cache: key=id, value=<whatever>
    private Map<String, Map<Long, ? extends Object>> masterCache = new HashMap<String, Map<Long, ? extends Object>>();

    public Map<Long,WebServiceResult> getWebServiceResultMap() {
        return (Map<Long,WebServiceResult>) masterCache.get("WebServiceResult");
    }

    public Map<Long,NewReferenceData> getNewReferenceDataMap() {
        return (Map<Long,NewReferenceData>) masterCache.get("NewReferenceData");
    }
}
class WebService {   // most details omitted for clarity
    public void doWork(Map<Long,NewReferenceData> cache  /* , ... */  ) {
        // OP: This puts the cache into the result that it returns
        WebServiceResult wsr = theWebService.doWork(cache  /* , ... */ );
        Cache.getInstance().getWebServiceResultMap().put(wsr.getId(), wsr);
    }
}

... and the relevant portion of the web service implementation (coincidentally implemented in Java):

public WebServiceResult doWorkImpl(Map<Long,NewReferenceData> refDataCache /* , ... */ ) {
    // Crunch data here

    long idPK = ... ; // OP: from a database sequence

    // OP: Construct the return parameter with a reference 
    // OP: to the local copy of the serializable cache
    WebServiceResult wst = new WebServiceResult(refDataCache, idPK  /* , ... */ );

    // OP: Pass back a tiny structure with a 100MB serializable cache,
    // OP: which will be serialized here, and deserialized in the calling JVM
    return wst;
}

For those unfamiliar with Java, Serializable is a marker interface which tells the JVM that an object may be written to a stream, along with everything it references (except transient and static fields).
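
To make that concrete, here is a minimal sketch of a serialization round trip using plain java.io object streams. The names Envelope and SerializationDemo are hypothetical stand-ins and not part of Don's code, and a real web service stack may use a different wire format, but the effect on the object graph is the same: the small object drags its whole map along, and what comes out the other end is a separate copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WebServiceResult: a tiny object that
// holds a reference to a (potentially huge) map.
class Envelope implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final Map<Long, String> refData;   // non-transient, so it is serialized too

    Envelope(long id, Map<Long, String> refData) {
        this.id = id;
        this.refData = refData;
    }
}

class SerializationDemo {
    // Write the object and everything reachable from it into a byte array.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    // Rebuild a fresh object graph from the bytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<Long, String> refData = new HashMap<>();
        for (long i = 0; i < 1000; i++) {
            refData.put(i, "entry-" + i);
        }

        Envelope original = new Envelope(42L, refData);
        Envelope copy = (Envelope) fromBytes(toBytes(original));

        System.out.println(copy.refData.equals(refData)); // true: same contents
        System.out.println(copy.refData == refData);      // false: a brand new map
    }
}
```

The second line of output is the important one: the deserialized map has the same contents but is a different object on the heap, so the garbage collector has to keep both alive for as long as both are referenced.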

When the cache was passed to the web service (which was running on another machine), Java couldn't simply pass a reference into the address space of another JVM on another server; it had to serialize the cache, pass it across and deserialize it on the web server. The web service did its thing and constructed the return object, handing it the local handle to the cache. When the web service returned the result record, it serialized everything, including the copy of the new reference data cache, and pushed it back to the calling JVM, which dutifully deserialized it. Don's code then stuffed the reconstructed web result - with its embedded copy of the new reference data cache - into the web service results sub-cache in the master cache.

In other words, every single web service call stuffed yet another copy of the new reference data cache into the master cache, which hung on to it until it could hang on no more.
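
The accumulation can be reproduced in miniature. The sketch below is an illustration only: LeakyResult, LeanResult and LeakDemo are hypothetical names, an in-memory round trip through java.io object streams stands in for the remote call, and the transient variant is just one possible remedy that the story itself does not describe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Mirrors the shape of WebServiceResult: the map travels with every result.
class LeakyResult implements Serializable {
    private static final long serialVersionUID = 1L;
    long id;
    Map<Long, String> refData;

    LeakyResult(long id, Map<Long, String> refData) {
        this.id = id;
        this.refData = refData;
    }
}

// One possible remedy (an assumption, not from the story): a transient
// field is skipped by serialization and comes back as null.
class LeanResult implements Serializable {
    private static final long serialVersionUID = 1L;
    long id;
    transient Map<Long, String> refData;

    LeanResult(long id, Map<Long, String> refData) {
        this.id = id;
        this.refData = refData;
    }
}

class LeakDemo {
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    static int serializedSize(Serializable obj) throws IOException {
        return toBytes(obj).length;
    }

    // Stand-in for "send to the other JVM and get it back".
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(toBytes(obj)))) {
            return (T) in.readObject();
        }
    }

    // Simulate N calls that cache their results, then count how many
    // distinct map instances the result cache is keeping alive.
    static int distinctCopies(int calls, Map<Long, String> refData)
            throws IOException, ClassNotFoundException {
        Map<Long, LeakyResult> resultCache = new HashMap<>();
        for (long id = 1; id <= calls; id++) {
            resultCache.put(id, roundTrip(new LeakyResult(id, refData)));
        }
        Set<Map<Long, String>> distinct = Collections.newSetFromMap(new IdentityHashMap<>());
        for (LeakyResult r : resultCache.values()) {
            distinct.add(r.refData);
        }
        return distinct.size();
    }

    public static void main(String[] args) throws Exception {
        Map<Long, String> refData = new HashMap<>();
        for (long i = 0; i < 10_000; i++) {
            refData.put(i, "entry-" + i);
        }

        System.out.println(distinctCopies(5, refData));   // 5: one full copy per call

        LeanResult lean = roundTrip(new LeanResult(1L, refData));
        System.out.println(lean.refData == null);         // true: the map never travelled

        System.out.println(serializedSize(new LeakyResult(1L, refData)) + " bytes vs "
                + serializedSize(new LeanResult(1L, refData)) + " bytes");
    }
}
```

With the transient field, a caller that needs the reference data after the call would look it up in its own local cache rather than expect it inside the result. Whether that, or dropping the field from the result entirely, is the better fit depends on details the story leaves out.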