Memory use well beyond expected

I’m running a Clojure app in EC2 that uses the peer library to connect to a Datomic transactor also running in EC2 and backed by DynamoDB. The data stored in Datomic is only a couple hundred MB worth, and DynamoDB confirms that.

The app instances had 4GB of RAM; the JVM was given 3GB of that, and Datomic was configured with a 1GB object cache and a 512MB memory index. Once put under load, the instances would quickly report being out of memory to AWS. Connecting to the app EC2 instance with VisualVM, I saw that the memory was being held by threads owned by Datomic, and in huge amounts: well above the 1.5GB I had configured.

I increased the instance size to an 8GB RAM configuration, and bumped the object cache to 1.5GB and the memory index to 1GB. Since then I have never seen it run out of memory, but I need to ask:

  • Why does Datomic seem to use well over its configured amount of memory?
  • The minimum requirement for a stable Datomic peer seems to be 8GB, when I would expect to get along fine on 2 or 4. Surely I’ve set something up wrong to cause that?

The peer library doesn’t provide flow control or limits on your use of Datomic-based resources. If you don’t include throttling or queuing in your application logic, the volume of requests to Datomic can often overwhelm the JVM.

What work is your application performing? Is it possible you have unbounded requests being serviced concurrently?
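As a rough illustration of the throttling mentioned above, here is a minimal sketch that caps how many requests may do Datomic work at the same time. It uses only the JDK; the class name BoundedGate, the limit, and the timeout are hypothetical and not part of the Datomic peer API, and the Supplier stands in for whatever query or transaction the request handler would run.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper: lets at most maxConcurrent callers run work at once. */
final class BoundedGate {
    private final Semaphore permits;
    private final long timeoutMillis;

    BoundedGate(int maxConcurrent, long timeoutMillis) {
        // Fair semaphore, so waiting callers are served in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs work if a permit frees up within the timeout; otherwise rejects the call. */
    <T> T call(Supplier<T> work) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a permit", e);
        }
        if (!acquired) {
            throw new IllegalStateException("too many requests in flight");
        }
        try {
            return work.get(); // the query or transaction would go here
        } finally {
            permits.release(); // always hand the permit back, even on failure
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```

A handler would wrap its Datomic work as gate.call(() -> runQuery(...)), where runQuery is whatever the application already does. Rejecting after a timeout rather than queuing without bound is a design choice: it keeps the number of in-flight result sets on the heap bounded.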

Hi marshall, the application is a backend server with HTTP endpoints. The load so far is only about 10-50 requests per min, and each one typically only does one or two queries/transactions.

Are you saying the configuration options are ignored, that they’re just the starting value?

Datomic will reserve the specified amount of heap for the object cache and the memory-index.

However, your application will (presumably) use Datomic API calls for the work it performs (i.e. query, transact, etc.). The resources used by these calls are separate from the reserved memory used by the object cache and the memory index.
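To make that split concrete, the arithmetic can be sketched as below. This is only a back-of-the-envelope model, assuming both reservations come out of the JVM heap as described above; the class HeapBudget is hypothetical, and real usage also includes non-heap JVM memory that this ignores.

```java
/** Hypothetical helper for estimating heap left over after Datomic's reservations. */
final class HeapBudget {
    static final long MB = 1024L * 1024L;

    /** Heap remaining for query results, transaction data, and application objects. */
    static long headroomBytes(long maxHeapBytes, long objectCacheBytes, long memoryIndexBytes) {
        return maxHeapBytes - objectCacheBytes - memoryIndexBytes;
    }
}
```

With the original setup from this thread (3GB heap, 1GB object cache, 512MB memory index), HeapBudget.headroomBytes(3072 * HeapBudget.MB, 1024 * HeapBudget.MB, 512 * HeapBudget.MB) comes to 1536MB. Everything the API calls and the rest of the application allocate has to fit in that remainder.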

Hitting OOM errors in a peer application is usually the result of hanging onto the head of a lazy seq or performing a very large join.