We are getting sporadic “Insufficient memory to complete operation” errors every day. On CloudWatch we can see the IndexMemMb metric increase steadily until it reaches about 110MB, which is when the errors start; then the memory is released, the metric drops back to 0MB, and everything returns to normal.
Is there any way to have the index reset earlier? Or do you have any other advice for fixing this issue?
The indexMemMb metric represents the total size of the in-memory indexes on a node. The cycle you are seeing is the background indexing jobs copying recent changes from memory to the persistent index; these jobs are triggered automatically once the transactor’s memory accumulates to a certain point. Currently you cannot tune these settings, and we have found them to be sufficient for a broad range of systems and use cases. We should investigate further to see what, in your use case in particular, is putting pressure on memory.
Are you running the production topology or the solo topology? Are you monitoring jvmFreeMB? Do you have any alerts set up in CloudWatch?
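If you don’t have alerts yet, here is a minimal sketch of a CloudWatch alarm on the free-heap metric using the AWS SDK for Java. The namespace, metric name, threshold, and SNS topic ARN below are assumptions for illustration only; check the names and dimensions your system actually publishes in the CloudWatch console and adjust to match.

```java
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
import software.amazon.awssdk.services.cloudwatch.model.ComparisonOperator;
import software.amazon.awssdk.services.cloudwatch.model.PutMetricAlarmRequest;
import software.amazon.awssdk.services.cloudwatch.model.Statistic;

public class JvmFreeAlarmSketch {
    public static void main(String[] args) {
        try (CloudWatchClient cw = CloudWatchClient.create()) {
            cw.putMetricAlarm(PutMetricAlarmRequest.builder()
                .alarmName("datomic-jvm-free-low")
                .namespace("Datomic")          // assumption: use the namespace your metrics appear under
                .metricName("JvmFreeMb")       // assumption: use the metric name as shown in your console
                .statistic(Statistic.MINIMUM)  // alarm on the lowest value seen in the window
                .period(300)                   // 5-minute evaluation window
                .evaluationPeriods(1)
                .threshold(500.0)              // e.g. alarm when free heap dips below ~500MB
                .comparisonOperator(ComparisonOperator.LESS_THAN_THRESHOLD)
                // add .dimensions(...) to match the dimensions your metrics are published with
                // hypothetical SNS topic to notify; replace with your own ARN
                .alarmActions("arn:aws:sns:us-east-1:123456789012:ops-alerts")
                .build());
        }
    }
}
```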
Thanks for the quick response.
We are running the production topology on i3.large instances.
The jvmFreeMB hits lows of around 1.3GB quite often, but we are not actively monitoring it or alerting on it. What’s strange is that the error seems to happen when the indexMemMb is up (though maybe that’s not related).
This is the graph in CloudWatch; we got the error from 11:15AM until after the indexMemMb reset:
@galdolber I’d like to open a case in our support portal. Do you have an e-mail I can use as the point of contact? Once we’ve got more information we can circle back and update this thread with our findings.
To open the case, just private message me your e-mail and I’ll open it for you, or you can e-mail support@datomic.com and a case will be generated.
Closing the loop on this thread for anyone who finds this issue later: we identified a problematic query, and Gal was able to optimize it so that it no longer consumes too much memory.
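The actual query from the case isn’t shown here, but for readers who land on this thread: a common optimization of this kind is reordering the :where clauses so the most selective clause comes first, which keeps intermediate result sets (and memory use) small. Below is a hypothetical sketch using the Datomic peer API in Java; the database URI, attributes, and data are made up, and the same clause-ordering principle applies whichever API you query through.

```java
import datomic.Connection;
import datomic.Database;
import datomic.Peer;
import java.util.Collection;
import java.util.List;

public class QueryOrderingSketch {
    public static void main(String[] args) {
        // Hypothetical database URI and schema; not the query from this case.
        Connection conn = Peer.connect("datomic:dev://localhost:4334/example");
        Database db = conn.db();

        // Unselective clause first: binds every order in the database before
        // narrowing by customer, so intermediate results (and memory) can blow up.
        Collection<List<Object>> slow = Peer.q(
            "[:find ?order ?total" +
            " :in $ ?email" +
            " :where [?order :order/total ?total]" +
            "        [?order :order/customer ?c]" +
            "        [?c :customer/email ?email]]",
            db, "jane@example.com");

        // Most selective clause first: start from the single matching customer
        // and walk out to their orders, keeping intermediate results small.
        Collection<List<Object>> fast = Peer.q(
            "[:find ?order ?total" +
            " :in $ ?email" +
            " :where [?c :customer/email ?email]" +
            "        [?order :order/customer ?c]" +
            "        [?order :order/total ?total]]",
            db, "jane@example.com");

        System.out.println(slow.size() + " rows / " + fast.size() + " rows");
    }
}
```

Both queries return the same rows; the difference is only how large the intermediate joins get while they run.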