I’m working on a query pattern where I pass a large number of entity IDs into the query as a collection binding. Here’s a simplified version of the query:
When the number of IDs is relatively small (e.g., up to 100,000), the query executes in a few seconds. However, once the input grows to around 1,000,000 IDs, execution slows down drastically and the query eventually fails with a Java heap space error (OutOfMemoryError), which I assume is caused by memory pressure during query execution.
I’m trying to understand: Is there a known upper limit or best practice for using large collection bindings like [?e ...]?
Any advice or experience reports would be appreciated. Thank you!
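By contrast, when the IDs are bound inside the query itself by a clause such as [?_ :someAttribute ?MaterialImpl1], so that ?MaterialImpl1 ends up containing exactly the same set of entity IDs as in the problematic case, the query executes much faster and uses far less memory.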
The issue is that it’s not always possible to construct the query this way, since the set of entity IDs bound to ?MaterialImpl1 may come from multiple places in the application logic, not just from a single [?_ :someAttribute ?MaterialImpl1] clause. In those cases, we are forced to pass a large collection as an input binding, which causes the performance degradation and memory pressure described in this thread.
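For illustration only, one possible workaround for the situation above is to merge the IDs from the different sources, remove duplicates, and run the query once per smaller batch instead of once with the whole collection. The sketch below is language-agnostic and written in Python for brevity. `run_query`, `fake_query`, and the batch size of 10,000 are hypothetical stand-ins, not part of any Datomic API, and nothing in this thread measured whether batching helps. It also assumes the result for each ID is independent of the others, so that the union of the per-batch results equals the result of a single large query; that does not hold for aggregates.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Set


def batches(ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive lists of at most `size` IDs."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def query_in_batches(
    sources: Iterable[Iterable[int]],
    run_query: Callable[[List[int]], Iterable[int]],
    batch_size: int = 10_000,  # assumed value, tune by measurement
) -> Set[int]:
    """Merge IDs from several sources and query them batch by batch."""
    # Drop duplicates across sources while keeping first-seen order.
    merged = list(dict.fromkeys(i for src in sources for i in src))
    results: Set[int] = set()
    for chunk in batches(merged, batch_size):
        # run_query is a hypothetical stand-in for the real query call,
        # invoked with `chunk` as the collection input.
        results |= set(run_query(chunk))
    return results


def fake_query(ids: List[int]) -> Set[int]:
    """Hypothetical query used only for this demo: keeps even IDs."""
    return {i for i in ids if i % 2 == 0}


# Two overlapping ID sources, as if collected in different parts of the app.
sources = [range(0, 10), range(5, 15)]
print(sorted(query_in_batches(sources, fake_query, batch_size=4)))
# prints [0, 2, 4, 6, 8, 10, 12, 14]
```

Keeping each batch small bounds the size of the collection handed to any single query call, at the cost of more round trips. The right batch size, if batching is worthwhile at all, would have to be found by measuring outside the debugger.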
@jaret I’d really appreciate it if you could confirm whether an efficient solution for this problem exists.
P.S. It’s quite surprising that a query operating over the exact same data set can have such a significant difference in performance.
I’m sorry, my mistake. I ran my tests in debug mode, and a query with about one million input elements took much longer there than the same query run normally, without the debugger attached. Thank you for your attention.
Yes, that’s correct. I was testing in debug mode in IntelliJ and checked both scenarios mentioned in my post. The one where the query took around a million IDs as input was indeed significantly slower. However, when I prepared a clean, minimal example in the REPL to share with you, everything ran fast.