Are there any practical ways to impose resource limits on individual queries, without resorting to dedicating a JVM to the query? In general I’ve found that a poorly constructed query can very easily take out an entire peer, and the chaos that follows makes it difficult to track down the source of the issue. The scenario usually goes something like this:
- A misbehaving query is executed on a peer. It attempts to do something that quickly exhausts the available heap, but also produces copious amounts of garbage that can be collected.
- The JVM goes into a tailspin as more and more CPU cycles are consumed by the garbage collector. This isn’t always a simple “out of memory” condition: the working set of the query might be quite small, but it is doing something exponential in terms of the data it touches, producing garbage as fast as it can be collected.
- The CPU load causes other things to time out and connections to get dropped, and “Transactor Unavailable” exceptions start popping up all over the place, none of them really related to the bad query.
Are there any good “on JVM” ways of containing the impact that a single query can have? A timeout on its own doesn’t really help here, since the one query that “brings down” the JVM tends to make everything else take forever as well. Given that all the “work” that queries do occurs in shared, Datomic-controlled thread pools, I’ve found it generally difficult to associate a given poor behavior with the query that caused it.
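For context, the GC tailspin itself can at least be detected from inside the JVM, even if that does nothing to contain the query or identify it. Below is a minimal sketch using the standard `java.lang.management` beans. The class name `GcOverheadSampler` and the idea of polling it on a timer are my own assumptions for illustration, not anything Datomic provides.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: estimates how much wall-clock time went to GC between samples. */
public class GcOverheadSampler {
    private long lastGcMillis = totalGcMillis();
    private long lastWallNanos = System.nanoTime();

    /** Sums the accumulated collection time (ms) reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /**
     * Returns GC time divided by wall time since the previous call.
     * The value is not clamped: with several collectors (or concurrent phases)
     * the summed GC time can exceed the elapsed wall time.
     */
    public synchronized double sample() {
        long gcNow = totalGcMillis();
        long wallNow = System.nanoTime();
        double gcMs = gcNow - lastGcMillis;
        double wallMs = (wallNow - lastWallNanos) / 1_000_000.0;
        lastGcMillis = gcNow;
        lastWallNanos = wallNow;
        return wallMs <= 0 ? 0.0 : gcMs / wallMs;
    }
}
```

Calling `sample()` every few seconds from a background thread and logging whenever the ratio stays above some chosen threshold would at least timestamp the start of the tailspin, which narrows down which queries were in flight. It still doesn’t tell me which query is responsible, and the threshold would have to be tuned per collector.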