A batch processing job ran into an issue where it overwhelmed its query group. It was running large queries, with an occasional transact, on too many threads. The Datomic instance was pegged at 100% CPU usage, so it’s pretty clear what happened. Oddly, and repeatably, the queries never time out, but the transacts do eventually fail with a :cognitect.anomalies/unavailable.
We ultimately solved the problem by reducing the number of threads our batch job runs. However, we’d like to know if we can also avoid the transact problem by issuing the transacts directly to the primary compute group rather than through the query group (which I assume just forwards them anyway)? The primary compute group is under practically no load. However, we’re not sure whether the Datomic implementation would cause any read-after-write consistency errors, as we often issue a query directly after the transact to fetch back the same information we just transacted.
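
For context, the thread reduction mentioned above amounts to capping how many units of work are in flight at once. A minimal sketch of that kind of throttling, assuming a fixed-size thread pool on the JVM: `BoundedBatch`, `processItem`, `runBatch`, and the cap of 4 are hypothetical names and values for illustration, not our actual job, and `processItem` only simulates the query and transact work rather than calling Datomic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedBatch {
    // Hypothetical cap on concurrent workers; tune to what the query group can absorb.
    static final int MAX_WORKERS = 4;

    // Bookkeeping so the sketch can report the peak concurrency it observed.
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();

    // Stand-in for one unit of work (a large query plus an occasional transact).
    static int processItem(int item) throws InterruptedException {
        int now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(5); // simulate time spent waiting on the database
            return item * 2;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    // Runs every item, but never more than maxWorkers at the same time.
    static List<Integer> runBatch(List<Integer> items, int maxWorkers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxWorkers);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int item : items) {
                tasks.add(() -> processItem(item));
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and returns futures in input order.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            items.add(i);
        }
        List<Integer> results = runBatch(items, MAX_WORKERS);
        System.out.println("processed " + results.size()
                + " items, peak concurrency " + peak.get());
    }
}
```

The pool size is the only knob here: the batch still processes every item, it just queues the excess instead of sending it all to the query group at once.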