I have a db with 250 attributes. The whole thing is a couple hundred MB. Most queries and transactions are generally performant except when they’re not - at all.
For example, one entity type has a unique string id field, say :myns/-id. There are only about 30 of these entities in the db, yet a simple query based on that id field takes 250 milliseconds to return.
I have tried:
adding :db/index true to that attribute
running the query a few times to make sure the data is cached by the peer library
local transactor (MacBook Pro, light load)
AWS transactor backed by DynamoDB (t2.medium instance)
local peer library
AWS peer library with m2.large instance, 1.5G object cache, and 1G memory index
This is not the only slow query I am seeing, but it’s the hardest to explain.
What factors affect query performance that I haven’t mentioned? Why could a simple indexed one-attribute lookup take 250ms?
From our Slack conversation, I know that your slow queries are all constant string lookups. Try using ground. If ground fixes it, I would like to better understand why Datomic needs this hint.
Also, if your id strings are random, you may be hitting pathological index access patterns that can happen in On-Prem. Try using squuids (semi-sequential UUIDs). This could also explain your high memory usage in the other post: random access of the id index may be forcing the entire index into object cache, as described here: http://www.dustingetz.com/:datomic-performance-gaare/
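To make the squuid suggestion concrete, here is a minimal Python sketch of the idea, assuming the layout Datomic documents for its own squuid function (the most significant 32 bits hold seconds since the epoch, the rest stays random). The function name and the `now` parameter are made up for illustration; in a real peer you would call Datomic's built-in squuid rather than roll your own.

```python
import time
import uuid


def squuid(now=None):
    """Build a semi-sequential UUID: time-based prefix, random remainder.

    `now` (seconds since the epoch) is a hypothetical parameter added so the
    sketch can be exercised deterministically; it defaults to the wall clock.
    """
    secs = int(time.time() if now is None else now) & 0xFFFFFFFF
    rand = uuid.uuid4().int
    # Keep the low 96 bits of a random v4 UUID (this preserves the version
    # and variant fields) and overwrite only the top 32 bits with the time.
    low96 = rand & ((1 << 96) - 1)
    return uuid.UUID(int=(secs << 96) | low96)


if __name__ == "__main__":
    # Ids minted close together share a prefix, so they land near each
    # other in a sorted index instead of being scattered across it.
    for _ in range(3):
        print(squuid())
```

Ids generated around the same time sort next to each other, so index access for recent entities tends to touch a small neighbourhood of the index rather than random segments.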
Speculation: the docs suggest that Datomic query has a “prepping” phase which is memoized. I speculate that it reads the datalog EDN and compiles any Clojure forms, etc. (just a guess!), which would be why we want to reuse the same static queries rather than generate them dynamically.
This is a dynamically constructed new query for each userid, so it would be better to pass the userid through the :in clause. I believe this is, in effect, exactly what ground accomplishes.
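A toy Python model of the caching argument, to show why the query shape could matter. This is not Datomic's implementation: `prepare` is a hypothetical stand-in for whatever prep work is memoized, and the only assumption is that the memoization is keyed on the query text. The two query strings show the dynamic shape and the :in shape.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def prepare(query_text):
    """Hypothetical stand-in for parsing/compiling a query.

    Memoized on the query text, so identical text is only prepared once.
    """
    return tuple(query_text.split())


def run_dynamic(user_ids):
    # The id is spliced into the query, so every id yields a new query
    # text and therefore a new cache key: prep runs once per id.
    for uid in user_ids:
        prepare('[:find ?e :where [?e :myns/-id "%s"]]' % uid)


def run_parameterized(user_ids):
    # One static query text; the id would travel separately as an input
    # bound through the :in clause, so prep runs once in total.
    query = "[:find ?e :in $ ?id :where [?e :myns/-id ?id]]"
    for _uid in user_ids:
        prepare(query)


if __name__ == "__main__":
    ids = ["a", "b", "c"]
    run_dynamic(ids)
    print("after dynamic:", prepare.cache_info())
    run_parameterized(ids)
    print("after parameterized:", prepare.cache_info())
```

Under this model the dynamic version pays the prep cost on every distinct id, while the parameterized version pays it once and hits the cache afterwards. Whether this is what actually accounts for the 250ms is still a guess.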