Peer Query Performance

We are using Datomic to store time-series data, which we ingest hourly.
We have around ~40 billion datoms.
When we query, we try to load the data for the last ~5 days. We have defined indexes for the data, and we use date-range-based queries.
At times, some queries take up to ~1 minute. The problem is that it's very inconsistent.
Sadly, we realized Datomic is not optimized for such a huge amount of data.
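For reference, a date-range query like the one described can be served directly from the AVET index with `d/index-range`, which touches only the index segments covering the range. This is a minimal sketch using the Peer API; the attribute name `:measurement/timestamp` is hypothetical and assumes a `:db.type/instant` attribute with `:db/index true`:

```clojure
(require '[datomic.api :as d])

;; Sketch only: :measurement/timestamp is a made-up attribute, assumed
;; to be :db.type/instant with :db/index true so it appears in AVET.
(defn datoms-in-range
  "Lazily walk the AVET index for attr between start and end
  (java.util.Date). Only the index segments covering the range are
  fetched, rather than scanning the whole attribute."
  [db attr start end]
  (d/index-range db attr start end))

(comment
  ;; Usage: count datoms from the last ~5 days.
  (let [db    (d/db conn)
        now   (java.util.Date.)
        start (java.util.Date. (- (.getTime now)
                                  (* 5 24 60 60 1000)))]
    (count (datoms-in-range db :measurement/timestamp start now))))
```

Whether this helps depends on whether the existing queries already pivot off an indexed instant attribute; a Datalog query that filters on time with a predicate instead of an index range will scan far more segments.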

I read this article, which gives some information about how data is stored in the Peer cache:
http://www.dustingetz.com/:datomic-performance-gaare/

I had a couple of questions:

  1. Is there any way we can identify how many index levels we need to traverse to answer a query, and consequently the number of trips to storage?
  2. Is there any direction we can work in to optimize this?
  3. Why is there inconsistency in query time? Sometimes it's fine; sometimes it just takes too long.

Any help will be appreciated.
Thanks

“Time-series data” is called out in various places as one of the things that Datomic is not well suited for. You might want to consider a store that’s more geared to that use case, and potentially use Datomic for ‘projections’ of that data, etc., depending on your needs.

40B datoms doesn’t sound that bad. I believe Nubank and HCA both roll over to a new database every quarter or so when they reach the limit, and there is a crossover technique; see this talk for the details: http://2018.clojure-conj.org/igor-ges/

Thanks Dustin. The video was helpful; it mentions sharding and rollover. I had watched the Nubank presentation on InfoQ, and they also mention sharding.
Is there any way we can identify how many trips are made to storage to complete a query?
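As far as I know, the closest thing to counting storage trips on a peer is the metrics the peer reports through the `datomic.metricsCallback` system property (set before peer startup, e.g. `-Ddatomic.metricsCallback=my.ns/log-metrics`). A rough sketch, assuming the metric keys from the Datomic monitoring docs (verify the exact key set against your version; `my.ns` is a placeholder namespace):

```clojure
(ns my.ns)

;; Sketch only: the peer calls this periodically with a map of metric
;; name -> value. Keys like :StorageGetMsec are taken from the Datomic
;; monitoring documentation; confirm them for your Datomic version.
(defn log-metrics
  "Logs storage and object-cache metrics, which together indicate how
  often the peer had to go to storage during the reporting window."
  [metrics]
  (doseq [k [:StorageGetMsec :StorageGetBytes :ObjectCache]]
    (when-some [v (get metrics k)]
      (println k v))))
```

Watching the storage-get counts and the object-cache hit ratio while re-running a slow query should also shed light on the inconsistency: the same query is cheap when its segments are already cached and expensive when they have to be fetched from storage.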
Thanks