How to limit the size of the result for analytics query

We are using datomic on-prem with configuration for analytics, which we’ve started using successfully.

In our app code, when we need to reference the time when an entity was modified, we query the history db and join to the transaction entity and work it out from there.

The warehouse guys now want to use a datetime of some kind to limit the result set from presto, so that they are only pulling data into the warehouse which has not been pulled before. Typically this is done by finding the highest datetime in the current warehouse, and then reference datetime higher than that value in the SQL query.

  • How would this problem be solved using the analytics SQL <-> datomic interface?
  • Can a join to the transaction entity be accomplished with analytics and SQL?

How can I import data via the analytics interface into a warehouse in an efficient way?

Hi @pieterbreed

Regarding your question on time:

Without having an additional modeled time in your schema, I don’t see a straightforward way to accomplish your goal. However, I think if we exposed datomic’s t to analytics this would be the optimal solution. As such, I’ve added a feature request for that exact feature.

How can I import data via the analytics interface into a warehouse in an efficient way?

I am not sure what you are after here, could you elaborate on your needs beyond what is provided with analytics? i.e. You can consume the data by writing arbitrary SQL queries.

Hi @jaret, thank you for the engagement.

From a purely selfish perspective, it would be great if every “row” in the auto-generated analytics table could have a column for last/highest t (tx) that affected the entity that row is generated from.

(Some other increasing value would also work, like the #inst of the specific datomic transaction)

With this in place, a warehouse could select max(tx) in it’s internal datastore, and select * from <table> where tx > <my-max-tx>. This would effectively limit the result set to only include data not yet in the warehouse.