We have written an incremental data pipeline on top of Datomic. The time model is a perfect fit for this: if you declare in code which tables a transformation reads and the links between those tables, you can build a system that automatically knows which entities need an update when a source table changes.
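For context, the change-detection part looks roughly like the sketch below. It uses the transaction log from the Peer Java API (`conn.log()` / `txRange`); keeping a `lastProcessedT` per transformation and the class/method names are just illustrations of our setup, not Datomic built-ins.

```java
import datomic.Connection;
import datomic.Datom;
import datomic.Log;
import datomic.Util;

import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ChangedEntities {

    // Collect the entity ids touched by any transaction since a stored basis point.
    // lastProcessedT is a t/tx value we persist per transformation (our convention).
    static Set<Object> entitiesChangedSince(Connection conn, Object lastProcessedT) {
        Set<Object> changed = new HashSet<>();
        Log log = conn.log();
        // txRange(startT, endT): a null endT means "through the most recent transaction"
        for (Object txObj : log.txRange(lastProcessedT, null)) {
            Map tx = (Map) txObj;
            Iterable datoms = (Iterable) tx.get(Util.read(":data"));
            for (Object d : datoms) {
                changed.add(((Datom) d).e());
            }
        }
        return changed;
    }
}
```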
That said, we noticed that in the target tables it can happen that:
- the object itself is updated
- but the indices are not
In other words: if you do a "get all" on a table, there is an entity with name "Mickey Mouse". If you filter on name == "Mickey Mouse", you get nothing until the indices have been updated. This also matches the Datomic docs: "Updates the datom trees only occasionally, via background indexing jobs."
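For concreteness, the lookup that comes back empty looks roughly like this (sketch; `:person/name` is a stand-in for the real attribute in our schema):

```java
import datomic.Database;
import datomic.Peer;

import java.util.Collection;

public class NameLookup {

    // "Get all" style scan: every entity that has the (example) :person/name attribute.
    static Collection allNamed(Database db) {
        return Peer.query("[:find ?e ?name :where [?e :person/name ?name]]", db);
    }

    // Filtered lookup on the same attribute; this is the call that came back empty
    // for "Mickey Mouse" even though the scan above showed the entity.
    static Collection byName(Database db, String name) {
        return Peer.query("[:find ?e :in $ ?name :where [?e :person/name ?name]]",
                          db, name);
    }
}
```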
However, is there a way to force the index to be up to date after a batch job? If you can't rely on the index, won't people introduce a lot of hidden bugs? For example: I fetch all entities of a table that changed after a certain point in time and join them with another table, then I store the new timestamp and repeat the procedure on the next run. What if the indices of that second table were not up to date at join time? Since I'm working incrementally, I would have no way to know.
Or is my understanding of this incorrect?
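For reference, the kind of call sequence I was hoping for after a batch job is roughly the following, based on `requestIndex` and `syncIndex` from the Peer Java API (untested sketch; I may be misreading what these calls actually guarantee):

```java
import datomic.Connection;
import datomic.Database;

import java.util.concurrent.TimeUnit;

public class ForceIndex {

    // After a batch job: ask the transactor to start indexing now, then wait for a
    // db value guaranteed to be indexed through basis t. The timeout is arbitrary;
    // pick one that fits the batch window.
    static Database indexedThrough(Connection conn, long t) throws Exception {
        conn.requestIndex();                       // kick off a background indexing job
        return conn.syncIndex(t).get(10, TimeUnit.MINUTES);
    }
}
```

Here `t` would come from `basisT()` on the db value returned after the batch's last transaction. Would something like this be the right approach, or is it unnecessary?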