Microsoft SQL Server and Oracle Database have query result set change notifications where the caller can register the query and receive notifications any time the results of the query change. I actually discovered these features because I have a problem I’m trying to solve and in my imagination I thought it would be good to be notified when the results of a query change.
Does Datomic have an ability to do anything like this?
The motivation for such a feature (for me) is as follows: I have a collaborative problem solving environment for biological modeling where there is a desktop application that loads data from the cloud. Say multiple users are editing the same biological model and one user changes the model and commits their changes to the cloud. I want the other users to be notified of the changes to the model - and what exactly is done with those changes is not that important for this post but they may be loaded automatically, or shown as a diff to the user, or the user may opt to load or ignore, whatever.
I have no idea how to efficiently do this, though I have no trouble coming up with ways to do it. One way I know how to do it is to rerun the queries to get all the model data for every open model for each user each time a user saves. I can trim down that list based on what was saved, but ultimately, I need to rerun the queries.
If Oracle and SQL Server are doing just that, then I suppose I can do it myself. But I’m imagining there are more efficient ways to detect if a query result has changed than to rerun it every time - maybe I’m wrong.
Does Datomic support anything that helps in cases like this?
We do not have a first class feature here, but you could use the tx-report-queue or poll the transaction log to implement one in user space.
The general idea is you can monitor the tx-report-queue (see docs and api) or the transaction log (see docs) and then you will need to leverage your domain knowledge to know what query you need to “re-run.”
Working in your favor here is that Datomic is immutable and you are running a desktop application where your queries should be local (cached? you could add valcache if you want to make the desktop lightening fast).
I was chatting about this with the team and one team member recommended this paper on "Automatic Incremental View Maintenance for Rich Query Languages". If you’re interested in fully sinking your teeth into this problem (full disclosure, I have not read the paper ).
I have added a link to this forum post to our internal stories around features in this area and if we make a first class feature in this space I will be sure to update you, but it’s in the backlog/wishlist.
With Datomic, as Jaret already mentioned it should be easy to do something similar. Any transaction will tell you which datoms actually changed, meaning that later you can use this information to figure out which of your queries could need updating.
I did know about transaction logs and the report queue in Datomic, but I didn’t consider monitoring this as a way to detect - through my own logic - that a query result has changed. I can do that and it would satisfy my need in an efficient way. I had just filed transaction logs away in my memory as a database robustness and failsafe feature that also might add some performance benefits.
I’m exited to read the paper on automatic incremental view maintenance.
re-posh and DatSync (linked to from re-posh) are exactly what I need for part of the solution… if they were for the JVM. They don’t address the main concern about detecting query changes, but indeed I do also have the problem of syncing application state and database state.
Just to reiterate… I have a desktop application that connects to a Datomic database in the cloud through HTTP using AWS lambda fns (through Pedestal on the backend). So the application frontend does not have the database locally, nor does it have the ultra fast access to the database that a local network connection or SSD would give. In fact, it may be as slow as whatever the Internet connection is.
Either way… the ideas you two gave me help give me some directions to pursue. Any solution I implement would almost certainly be generalized. I cannot imagine using my domain specific knowledge in some specific way that gives a non-general solution. We have a lot of different queries and structures coming back from those queries. Some are quite complex.