Traditional transactions (commit multiple changes at once)

tar · July 27, 2018, 3:28pm

How can changes to be made to a Datomic database be grouped in a way that they either all succeed or they all fail? Think of transactions in SQL, via BEGIN and COMMIT/ROLLBACK.

In a non-trivial system there are some high-level actions that require several small overlapping changes to the data. Having a subset of the changes go through and the rest fail is important to avoid.

I tried forking the db using d/with, then accumulating tx-data with the idea that I could transact them all at some later time. That strategy breaks if the tx-data has conflicting datoms, or involves retractions or other database functions. I also thought of just using d/with repeatedly but accumulating the :tx-data lists from the response of d/with, in order to avoid issues with database functions, but that of course means that the benefit of database functions (acting atomically on the latest db value) is lost, and it still doesn’t solve the issue of conflicting datoms.

Combining changes manually into a single d/transact call is not a valid solution because it implies that I need to write two copies of the same logic - one that is a function of entity->tx-data and one that is a function of tx-data->tx-data.

dustingetz · July 29, 2018, 7:02pm

Does this accomplish what you want without intermediate effects?

(defn step-1 [$]
  ; Each step can query as if the transactions went through
  ; each step must return a dbval for next guy
  [$ [[:db/add "a" :todo/title "buy groceries"]]])

(defn step-2 [$]
  [$ [[:db/add "b" :todo/title "feed baby"]]])

(defn step-3 [$]
  [$ [[:db/add "a" :todo/completed true]]])


(comment
  (def $ (d/db (d/connect "datomic:free://datomic:4334/~dustin.getz")))
  (let [[$ tx] (step-1 $)
        [$ tx'] (step-2 (:db-after (d/with $ tx)))
        [$ tx''] (step-3 (:db-after (d/with $ tx')))]
    [$ (concat tx tx' tx'')])

  #_[datomic.db.Db @e8096e87
     ([:db/add "a" :todo/title "buy groceries"]
       [:db/add "b" :todo/title "feed baby"]
       [:db/add "a" :todo/completed true])]
  )

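(defn comp-step [f g]
  (fn [$]
    ; Run g, then run f against a speculative db that includes g's tx.
    ; Return the original $ so the combined tx is always relative to it;
    ; returning f's db here would apply g's tx twice when nesting.
    (let [[_ tx] (g $)
          [_ tx'] (f (:db-after (d/with $ tx)))]
      [$ (concat tx tx')])))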

(comment

  ((comp-step step-2 step-1) $)
  #_[datomic.db.Db @de8957c1
     ([:db/add "a" :todo/title "buy groceries"]
       [:db/add "b" :todo/title "feed baby"])]

  ((comp-step step-3 (comp-step step-2 step-1)) $)
  #_[datomic.db.Db @2d470e30
     ([:db/add "a" :todo/title "buy groceries"]
       [:db/add "b" :todo/title "feed baby"]
       [:db/add "a" :todo/completed true])]

  ((reduce comp-step [step-3 step-2 step-1]) $)
  #_[datomic.db.Db @75af774c
     ([:db/add "a" :todo/title "buy groceries"]
       [:db/add "b" :todo/title "feed baby"]
       [:db/add "a" :todo/completed true])]

  )

dustingetz · July 29, 2018, 7:09pm

As you mention, like git branch merges, you are vulnerable to “merge conflicts”, which are essential complexity and are your problem to resolve. But I don’t know if we are talking about the same thing, because in my version :db.fn/retractEntity and transaction functions work (they will all evaluate again atomically in the transactor). What type of merge conflicts are you seeing? I don’t think duplicate statements that assert the same thing are a problem, and upsert helps too. You may also need to deal with tempid reversing, but that can be handled inside the comp-step abstraction, I think.

tar · July 29, 2018, 7:54pm

Hi Dustin, thanks for the idea. You seem to have a good handle on the problem.

I wrote something similar on Thu/Fri and found it worked for simple cases but not if I caused “conflicting” datoms. I was/am thinking of the case where maybe there’s a status field on an entity that gets set to state A but some other logic overrides it to state B, and because those two things happen in the same transaction they fail because of a perceived conflict.

I was also worried that tempid->entity-id replacements would be impossible to reverse if they were hidden inside a database function, and I wasn’t happy writing a critical layer of software without making it handle all cases. Your use of the term “merge conflict” is convincing me that the number of problematic cases is actually smaller than I had feared.

To Datomic authors/maintainers: why not chime in? I don’t accept that writing transaction merging is a developer’s responsibility. Having proper transactions, like in SQL, is a requirement for me to promote Datomic as a real-world tool in the future.

dustingetz · July 29, 2018, 11:37pm

Can you demonstrate why the system must be factored like this in the first place? I would try to refactor it to this: (concat (step-1 $) (step-2 $) (step-3 $)), and if this can’t be done I am pretty keen to understand why not.

tar · July 30, 2018, 12:39pm

A simple concat like that doesn’t play well with the map form (where concat would need to be replaced with a deep merge). Also it could fail at an inconvenient time. One reason people use Datomic is because it claims to have a solid foundation, and I wouldn’t want to layer a “works most of the time” solution on top. Yes, it’ll be fine for a few non-overlapping datoms in list form, but I expect that most people enjoy throwing nested maps at d/transact and never needing to use the list form.

As for a concrete example, it’s not relevant because I’m not trying to implement a specific transaction, I’m trying to make a system that allows for code reuse and doesn’t shoot my successor in the foot. I am asking for composable changes, that’s it, that’s all.

I understand that the transactions I’m asking for aren’t provided by Cognitect. I’ll implement them myself, but with the feeling of doing the equivalent of writing homemade encryption.

That being said, I don’t think Datomic should complain about “conflicting” datoms in the first place, which might be the real issue here; it should just do the reasonable thing and transact them anyhow, using the order of datoms passed to d/transact as the tiebreaker.

benfle · August 8, 2018, 4:42pm

Yeah, it might be tricky to properly merge your transactions. I don’t think I would try to do that, although it should be possible.

Could you assign a unique identifier to all the Datomic transactions in the same “group” and have a function to revert all the txs in a given group? It won’t be atomic but I have used this method in the past when dealing with logical txs made of multiple actual Datomic txs.
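A minimal sketch of the "revert" half of this idea, in plain Clojure with no Datomic dependency. It assumes the datoms of one transaction in the group have already been collected (for example from the transaction log) as [e a v added?] vectors, with the datoms describing the transaction entity itself filtered out. The name invert-datoms and the vector shape are illustrative, not part of any Datomic API.

```clojure
(defn invert-datoms
  "Given the datoms of ONE transaction as [e a v added?] vectors,
  return tx-data that undoes them: every assertion becomes a
  retraction and every retraction becomes an assertion."
  [datoms]
  (mapv (fn [[e a v added?]]
          [(if added? :db/retract :db/add) e a v])
        datoms))

;; Hypothetical datoms from a transaction that set a title and
;; flipped :todo/completed from false to true.
(invert-datoms [[17 :todo/title "buy groceries" true]
                [17 :todo/completed false false]
                [17 :todo/completed true true]])
;; => [[:db/retract 17 :todo/title "buy groceries"]
;;     [:db/add 17 :todo/completed false]
;;     [:db/retract 17 :todo/completed true]]
```

Reverting a whole group would then mean transacting the inverse of each member transaction, newest first. Inverting them one at a time avoids putting an assertion and a retraction of the same datom into a single transaction.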

avodonosov · August 8, 2018, 10:38pm

@tar, transaction functions in Datomic are the “traditional transactions” you are looking for. They let you group several operations, such as read, modify, and write, into a single transaction. See the Datomic documentation on database functions.

Why don’t you use them?

“in order to avoid issues with database functions”

What issues?

tar · August 9, 2018, 12:56pm

benfle, that could work, and it’s pretty low effort. Thanks.

tar · August 9, 2018, 1:02pm

avodonosov, I was describing my attempt to merge multiple transactions before calling d/transact. Because function calls are opaque, it wouldn’t be possible to walk them and replace tempids as necessary. That being said, you have a good point, why not use db functions more, to the point where I have a large chunk of code in Datomic? No real reason beyond source control maybe, and transactor performance. But if I made the functions quick and pure, maybe it could work! Thanks, I’m going to see what I can do with that idea.

dustingetz · August 9, 2018, 3:03pm

“I am asking for composable changes, that’s it, that’s all.”

I think you’ve asked for composable effects, which is the root of the cognitive dissonance here.

Transaction map sugar works with concat:

(concat
  [{:db/id "a"
    :order/lineItems [{:lineItem/product "chocolate"
                       :lineItem/quantity 1}]}]
  [[:db/add "a" :order/lineItems "b"]
   [:db/add "b" :lineItem/product "whisky"]
   [:db/add "b" :lineItem/quantity 2]])

(The seattle schema similarly mixes vector and map form)

If the application code is generating collisions you can compensate here, for example transforming the transaction value to vector syntax and introducing a notion of statement ordering as you suggested above.
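One way to read "statement ordering" is last-write-wins. The sketch below is plain Clojure and makes several assumptions that are not from this thread: every statement is already in list (vector) form, every attribute involved is cardinality-one, and the same entity always appears under the same id or tempid. The name last-write-wins is hypothetical.

```clojure
(defn last-write-wins
  "Given tx statements in list form, keep only the final :db/add for
  each [entity attribute] pair. Statements that are not :db/add pass
  through untouched, and surviving statements keep their order.
  Assumes every attribute is cardinality-one."
  [statements]
  (let [add?   (fn [[op]] (= op :db/add))
        ;; index of the last :db/add seen for each [e a] pair
        winner (into {}
                     (keep-indexed (fn [i [_ e a :as stmt]]
                                     (when (add? stmt) [[e a] i])))
                     statements)]
    (vec (keep-indexed (fn [i [_ e a :as stmt]]
                         (when (or (not (add? stmt))
                                   (= i (winner [e a])))
                           stmt))
                       statements))))

(last-write-wins [[:db/add "a" :order/status :status/pending]
                  [:db/add "b" :lineItem/quantity 2]
                  [:db/add "a" :order/status :status/shipped]])
;; => [[:db/add "b" :lineItem/quantity 2]
;;     [:db/add "a" :order/status :status/shipped]]
```

Cardinality-many attributes would need to be excluded from this filter, since several values for the same [entity attribute] pair are legitimate there.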

tomc · October 24, 2018, 4:48pm

I dealt with the same issue. Dustin, your suggestion regarding introducing a notion of statement ordering is more or less what I ended up with.
I’m posting this here in the hopes that it’ll help out @tar and that someone can look at the solution and let me know whether it seems reasonable. Like tar said, I feel a bit like I’m writing homemade encryption.

Here’s my use case and the issues I ran into:
My server accepts “commands” that look like re-frame events. For example:

[:artist/create]
[:artist/set-name artist-id "new name"]

Each command is mapped to a function that produces a sequence of facts, as could be passed to d/transact:

(def facts (command/get-facts [:artist/create]))
@(d/transact datomic-conn facts)

This works fine when handling one command at a time, but it is a requirement of my UI that there is an explicit save step and that all commands that happen between saves are saved at once and succeed or fail as a unit. Handling batches of commands reveals the conflicting datoms issue that has already been discussed, and another problem I found a bit more vexing: tempids need to be consistent across all the facts for a command batch. If we create an artist with :artist/create, then set that new artist’s name in the same transaction, the :artist/set-name facts need to use the right tempid for the artist.

Ignoring the application-side code that handles making the fact-sequence-per-command, my solution was to add a transactor function that accepts any number of fact-sequences and converts them to a single fact sequence.

I pasted our datomic util lib into a gist (not putting it in a public repo just yet):

https://gist.github.com/tomconnors/4cdb5f8142e117fa2da8905dbbddf457

datomic_util.cljc

(ns kc.datomic
  "Datomic utility functions

  Usage Notes:

  Some functions in this namespace take sequences of facts and return them modified in some way. Some up-front modifications are useful for those functions, like replacing all map-form facts with vector-form facts. In order to avoid doing these modifications repeatedly to the same set of facts (which would be harmless but wasteful), two versions of these functions exist: a \"safe\" version that does those up-front modifications, and an \"unsafe\" version that expects those modifications to already have been performed. The unsafe versions are named like the safe ones, but with a single quote appended.

  TODO:
  - consider implementing all fns that branch based on operation as multimethods
    These fns mostly support :db/add, :db/retract :db.fn/retractEntity, :db.fn/cas,

(This file has been truncated; see the gist linked above for the full source.)

A lot of this code isn’t relevant to what we’re talking about. The part that matters is the last fn, transact-many-validated. I installed that as a transactor fn and do all my command-related transactions through that, like (d/transact datomic-conn [[:kc.tx-fns/transact-many-validated sequence-of-fact-sequences]]). The heart of the implementation is in -simplify-fact-seqs.

You could make your own transaction fn that doesn’t have our special FK and retraction logic - it would look like:

(defn transact-many-validated
  "Given any number of fact sequences, reduce them to a single fact sequence without any conflicting datoms."
  [db fact-seqs]
  (let [fact-seqs (map clean-facts fact-seqs)
        fact-seqs (simplify-fact-seqs db fact-seqs)
        facts     (merge-facts db fact-seqs)
        facts     (remove-tempid-retractions facts)]
    facts))

dustingetz · October 24, 2018, 7:17pm

Hyperfiddle has the same(?) requirement, we have a function into-tx which is basically concat for tx (but cancelling out add/retract statement pairs). It is pretty short, can you think of a way to break it? https://github.com/hyperfiddle/hyperfiddle/blob/aea5b300bd4abc837344e94c0965b189f553d076/src/contrib/datomic_tx.cljc#L21-L43

For tempid reversing: Hyperfiddle maintains a branch stack (cascading d/withs). Views always see the reified ids, consistent with the time-basis of the result. Branches are a datomic transaction value, and are stored with the top layer of tempids reversed. When a branch is merged into parent (in response to a user gesture), we 1) reverse the next layer of tempids, 2) “concat” with into-tx. Since we never need to look at reified tx entities – we always hang on to the transaction as specified by the programmer/user – db fns don’t need to be accounted for. Everything is pretty orderly and idiomatic.

Highly interested if you can break this pattern.
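The "concat, but cancelling out add/retract pairs" idea can be sketched in a few lines of plain Clojure. This is not the Hyperfiddle implementation linked above. It is an illustration that assumes all statements are in list (vector) form and that cancellation only applies to an exact match on entity, attribute, and value. The names inverse-op, conj-statement, and concat-cancelling are hypothetical.

```clojure
(def inverse-op {:db/add :db/retract, :db/retract :db/add})

(defn conj-statement
  "Append stmt to tx, unless tx already holds its exact inverse
  (same e, a, v with the opposite op); in that case drop the inverse
  and do not add stmt. Other ops are appended as-is."
  [tx [op e a v :as stmt]]
  (let [inverse [(inverse-op op) e a v]]
    (if (and (inverse-op op) (some #{inverse} tx))
      (vec (remove #{inverse} tx))
      (conj tx stmt))))

(defn concat-cancelling
  "Like concat for list-form tx-data, but add/retract pairs cancel."
  [tx more]
  (reduce conj-statement (vec tx) more))

(concat-cancelling [[:db/add "a" :todo/title "buy groceries"]
                    [:db/add "a" :todo/completed true]]
                   [[:db/retract "a" :todo/completed true]
                    [:db/add "b" :todo/title "feed baby"]])
;; => [[:db/add "a" :todo/title "buy groceries"]
;;     [:db/add "b" :todo/title "feed baby"]]
```

Map-form statements and transaction function calls would need to be expanded or passed through separately before a reducer like this is used.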
