Are schema migrations an issue in Datomic?

Hey all, we wonder if it makes sense to track schema changes (like migrations in SQL databases).
What are the consequences of transacting the same schema over and over again (without changing it)?
What are the best practices?

If you transact the same schema again, the transaction will only contain the transaction entity (:db/txInstant). You can use datomic.api/with beforehand to avoid transacting such an “empty” transaction. We do this since we transact the schema every time our application server starts. I would not call it a “best practice”, but it has served us well for many years.
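
For illustration, roughly something like this (a minimal sketch with the peer API; `conn` and `schema` are assumed to already exist):

(require '[datomic.api :as d])

(defn transact-schema-if-needed! [conn schema]
  ;; Apply the schema speculatively first; if the result contains only
  ;; the :db/txInstant datom, the schema is already installed.
  (let [speculative (d/with (d/db conn) schema)]
    (when (> (count (:tx-data speculative)) 1)
      @(d/transact conn schema))))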

However, we write a custom migration if we need to do anything other than adding something to the schema. Migrations can be a beast, since you always need to take care that you don’t block the Datomic transactor for too long. Let’s say you want to migrate one of your customers using a single transaction. If this customer has 100k entities, the transaction might take a couple of seconds, and during this time frame all subsequent transactions have to wait.
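
One way to keep each transaction small is to chunk the migration, e.g. (a rough sketch; the :entity/customer attribute and `entity->migration-tx` are made-up placeholders for your own migration logic):

(defn migrate-customer! [conn customer-id]
  (let [eids (d/q '[:find [?e ...]
                    :in $ ?cust
                    :where [?e :entity/customer ?cust]]
                  (d/db conn) customer-id)]
    ;; Many small transactions instead of one huge one, so other
    ;; transactions are never blocked for seconds at a time.
    (doseq [batch (partition-all 1000 eids)]
      @(d/transact conn (mapv entity->migration-tx batch)))))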

Thanks for your reply. Just to make sure I understand: you execute the tx using with, check whether the result is empty, and if it isn’t, you transact the schema?

No, on transact it compares with what it knows already (the existing schema datoms). If there are no changes, nothing from the schema gets transacted and you’ll just get a new, empty transaction back.
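
Concretely, something like this (a sketch with the client API, assuming `schema` is already installed):

(require '[datomic.client.api :as d])

(def result (d/transact conn {:tx-data schema}))

(count (:tx-data result))
;; => 1  (only the :db/txInstant datom of the new, empty transaction)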

Yes. The with is just there to avoid creating such an empty transaction on every server start. But you can ignore this optimization.

We don’t even transact the schema automatically.
We do have nREPL access to our ion servers (via so-called REPLions) and do the schema transaction through an nREPL client when needed.

We also have that optimization to omit “empty transactions”. This is the predicate we use for it:

(defn empty-txr?
  "Determines whether a transaction result (`txr`), returned from
   `datomic.client.api/transact` or `datomic.client.api/with`, is empty.

   `txr` is considered empty if it contains only 1 datom, which is the
   automatically generated timestamp (`:db/txInstant`) of the transaction."
  [txr] (-> txr :tx-data count (= 1)))
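
A quick usage sketch (assuming `conn` and an already-installed `schema`, with `d` aliasing `datomic.client.api`):

;; Re-transacting an already-installed schema yields only the
;; :db/txInstant datom, so the result counts as empty:
(empty-txr? (d/transact conn {:tx-data schema}))
;; => true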

Then we also have a little helper to make things a bit more concise:

(defn txm
  "Wraps `tx` into a {:tx-data tx} `arg-map`, suitable for `d/transact`,
   `d/with`, unless it's already a map."
  [tx]
  (if (map? tx)
    tx
    {:tx-data tx}))
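
E.g. (hypothetical data, just to show the two cases):

(txm [{:db/ident :some/attr}])
;; => {:tx-data [{:db/ident :some/attr}]}

(txm {:tx-data [{:db/ident :some/attr}]})
;; => {:tx-data [{:db/ident :some/attr}]}  (already an arg-map, returned as-is)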

Then we have the conditional transaction function:

(defn ensure-tx!
  "Applies `txd` speculatively with `d/with` first and only calls
   `d/transact` when the speculative result is not empty. Returns the
   transaction result, or nil when there was nothing to transact.
   (`d` is an alias for `datomic.client.api`.)"
  [conn txd]
  (let [tx-map (txm txd)]
    (when-not (-> conn d/with-db (d/with tx-map) empty-txr?)
      (-> conn (d/transact tx-map)))))
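
For example (the attribute below is made up):

(ensure-tx! conn
            [{:db/ident       :user/email
              :db/valueType   :db.type/string
              :db/cardinality :db.cardinality/one
              :db/unique      :db.unique/identity}])
;; => transaction result on the first call, nil on subsequent calls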

And another one, which collects the transaction results of multiple transactions, because your schema is likely made up of multiple transactions (e.g. because you have some seed data too, which needs your domain-specific Datomic attributes already transacted):

(defn ensure-txs!
  "Runs `ensure-tx!` on each tx in `txs` (in order) and collects the
   non-nil transaction results into a vector."
  [conn txs]
  (transduce
    (comp (map (partial ensure-tx! conn))
          (keep identity))
    conj
    txs))

(keep is a separate step because its docstring says its function parameter must be side-effect free… I’m not sure whether it matters in this case, so it’s kept like this, just in case.)
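
So a schema split into several ordered transactions (hypothetical attribute and seed data below) can be ensured like this:

(ensure-txs! conn
             [;; 1st tx: attribute definitions
              [{:db/ident       :product/sku
                :db/valueType   :db.type/string
                :db/cardinality :db.cardinality/one
                :db/unique      :db.unique/identity}]
              ;; 2nd tx: seed data that already relies on the attribute above
              [{:product/sku "SKU-0001"}]])
;; => vector of the transaction results that actually changed anything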


I forgot to add that this is a non-atomic solution: if you have a cluster of machines in your primary compute group, they might run ensure-tx! at about the same time with the same transaction data, in which case both of them attempt to transact it and one of them still ends up with an empty transaction.

Usually atomicity can be guaranteed by making a transaction function, but we can’t use d/with within a transaction function, since it needs a with-db value, which needs a connection, while inside a transaction function we only have a database value available.

Since ion deployment to clustered primary compute groups happens in a rolling fashion, it’s extremely unlikely that ensure-tx! calls with the same tx data would still execute interleaved, so it’s an acceptable solution in practice.