Architecture: Transactor / Persistent, durable tree

The Datomic transactor does not acknowledge a write until it has been validated and durably persisted to the Log. Since storage implementations vary, the specifics of “durable” depend on the backend, but in general these writes are expected to be fully durable and replicated (e.g. when DynamoDB acknowledges a write, it is “synced” to disks in multiple AWS AZs).
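From the Peer API, that acknowledgement is just the transaction future completing: it doesn’t resolve until the write is durably in the Log. A minimal Java sketch of that shape, assuming a hypothetical dev-storage URI and a transactor running for it:

```java
import datomic.Connection;
import datomic.Peer;
import datomic.Util;

import java.util.Map;

public class DurableWrite {
    public static void main(String[] args) throws Exception {
        // Hypothetical dev-storage URI; assumes a transactor is running for it.
        String uri = "datomic:dev://localhost:4334/example";
        Peer.createDatabase(uri);
        Connection conn = Peer.connect(uri);

        // A map without :db/id gets a fresh tempid; here we just assert :db/doc.
        Object docAttr = Util.read(":db/doc");

        // transact returns a future; it does not complete until the transactor
        // has validated the transaction and durably written it to the Log.
        Map txResult = conn.transact(
                Util.list(Util.map(docAttr, "hello, durable log"))).get();

        // The report includes the datoms that were just made durable.
        System.out.println(txResult.get(Connection.TX_DATA));
    }
}
```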

Index rebuilds (which, once complete, swap out the root tree node) happen periodically, e.g. based on the amount of novelty accumulated in the Log. In terms of “write amplification”, I guess it depends on what you mean. Datomic indexes are covering, and every datom is written to 3-5 indexes (always the Log, EAVT, and AEVT; sometimes AVET and/or VAET). Every transaction also generates at least one datom for the transaction entity itself…so strictly speaking there is a lot of “duplication” in the storage layer, but the IO cost of most of those writes can be deferred until the indexing job runs. Only the write to the Log “blocks” the transaction. Similarly, given the per-transaction overhead, it is more space efficient to write two datoms in one transaction than in two.
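To make that last point concrete, here is a hedged sketch (same hypothetical URI and setup as above) that writes two datoms in a single transaction, so the Log write and the transaction entity’s datom are paid once, and then reads them back through one of the covering sort orders:

```java
import datomic.Connection;
import datomic.Database;
import datomic.Datom;
import datomic.Peer;
import datomic.Util;

public class BatchedWrite {
    public static void main(String[] args) throws Exception {
        String uri = "datomic:dev://localhost:4334/example"; // hypothetical, as above
        Peer.createDatabase(uri);
        Connection conn = Peer.connect(uri);

        Object docAttr = Util.read(":db/doc");

        // Two datoms, one transaction: the per-transaction overhead (the Log
        // write and the transaction entity's own datom) is paid once.
        conn.transact(Util.list(
                Util.map(docAttr, "first datom"),
                Util.map(docAttr, "second datom"))).get();

        // The same datoms are reachable through the covering sort orders; here
        // the attribute-leading AEVT order (this will also include the :db/doc
        // datoms that the built-in schema carries).
        Database db = conn.db();
        for (Datom d : db.datoms(Database.AEVT, docAttr)) {
            System.out.println(d.e() + " " + d.a() + " " + d.v() + " " + d.tx());
        }
    }
}
```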

On the read side, each Peer is responsible for combining the latest tree version it has with any novelty accumulated since then (which it keeps in memory). One of the first things a Peer does on startup is initialize this in-memory structure from the durable Log. From that point, Peers stay up to date by subscribing to the transactor (i.e. via ActiveMQ), which publishes new log entries as they are acknowledged. It isn’t necessary to rebuild indexes anywhere near every 100ms…the rate at which rebuilds occur is primarily governed by the amount of novelty that is in the Log but not yet in the other indexes.
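The Peer API also exposes that stream of acknowledged transactions as a tx-report queue. A small sketch (again against the hypothetical URI above) that just prints each batch of new datoms as it arrives:

```java
import datomic.Connection;
import datomic.Peer;

import java.util.Map;
import java.util.Queue;

public class NoveltyWatcher {
    public static void main(String[] args) throws Exception {
        // Hypothetical URI; on connect the peer catches up from the durable Log.
        Connection conn = Peer.connect("datomic:dev://localhost:4334/example");

        // Each acknowledged transaction is delivered as a tx-report map: the
        // same novelty the peer folds into its in-memory view on top of the
        // last published index tree.
        Queue reports = conn.txReportQueue();

        while (true) {
            Map report = (Map) reports.poll();
            if (report != null) {
                System.out.println("new datoms: " + report.get(Connection.TX_DATA));
            } else {
                Thread.sleep(100); // no novelty yet; poll again shortly
            }
        }
    }
}
```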

The docs cover this process here: https://docs.datomic.com/on-prem/indexes.html#efficient-accumulation