Excision is very slow for huge number of entities

Hi, my task is to determine all entities which were existing before but are missing in the current version of DB and finally say goodbye to all of them. For example I have 1 million of such entities, all of them are mapped into list of objects made from template {:db/excise e}. Then this list of one million items is passed to transactor. What I see that only one CPU of 8 is used. Are there any ways to make the process faster, 7 hours went, the excision process is still running.

P.S. When I tried to excise the whole DB passing all attributes it took only 1.5 hour. Yes it means to delete all DB, but anyway why partial excision tooks more time then excision of whole db

P.P.S. The DB under datomic is H2

What Iā€™m doing in words of code:

(def get-deleted
		(d/q '[:find [?e ...]
		:in $ $hdb
		:where
		[$hdb ?e :IModel/uid ]
		(not [?e :IModel/uid])
		]
		(d/db conn) (d/history (d/db conn)) )
		)

(defn makeExciseSt
  [e]
  {:db/excise e})

(def exciseSt (map makeExciseSt get-deleted))

(def trResult (d/transact conn exciseSt))
(def trResultData (.get user/trResult))
(def dbAfter (:db-after user/trResultData))
(def afterBasisT (.basisT dbAfter))
(.get (d/sync-excise conn afterBasisT))
(d/gc-storage conn (java.util.Date.))

Datomic on-prem 1.0.6269

1 Like

Hi @nikolayandr

What are you timing? Everything in the code block? Am I correct in understanding you are trying to identify retracted entities and then excise them? Can you explain why? If you retracted these facts why are you interested in excising them?

P.P.S. Then DB under datomic is H2

Additionally, you are using the dev protocol which does not have the same performance characteristics as other production storage protocols. Are you planning on using this same strategy against a production database?

I should point out that excision was designed specifically to meet the following two scenarios and should be an extremely infrequent operation:

  • removing data for privacy reasons
  • removing data older than some domain-defined retention period
  1. Excision is lossy. You cannot ask ā€œwas this datom in the database prior to an excision?ā€ (that would defeat the purpose of excision.) Excision | Datomic

  2. Excision creates a substantial burden on background indexing jobs in whose execution time is proportional to the size of the entire database, leading to back pressure and reduced write availability.

https://docs.datomic.com/on-prem/reference/excision.html#performance

Am I correct in understanding you are trying to identify retracted entities and then excise them?

Right

Are you planning on using this same strategy against a production database?

No, dev mode only for local development, postgres for production use

Can you explain why? If you retracted these facts why are you interested in excising them?

Would say a lot of data were created in DB by mistake what has lead to increasing memory requirements, since every megabyte is payable we need to rid of unnecessary data :slight_smile:

Hi @nikolayandr , can you double-check that Datomic version on both the transactor and the peer(s)?

Thanks!

Sorry, my mistake: datomic version is 1.0622, the above example is run inside repl which comes with datomic, so both are on the same version

Sorry, my mistake: datomic version is 1.0622, the above example is run inside repl which comes with datomic, so both are on the same version

@nikolayandr before we delve too deeply, we made a big performance improvement on excision in the latest release:

https://docs.datomic.com/on-prem/changes.html#1.0.6269

Can you upgrade your peer and transactor to the latest and retest?

Cheers,
Jaret

Yeah, it helped to make the excision process much faster, I saw alternation between high CPU usage and disk read(much bigger in comparing to the previous library) but I have faced another issue: 6288656 entities were sent to the excision process, but after the end of the process I still see 6192721 of entities which I have expected to be excised

P.S. Iā€™m trying to do the same actions second time to check the number of excised entities after retrying

P.P.S. Peer has been restarted, also I have executed requestIndex, but judging by CPU usage/disk read/sync-index nothing happened.

P.P.P.S. Is a reindexing the part of the excision process? The doc says:
At some point after the excision request, an indexing job will run.
So when I run sync-excise, does it wait the finish of that indexing job? I just look for a way which give me the answer: everything is done, all entities to be excised are already wiped from everywhere

P.P.P.P.S I have restarted the transactor when the future returned from sync-excise has provided ready db(what was the indication for me that all all activities completed)

An Interesting result I have got after the second run of the excision process. There are still the same number of entities which actually were expected to be excised: 6192721.

Futhermore interesting: I have checked one of entities, and I see that datomic has marked this entity as excised(we see two ids of excising because I did two attempts):

and this entity is still found in history DB:

P.S. The peer restarted, gc storage executed, requestIndex run(but still the same no effect, nothing happened)

@nikolayandr what are you trying to tell here? I am very confused when you use terms like ā€œnothing happened.ā€ Can you share what you are expecting to have happened when you run excision vs what actually happened?

The screenshots you show seem to show that you are looking at the :db/excise attributes. Per our docs (Excision | Datomic):

Note that the excise attributes themselves are protected from excision, so there is no way to ā€˜erase your tracksā€™. Every excision creates a permanent record. This helps preserve a fundamental value proposition of Datomic - it is a database that greatly facilitates your knowing why it is in the state it is in, and how it got there.

Thus excision strikes a delicate balance between forgetting, and remembering that you forgot.

Does that explanation match your observatrion?

@jaret
Right, Iā€™m showing that datomic has marked these entities as excised(what is expected) but actually they continue existing in history DB(what is not expected). I thought that it is necessary to request indexing after ending the excision process but the DB still has these entities(looks like nothing happened after requesting index, no index job has started)

In another words: Is it expected that some number of entities sent to the excision process might be still available in the history DB once the excision process has ended?

1 Like

Guys, any thoughts on that?

Hi @nikolayandr,

I did some quick testing and was unable to see the same results. Would it be possible for you to make a gist of a small reproduction of the behavior you are seeing. I think you have most of the steps in this thread, but it would be nice to see a complete gist with an eye towards formatting the gist as we describe in writing a datomic problem report (i.e. environment, expectation, actual results etc).

More specifically, you say:

actually they continue existing in history DB(what is not expected).

Do you mean the counts or are you actually able to query the data? Is this only in console that you see this? I am hoping you can demonstrate this issue using the REPL.

Do you mean the counts or are you actually able to query the data? Is this only in console that you see this? I am hoping you can demonstrate this issue using the REPL.

Iā€™m able to query data in history DB. The test was done in REPL packed in datomic distribution package.

I did some quick testing and was unable to see the same results.

I starting to think that it might be issue related to our data too. So reproducing can be possible using our DB

  • Environment : Ubuntu 20.01, Datomic on-prem 1.0.6269
  • History :
  1. Start clear datomic: ./bin/transactor -Xms20g -Xmx20g /opt/datomic/config/transactor.properties
  2. Restore DB: ./bin/datomic -Xmx1g -Xms1g restore-db ā€œfile:ā€ "datomic:dev://localhost:4334/testExc?password=datomic"
  3. Restart transactor
  4. Run the next script in REPL:
(require '[datomic.api :as d]) 
(def conn (d/connect "datomic:dev://localhost:4334/testExc?password=datomic")) 

(def get-deleted
		(d/q '[:find [?e ...]
		:in $ $hdb
		:where
		[$hdb ?e :IModel/uid ]
		(not [?e :IModel/uid])
		]
		(d/db conn) (d/history (d/db conn)))
		)

(defn makeExciseSt
  [e]
  {:db/excise e})

(def exciseSt (map makeExciseSt get-deleted))

(def trResult (d/transact conn exciseSt))
(def trResultData (.get user/trResult))
(def dbAfter (:db-after user/trResultData))
(def afterBasisT (.basisT dbAfter))
# The question: Does sync-excise  wait on finishing index job which is run in result of excision
(.get (d/sync-excise conn afterBasisT))
(d/gc-storage conn (java.util.Date.))
  1. Restart all peers(including REPL too)
  • Expectation : All entities which were in get-deleted do not existing anymore in datomic
  • Actual : Some number of entities from get-deleted are still available in history DB.
  • Evidence : Iā€™m still able to query entities from get-deleted in history DB
  • Impact : Canā€™t decrease the size of db to make peers consume less RAM

@nikolayandr Apologies for not being able to immediately review I plan to look at this today and Friday. I noticed that you had previously provided the backup file. If you want to send that confidentially you can share that with me via support@cognitect.com.

1 Like