Can retractions improve performance?

Robin · June 30, 2018, 10:52am

I have a schema where users can have a session/token. My application is a web app, so when it receives a request with a <appname>_session cookie, it looks up the user who has the provided session/token and considers the user logged in if one is found.

Now, tokens aren’t valid forever. So what I do is look up the transaction that added the token and check whether the token was created/asserted within a reasonable time (a week).
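A minimal sketch of the age check described above, assuming the transaction time of the token's assertion has already been fetched as a timezone-aware datetime. The names `MAX_TOKEN_AGE` and `token_is_valid` are hypothetical and are not part of Datomic or of the original app.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical maximum token age; the post only says "a week".
MAX_TOKEN_AGE = timedelta(weeks=1)


def token_is_valid(asserted_at, now=None):
    """Return True if the token was asserted within the last MAX_TOKEN_AGE.

    asserted_at: timezone-aware time of the transaction that added the token.
    now: optional override of the current time, which makes testing easier.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - asserted_at
    # Reject tokens that claim to come from the future as well as expired ones.
    return timedelta(0) <= age <= MAX_TOKEN_AGE
```

With a check like this on every request, an expired token is rejected whether or not it has been retracted, which is why retraction is optional for correctness here.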

The interesting thing here is that I don’t really need to retract tokens to avoid logging in users with invalid tokens, as I always check the transaction time. But I’m wondering whether there are advantages to periodically retracting invalid tokens anyway.

As an example, say I have a user who has logged in one thousand times (and so has one thousand session tokens). Would there be any benefit to retracting all the invalid tokens when I always check the validity of a token by its creation time?

tim · June 30, 2018, 3:46pm

It’s a good question. Personally, I wouldn’t do that. As I understand it, Datomic acts like an append-only log file where the newest entries are accessed first in any query. So when you retract, you’re adding data to the top to nullify a previous entry. If that’s true, then it’s also possible that performance could be worse [1]. Really, I think the Datomic team can provide a better answer, but personally I’d be more interested to see whether using the ‘since’ filter [2] is a better option in your case.

[1] Theoretically, though practically speaking I doubt it would make any real difference.
[2] Database Filters | Datomic

eneroth · July 12, 2018, 4:05pm

Wouldn’t :db/noHistory essentially remove the need for retraction altogether?

Robin · July 12, 2018, 4:39pm

If you never do a retraction (or never assert a new value) then :db/noHistory has no effect.

eneroth · July 12, 2018, 4:53pm

Right. I’m all new to this, so take this with a grain of salt.

I thought overwriting the session token counted as an implicit retraction, so :db/noHistory would kick in.

Robin · July 12, 2018, 5:22pm

Oh. No, you’ve got the right idea. It’s just that in my app, a user can have more than one session token.

avodonosov · July 13, 2018, 3:42am

If you don’t retract datoms, the index grows and data access becomes more expensive. The official Datomic people say that after 10 billion datoms the index may become a bottleneck.

So for performance it’s better to remove unused data.

marshall · July 18, 2018, 4:45pm

Datomic is accumulate only, not append only. There are important semantic and performance differences. See https://docs.datomic.com/on-prem/indexes.html#accumulate-only

In particular, Datomic does not pay a performance penalty for the “present” (i.e. all those things that are true now).

As an answer to @Robin’s original question -
Yes, theoretically there may be a slight advantage to retracting old expired tokens. However, I strongly suspect that you will never see the difference in any practical implementation.

marshall · July 18, 2018, 4:48pm

Note that the accumulation of facts means that your total datom count goes up when you issue a retraction.
That said, it’s very unlikely either approach will have any measurable difference in performance.

I would personally retract the expired/invalid tokens just for semantic/ease-of-use reasons - getting all the tokens would also serve to get only the latest/valid ones.

avodonosov · August 4, 2018, 8:35pm

@marshall, I think retraction operations only add datoms to the log. The current index version becomes smaller after a retraction. So, unless you do historical queries, the active set of datoms a peer deals with becomes smaller after a retraction.

Actually, those who want to be sure can just do 10 billion additions and retractions, then compare query performance with what they get after doing just 10 billion additions. (Instead of 10 billion single-datom transactions, one can do 10 thousand transactions of 1 million datoms each.)
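The two views in this thread (the total datom count goes up on retraction, while the set of currently true facts shrinks) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not Datomic's storage or index implementation, and `ToyDb`, `transact`, and the attribute name `user/token` are hypothetical.

```python
class ToyDb:
    """Toy accumulate-only store: assertions and retractions are both recorded."""

    def __init__(self):
        self.log = []         # every datom ever written: (entity, attr, value, tx, added)
        self.current = set()  # facts that are true now: (entity, attr, value)
        self.tx = 0

    def transact(self, entity, attr, value, added=True):
        """Record one datom; added=False models a retraction."""
        self.tx += 1
        self.log.append((entity, attr, value, self.tx, added))
        fact = (entity, attr, value)
        if added:
            self.current.add(fact)
        else:
            self.current.discard(fact)


db = ToyDb()

# One user logs in 1000 times, gaining one token per login.
for i in range(1000):
    db.transact("user-1", "user/token", f"token-{i}")

# Retract every token except the 5 most recent ones.
for i in range(995):
    db.transact("user-1", "user/token", f"token-{i}", added=False)

print(len(db.log))      # 1995: the retractions were added to the log
print(len(db.current))  # 5: only the unretracted tokens are true now
```

In this model a query over the current view touches 5 facts instead of 1000, while anything that reads the full history sees 1995 datoms. Whether that difference is measurable in a real Datomic system is exactly what the benchmark suggested above would have to show.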
