Data non-retention and redaction


I’m currently evaluating Datomic for some upcoming projects, and while it seems like an excellent fit in many regards, I am getting a bit hung up on its indelibility. While there are many contexts in which this sounds like a tremendously powerful and useful feature, I generally find that the ability to forget things is just as important as the ability to remember them, if for different reasons and in different contexts. Has anyone else grappled with the security, privacy, and compliance implications of this property?

I can add a bit of context to make this a little less abstract. One project I have in mind is a kind of case-management application for nonprofits working with (potentially) vulnerable populations. In general, the history-preserving behavior of Datomic is as useful here as anywhere. However, once the relationship with a client ends, much of the personal information rapidly becomes the equivalent of toxic waste: it has no analytical use, and if the system were ever compromised, it could be used to harass or harm the individuals. Losing control of a couple of months’ worth of PII is bad; losing control of ten years’ worth is catastrophic.

That’s a somewhat specific case, although variations of it come up in a number of other contexts. Customer support applications such as Zendesk typically offer a feature to redact passwords, credit card numbers, and other sensitive data that people occasionally put in email. And, of course, applications that operate under security- and privacy-focused compliance regimes such as PCI will be very concerned with ensuring that they never hold data they’re not supposed to have (including after the fact).

For the time being, is it reasonable to say that any application that wishes (or is legally required) to protect data through non-retention or redaction is simply outside the scope of Datomic? Has anyone else wrestled with this issue and perhaps found creative solutions? Is there any reason to think that this might change in the future?