Heartbeat failed - PostgreSQL backend


#1

Hello, we’ve faced with the following issue in our Datomic environment:

For our app we use PostgreSql 9.6 as an external storage for Datomic database. We use Datomic-Pro v. 0.9.5561.54. Besides of that we have the REST-api client in order to communicate with Datomic from non-JVM based applications.
Our environment is beind deployed in Kubernets cluster, so each Datomic component is being run inside docker container (k8s pod).
A few days ago we executed a read-query which supposed to return a large set of records. We made the query from .NET environment to REST api cient.
However, possibly because of non-optimized datomic query or wrong memory-configurations for PostgreSql\Kubectl pod we encountered with “Out of Memory” exception in postgres container pod and possibly the process was killed by Linux OOM-killer.
After we had tried to restart the pg storage container, we figure out that there was a mess in postgres transaction log (WAL), so pg server was reporting about “panic” and couldn’t be started (https://stackoverflow.com/questions/8799474/postgresql-error-panic-could-not-locate-a-valid-checkpoint-record).
We had to proceed with pg_resetxlog command (without -f) to clear log from partially applied transactions.
After we had started the postgres storage server, we wasn’t able to connect to it from the transactor. I attached an image with transactor logs that we had after we had started the storage. It seems something has been corrupted during the crash.

We tried several ways to recover our data:

  1. We did pg_dump backup for corrupted storage, - when we restore it to another db, transactor connects but seems like there are no data (datums) in datomic db, just databases names were remained.
  2. We tried to redeploy transactor and other datomic nodes several times in different order
  3. We tried to take Datomic-native backup via backup-db tool but it fails with java.lang.NullPointerException

So the questions are:

  1. Is is there any way to recover datomic external storage with the data that we had before the crash?
  2. Why Read-Only query to REST api client writes something to postgres transaction log? I understand that wrong memory settings on kubectl\postgres is our fault, but I’m kinda worry that 1 read query can kill database.
  3. Could you suggest something to prevent such situations in the future ? I know there are recommended setting for transactor in official docs, maybe you also have something related for external storages?

#2

Running queries does not write to storage. The work of performing the query all occurs on the peer instance (your REST server in this case).

The OOM issue you encountered with your storage is likely unrelated (or only peripherally related) to issuing the large query.

Do you have a backup of the Datomic (or postgres) database from prior to this issue that you can use to restore?


#3

Thanks for your response.
Unfortunately we have no backups prior to the crash. Good thing is that it is our development database that we can restore by our own. But we need to be prepared to such cases in production, so that’s why I asked about possibility to recover the corrupted storage.

Could you please explain a little bit how Datomic Transactor works in case we send a request to Datomic REST api client? I’m pretty sure that we wasn’t executing any write operations prior to and during the OOM crash.
As I can see in datomic-postgres-storage, there are a few human-readable rows with ids like: pod-coord, pod-catalog, pod-log-tail, ref-index-root,ref-gc-root and so on. I suppose such rows (at least pod-coord) may be updated by the Transactor depending on its current ip\dns location when transactor restarts. During the oom crash in k8s cluster the Transactor could also be restarted. Can renewal operations of these human-readable rows be a reason of datomic-postgres-storage corruption ?

By the way, besides of .NET app -> Datomic REST client communication, we also have Java app -> datomic.Peer library communication (for the same datomic-postgres-storage). But the Java app also was executing only read operations prior to \ during the crash. We didn’t send any write operations that time.

So could you give us a view of how the Transactor works in these 2 flows and if\when it does any updates to the storage ?


#4

All Datomic peers (including the REST server) perform the computational work of query in process. They read directly from storage to retrieve the necessary segments of the indexes to satisfy the query. The Transactor is not involved during the query - regardless of whether you’re using the REST server or any other Peer.

The Transactor itself writes a heartbeat and its own address to storage. This enables HA - when a failover event occurs, the newly-active Transactor writes its address to storage and Peers are then able to reconnect to it after the failover event to submit transactions.

Additional details on HA failover can be found here:
http://docs.datomic.com/ha.html and http://docs.datomic.com/deployment.html#getting-reconnected

-Marshall