Dear team,
with Datomic version - 0.9.6014 + PG11
Previously we were setup a peer running with 2 transactors (1 primary + 1 standby) connecting to PostgreSQL, everything goes well with the peer.
Recently, we were about to setup a new parallel peer instance with the same transactors, and we meet this: it was stuck long (few minutes) for trying to establish the datomic connection -
2021-04-28 09:07:06.087 [main] INFO default xxx.meta-db - initializing META conn 2021-04-28 09:07:06.100 [main] INFO default xxx.config - Connecting to [xxx] , port: [xxx] . 2021-04-28 09:07:06.460 [main] ERROR default - Failed to resolve default logging config file: config/java.util.logging.properties 2021-04-28 09:11:07.329 [clojure-agent-send-off-pool-2] INFO default datomic.kv-cluster - {:event :kv-cluster/get-pod, :pod-key "pod-catalog", :phase :begin, :pid 18757, :tid 63} 2021-04-28 09:11:07.379 [clojure-agent-send-off-pool-2] DEBUG default datomic.kv-cluster - {:event :kv-cluster/pod-size, :bytes 229, :chunks 1, :pid 18757, :tid 63} 2021-04-28 09:11:07.380 [clojure-agent-send-off-pool-2] INFO default datomic.kv-cluster - {:event :kv-cluster/get-pod, :pod-key "pod-catalog", :msec 49.7, :phase :end, :pid 18757, :tid 63} 2021-04-28 09:11:07.392 [main] INFO default datomic.domain - {:event :cache/create, :cache-bytes 7635730432, :pid 18757, :tid 1} 2021-04-28 09:11:07.397 [main] INFO default datomic.process-monitor - {:event :metrics/initializing, :metricsCallback clojure.core/identity, :phase :begin, :pid 18757, :tid 1}
2021-04-28 07:37:34.041 [main] INFO default xxx.config - Connecting to [xxx] , port: [xxx] . 2021-04-28 07:37:34.347 [main] DEBUG default datomic.peer - {:event :peer/transactor-admin-request, :cluster {:protocol :sql, :db-name "META", :system-root "jdbc:postgresql://<uri>"}, :request :create-database, :arg {:db-name "META"}, :phase :begin, :pid 19412, :tid 1} 2021-04-28 07:37:34.405 [main] ERROR default - Failed to resolve default logging config file: config/java.util.logging.properties 2021-04-28 07:41:35.012 [main] DEBUG default datomic.coordination - {:event :coord/lookup-endpoint, :k "pod-coord", :phase :begin, :pid 19412, :tid 1} 2021-04-28 07:41:35.016 [clojure-agent-send-off-pool-2] DEBUG default datomic.kv-cluster - {:event :kv-cluster/get-ref, :ref-key "pod-coord", :phase :begin, :pid 19412, :tid 63} 2021-04-28 07:41:35.059 [clojure-agent-send-off-pool-2] DEBUG default datomic.kv-cluster - {:event :kv-cluster/get-ref, :ref-key "pod-coord", :msec 41.9, :phase :end, :pid 19412, :tid 63} 2021-04-28 07:41:35.060 [main] DEBUG default datomic.coordination - {:event :coord/lookup-endpoint, :k "pod-coord", :msec 47.2, :phase :end, :pid 19412, :tid 1} 2021-04-28 07:41:35.061 [main] DEBUG default datomic.connector - {:event :peer/hornet-connect, :host "gbl13950.systems.uk.hsbc", :pid 19412, :tid 1} 2021-04-28 07:41:35.137 [main] DEBUG default i.n.u.i.l.InternalLoggerFactory - Using SLF4J as the default logging framework
From jstack during the stuck, we could see that the application do stuck on the datomic connect
"main" #1 prio=5 os_prio=0 tid=0x00007f72d400a800 nid=0x542e runnable [0x00007f72d8400000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:143) at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:112) at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:71) at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:282) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1718) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:173) - locked <0x00000006877c1dd0> (a org.postgresql.core.v3.QueryExecutorImpl) at org.postgresql.core.SetupQueryRunner.run(SetupQueryRunner.java:74) at org.postgresql.core.v3.ConnectionFactoryImpl.runInitialQueries(ConnectionFactoryImpl.java:763) at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:224) at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:65) at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:149) at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:35) at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22) at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:47) at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:30) at org.postgresql.Driver.makeConnection(Driver.java:414) at org.postgresql.Driver.connect(Driver.java:282) at org.apache.tomcat.jdbc.pool.PooledConnection.connectUsingDriver(PooledConnection.java:266) at org.apache.tomcat.jdbc.pool.PooledConnection.connect(PooledConnection.java:175) at org.apache.tomcat.jdbc.pool.ConnectionPool.createConnection(ConnectionPool.java:684) at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:616) at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:479) at org.apache.tomcat.jdbc.pool.ConnectionPool.<init>(ConnectionPool.java:135) at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:114) - locked <0x0000000687150440> (a org.apache.tomcat.jdbc.pool.DataSource) at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:101) at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:125) at datomic.sql$connect.invokeStatic(sql.clj:16)
-Transactor properties-
memory-index-threshold=256m
memory-index-max=4g
index-parallelism=8
object-cache-max=2g