Properly handling java.lang.InterruptedException from Peer query

In our app we have some threads which, among other things, run queries using the Peer API. It is normal and expected that these threads will be interrupted while we manage them, with the intention of a graceful shutdown. Our code catches java.lang.InterruptedException and handles it accordingly. If, however, we interrupt the thread while our code is blocking on the Peer query, Datomic wraps the exception a couple of times and our calling code sees just a raw java.lang.Exception. We could look for a root cause of InterruptedException to handle this case, but I’m wondering what is or isn’t guaranteed on the Datomic side around this behavior and whether we can rely on it not changing in the future. Obviously the API doesn’t declare the checked InterruptedException, even though some API down the line likely does declare it to be thrown (e.g. it looks like Datomic is calling FutureTask.get()).

Is there an “official” way to handle InterruptedException that differentiates it from a true error with the query, communication issues, etc.?

Here is a typical stack trace for this scenario:

java.lang.Exception: processing rule: (q__377756150 ?case), message: processing clause: {:argvars nil, :fn #object[datomic.datalog$expr_clause$fn__7085 0x5a312243 "datomic.datalog$expr_clause$fn__7085@5a312243"], :clause [(ground $__in__3) ?netInstanceId], :binds [?netInstanceId], :bind-type :scalar, :needs-source true}, message:
	at <our code calling query>
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: processing rule: (q__377756150 ?case), message: processing clause: {:argvars nil, :fn #object[datomic.datalog$expr_clause$fn__7085 0x5a312243 "datomic.datalog$expr_clause$fn__7085@5a312243"], :clause [(ground $__in__3) ?netInstanceId], :binds [?netInstanceId], :bind-type :scalar, :needs-source true}, message: 
	at datomic.datalog$eval_rule$fn__7218.invoke(datalog.clj:1471)
	at datomic.datalog$eval_rule.invokeStatic(datalog.clj:1451)
	at datomic.datalog$eval_rule.invoke(datalog.clj:1430)
	at datomic.datalog$eval_query.invokeStatic(datalog.clj:1494)
	at datomic.datalog$eval_query.invoke(datalog.clj:1477)
	at datomic.datalog$qsqr.invokeStatic(datalog.clj:1583)
	at datomic.datalog$qsqr.invoke(datalog.clj:1522)
	at datomic.datalog$qsqr.invokeStatic(datalog.clj:1540)
	at datomic.datalog$qsqr.invoke(datalog.clj:1522)
	at datomic.query$q_STAR_.invokeStatic(query.clj:727)
	at datomic.query$q_STAR_.invoke(query.clj:718)
	at datomic.query$q.invokeStatic(query.clj:750)
	at datomic.query$q.invoke(query.clj:747)
	at clojure.lang.Var.invoke(Var.java:383)
	at datomic.Peer.query(Peer.java:299)
	at com.acuitysds.services.tenant.ynet.ProdProto3Workflow.datomicEntityEventToCase(ProdProto3Workflow.java:241)
	at com.acuitysds.workflow.ynet.task.EventLogDrivenTask$EventLogFireable.earliestEvent(EventLogDrivenTask.java:104)
	at com.acuitysds.workflow.ynet.TaskInstance.earliestEvent(TaskInstance.java:255)
	at com.acuitysds.workflow.ynet.BasalNetInstance.earliestFiring(BasalNetInstance.java:138)
	at com.acuitysds.workflow.ynet.BasalNetInstance.lambda$evalTaskFirings$0(BasalNetInstance.java:282)
	at com.acuitysds.workflow.ynet.BasalNetInstance.evalTaskFirings(BasalNetInstance.java:290)
	... 3 common frames omitted
Caused by: java.lang.Exception: processing clause: {:argvars nil, :fn #object[datomic.datalog$expr_clause$fn__7085 0x5a312243 "datomic.datalog$expr_clause$fn__7085@5a312243"], :clause [(ground $__in__3) ?netInstanceId], :binds [?netInstanceId], :bind-type :scalar, :needs-source true}, message: 
	at datomic.datalog$eval_clause$fn__7192.invoke(datalog.clj:1417)
	at datomic.datalog$eval_clause.invokeStatic(datalog.clj:1380)
	at datomic.datalog$eval_clause.invoke(datalog.clj:1346)
	at datomic.datalog$eval_rule$fn__7218.invoke(datalog.clj:1466)
	... 23 common frames omitted
Caused by: java.lang.InterruptedException: null
	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
	at java.util.concurrent.FutureTask.get(FutureTask.java:191)
	at clojure.core$deref_future.invokeStatic(core.clj:2208)
	at clojure.core$deref_future.invoke(core.clj:2206)
	at clojure.core$deref.invokeStatic(core.clj:2229)
	at clojure.core$deref.invoke(core.clj:2214)
	at clojure.core$mapv$fn__9444.invoke(core.clj:6627)
	at clojure.lang.PersistentVector.reduce(PersistentVector.java:341)
	at clojure.core$reduce.invokeStatic(core.clj:6544)
	at clojure.core$reduce.invoke(core.clj:6527)
	at clojure.core$mapv.invokeStatic(core.clj:6627)
	at clojure.core$mapv.invoke(core.clj:6618)
	at datomic.common$pooled_mapv.invokeStatic(common.clj:677)
	at datomic.common$pooled_mapv.invoke(common.clj:672)
	at datomic.datalog$qmapv.invokeStatic(datalog.clj:51)
	at datomic.datalog$qmapv.invoke(datalog.clj:46)
	at datomic.datalog$join_project_coll_with.invokeStatic(datalog.clj:224)
	at datomic.datalog$join_project_coll_with.invoke(datalog.clj:132)
	at datomic.datalog$fn__6657.invokeStatic(datalog.clj:236)
	at datomic.datalog$fn__6657.invoke(datalog.clj:228)
	at datomic.datalog$fn__6586$G__6560__6601.invoke(datalog.clj:64)
	at datomic.datalog.FnRel.join_project_with(datalog.clj:621)
	at datomic.datalog$join_project_coll.invokeStatic(datalog.clj:129)
	at datomic.datalog$join_project_coll.invoke(datalog.clj:127)
	at datomic.datalog$fn__6655.invokeStatic(datalog.clj:232)
	at datomic.datalog$fn__6655.invoke(datalog.clj:228)
	at datomic.datalog$fn__6565$G__6558__6580.invoke(datalog.clj:64)
	at datomic.datalog$eval_clause$fn__7192.invoke(datalog.clj:1390)
	... 26 common frames omitted

Hi @adam

Yes, you can look for a root cause InterruptedException as indicative of this case. I am interested in why you are terminating threads externally. Is there a particular business workflow or operation that you are using this technique for?

Thanks,
Jaret
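
For illustration, the root-cause check discussed above could be sketched as follows. This is a minimal sketch, not an official Datomic utility: the class name Interrupts and the method causedByInterrupt are made up for this example, and it assumes only what the stack trace shows, namely that the InterruptedException sits somewhere in the cause chain of the java.lang.Exception the caller sees.

```java
public final class Interrupts {
    private Interrupts() {}

    /**
     * Hypothetical helper: returns true if t, or any exception in its
     * cause chain, is an InterruptedException.
     */
    public static boolean causedByInterrupt(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 64; depth++) {
            if (t instanceof InterruptedException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }
}
```

Walking the whole chain, rather than checking a fixed depth, avoids depending on how many wrapping layers appear in a given trace.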

Hi @jaret,

Great, thanks for clarifying.

Broadly speaking, we’re using threads for (user) session management. A thread gets created to do asynchronous processing needed to support a user session. It does thread things like sleep, block on IO, poll queues, issue Datomic queries, etc. The outer loop of this thread does thread watchdog type stuff and supports “clean” termination (i.e. while(!shutdown) {}), but we’re also using thread interrupts to terminate any work (or blocking) the thread is doing as part of our normal session management (e.g. a session isn’t needed any longer and can be terminated, or it needs to be replaced for some reason).

The outer loop catches exceptions. InterruptedException in this context means a clean shutdown should occur. Other Exceptions indicate either some transient error or a bug and get logged as ERRORs before the thread decides if it should retry or die as a result.
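
A rough sketch of an outer loop along these lines is shown here. Everything in it is an assumption made for illustration: the class SessionWorker, the Exit enum, the maxFailures retry limit, and the use of a Callable to stand in for one unit of work (such as a blocking query). It is not the actual application code.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical worker: names and retry policy are illustrative only.
public final class SessionWorker implements Runnable {
    public enum Exit { RUNNING, SHUTDOWN, INTERRUPTED, FAILED }

    private static final Logger LOG = Logger.getLogger(SessionWorker.class.getName());

    private final Callable<?> step;   // one unit of work, e.g. a blocking query
    private final int maxFailures;    // consecutive failures tolerated before giving up
    private volatile boolean shutdown;
    private volatile Exit exit = Exit.RUNNING;

    public SessionWorker(Callable<?> step, int maxFailures) {
        this.step = step;
        this.maxFailures = maxFailures;
    }

    public void requestShutdown() { shutdown = true; }

    public Exit exit() { return exit; }

    @Override
    public void run() {
        int failures = 0;
        while (!shutdown) {
            try {
                step.call();
                failures = 0;
            } catch (Exception e) {
                if (isInterrupt(e)) {
                    // Treat a direct or wrapped interrupt as a clean shutdown.
                    // Restore the flag, since throwing InterruptedException clears it.
                    Thread.currentThread().interrupt();
                    exit = Exit.INTERRUPTED;
                    return;
                }
                // Anything else is a transient error or a bug: log, then retry or die.
                LOG.log(Level.SEVERE, "session step failed", e);
                if (++failures >= maxFailures) {
                    exit = Exit.FAILED;
                    return;
                }
            }
        }
        exit = Exit.SHUTDOWN;
    }

    // True if t or any of its causes is an InterruptedException.
    static boolean isInterrupt(Throwable t) {
        for (int depth = 0; t != null && depth < 64; depth++) {
            if (t instanceof InterruptedException) return true;
            t = t.getCause();
        }
        return false;
    }
}
```

The interrupt check runs before the error logging, so a wrapped interrupt is never reported as an ERROR.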

The case I described above is what happens when the interrupt occurs while some code is blocking on a query. While I guess I understand the rationale for InterruptedException being a checked exception in Java, in practice it would probably work out better if it had been unchecked. I wouldn’t expect Datomic to add a queryInterruptably API, although maybe a queryAsync API returning a Future would provide a clean way to handle this case while also offering an external query cancellation mechanism? I suppose you might not want to get into the query queuing business.
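
Something in the spirit of the queryAsync idea can be approximated on the caller's side by running the blocking call on an application-owned executor. The sketch below is hypothetical throughout: AsyncQuery, queryAsync and queryInterruptibly are not Datomic APIs, and the Callable simply stands in for whatever blocking call (for example a Peer.query invocation) the application makes. It also assumes that interrupting the pool thread is an acceptable way to abandon the work; how quickly the underlying query actually stops is not something this code can guarantee.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical caller-side wrapper; not part of the Datomic API.
public final class AsyncQuery implements AutoCloseable {
    private final ExecutorService pool;

    public AsyncQuery(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Runs the blocking call on the pool and returns a cancellable Future. */
    public <T> Future<T> queryAsync(Callable<T> blockingQuery) {
        return pool.submit(blockingQuery);
    }

    /**
     * Blocks for the result. If the calling thread is interrupted while
     * waiting, the caller sees a plain InterruptedException and the
     * pool thread running the query is interrupted via cancel(true).
     */
    public <T> T queryInterruptibly(Callable<T> blockingQuery)
            throws InterruptedException, ExecutionException {
        Future<T> future = queryAsync(blockingQuery);
        try {
            return future.get();
        } catch (InterruptedException e) {
            future.cancel(true);
            throw e;
        }
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

With this shape, an interrupt of the session thread surfaces as an unwrapped InterruptedException from Future.get(), while genuine query failures arrive as an ExecutionException, at the cost of the application owning a thread pool for queries.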

No matter, we’re fine handling this case dynamically, as long as the contract is stable. It just seemed a bit strange that the stack traces have a few layers of “couldn’t do this internal thing” because “couldn’t do that internal thing” because “interrupted”. All those middle internal details made me a bit uneasy assuming I could rely on it.

Adam