I have a schema that can be described by the following relations:
Order 1->* User 1->1 Address. In EDN it is described by the following attributes:
{:db/ident       :User/address
 :db/valueType   :db.type/ref
 :db/isComponent true
 :db/cardinality :db.cardinality/one}

{:db/ident       :Address/zipCode
 :db/valueType   :db.type/string
 :db/index       true
 :db/cardinality :db.cardinality/one}

{:db/ident       :Order/users
 :db/valueType   :db.type/ref
 :db/cardinality :db.cardinality/many}
And here is the original Datalog query (in plain words: get orders where users.address != 17592186093790 AND users.address.zipCode != "12345"):
[:find [?Order ...]
 :in $
 :where
 ; The next two conditions leave ?User_1 with 2000 elements
 [?User_1 :User/address ?User_address_compare_value_2]
 [(!= ?User_address_compare_value_2 17592186093790)]
 ; The next two conditions leave ?Address_4 with 2000 elements
 [?Address_4 :Address/zipCode ?Address_zipCode_compare_value_5]
 [(!= ?Address_zipCode_compare_value_5 "12345")]
 ; I believe the issue is in the next condition
 [?User_1 :User/address ?Address_4]
 [?Order :Order/users ?User_1]]
This query runs quickly on a small data set. However, with, for example, 4000 orders that each link to 5 of 2000 users, where each user has its own address (so there are 2000 addresses), it runs out of memory. I believe the problem occurs while evaluating the condition [?User_1 :User/address ?Address_4], which joins ?User_1 with ?Address_4. My current workaround is to split the query above into two separate queries and intersect their results:
; d is assumed to alias the Datomic API namespace; clojure.set must be loaded
(require 'clojure.set)

; Orders with a user whose address is not 17592186093790
(def orders1
  (d/q '[:find [?Order ...]
         :in $
         :where
         [?User_1 :User/address ?User_address_compare_value_2]
         [(!= ?User_address_compare_value_2 17592186093790)]
         [?Order :Order/users ?User_1]]
       (d/db conn)))

; Orders with a user whose address zip code is not "12345"
(def orders2
  (d/q '[:find [?Order ...]
         :in $
         :where
         [?Address_4 :Address/zipCode ?Address_zipCode_compare_value_5]
         [(!= ?Address_zipCode_compare_value_5 "12345")]
         [?User_1 :User/address ?Address_4]
         [?Order :Order/users ?User_1]]
       (d/db conn)))

(clojure.set/intersection (set orders1) (set orders2))
This way the result comes back almost instantly.
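To put a rough number on my suspicion, here is a back-of-the-envelope sketch in plain Clojure (no Datomic calls). It assumes, and this is only my guess about how the query engine behaves, that the ?User_1 conditions and the ?Address_4 conditions are each resolved independently before the joining condition is applied. The names user-count, address-count, unjoined-pairs and joined-pairs are made up for this illustration.

```clojure
;; Hypothetical illustration only: plain Clojure, no Datomic involved.
;; Assumption: each group of conditions is resolved to 2000 bindings
;; before the joining condition [?User_1 :User/address ?Address_4] runs.
(def user-count 2000)
(def address-count 2000)

;; Worst case: every ?User_1 binding paired with every ?Address_4 binding.
(def unjoined-pairs (* user-count address-count))

;; After the join, each user is paired only with its own address.
(def joined-pairs user-count)

(println unjoined-pairs) ; prints 4000000
(println joined-pairs)   ; prints 2000
```

If that assumption holds, the intermediate result would be 2000 times larger than the joined one, which might explain the memory use. I have not confirmed that this is what actually happens inside the query engine.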
The question: am I doing something wrong in the original query? Can this issue be solved while still getting the result from a single query?