Does not-join support working with collection variables?

Hi, when I execute queries that compare against the collection variable inside not-join then the result looks like not-join is skipped from processing. The next steps will help to reproduce the case I mean

(require '[datomic.api :as d])
(d/create-database "datomic:dev://localhost:4334/not-join?password=datomic")
(def conn (d/connect "datomic:dev://localhost:4334/not-join?password=datomic"))

(def user-schema [
                   {
                       :db/ident :User/name
                       :db/valueType :db.type/string
                       :db/cardinality :db.cardinality/one
                       :db/unique :db.unique/value
                   }
                   ])
(d/transact conn user-schema)

(def user-data [
  { :User/name "User_1" }
  { :User/name "User_2" }
  { :User/name "User_3" }
])
(d/transact conn user-data)

Query 1
Firstly I want to show that the collection variable is treated correctly.
The query imitates SQL query:
name IN (“User_1”, “User_2”) or name = “User_1” OR name = “User_2”

(d/q '[:find
 ?user ?name
 :in
 $
 [?user_names ...]
 :where
 [?user :User/name ?user_names]
 [?user :User/name ?name]
]
(d/db conn) ["User_1" "User_2"])

#Result is: #{[17592186045418 "User_1"] [17592186045419 "User_2"]}

Query 2
What is equal to Query 1 but without using input variables.
If would want to use a filter on user’s names without passing an input variable then it could look like the result is like from the query above.

(d/q '[:find
 ?user ?name
 :in
 $
 :where
 (or-join [?user]
	 [?user :User/name "User_1"]
	 [?user :User/name "User_2"]
 )
 [?user :User/name ?name]
]
(d/db conn) ["User_1" "User_2"])

#Result is: #{[17592186045418 "User_1"] [17592186045419 "User_2"]}

Query 3
Inversion to Query 1 but for some reason, the result is not inversed.
To inverse the result, I used not-join which corresponds to the SQL queries
name NOT IN (“User_1”, “User_2”) or name != “User_1” AND name != “User_2”
But you will see that the result still contains the entities from Query 1.

(d/q '[:find
 ?user ?name
 :in
 $
 [?user_names ...]
 :where
 (not-join [?user ?user_names]
	[?user :User/name ?user_names]
 )
 [?user :User/name ?name]
]
(d/db conn) ["User_1" "User_2"])

# Result is: #{[17592186045420 "User_3"] [17592186045419 "User_2"] [17592186045418 "User_1"]}

Query 4
Inversion to Query 2 without using input variables.
And we see it works as expected but its analog Query 3 does not do the same

(d/q '[:find
 ?user ?name
 :in
 $
 [?user_names ...]
 :where
 (not-join [?user]
	 (or-join [?user]
		 [?user :User/name "User_1"]
		 [?user :User/name "User_2"]
	 )
 )
 [?user :User/name ?name]
]
(d/db conn) ["User_1" "User_2"])

# The result is: #{[17592186045420 "User_3"]}

Query 5
I noticed an interesting behavior if ?user_names contains only one element then it works correctly

(d/q '[:find
 ?user ?name
 :in
 $
 [?user_names ...]
 :where
 (not-join [?user ?user_names]
	[?user :User/name ?user_names]
 )
 [?user :User/name ?name]
]
(d/db conn) ["User_1"])

# Result is: #{[17592186045420 "User_3"] [17592186045419 "User_2"]}

Datomic on-prem: 1.0.6397

P.S. @jaret Might it be related to Not possible to pass more then one vector variable to or-join ?

Any help please

The way “not” works is it discards a row if the clauses in the not clause unify with anything (ie have any results). In this case, you destructured the the user names outside the not, so both user_names are tested against each name; since at least one of them matches your data, the not doesn’t discard anything.

To get the behavior you expect, you want to say that none of the names unify, which you can do like this:

:in $ ?user_names ;; don’t destructure!
:where
[?user :user/name ?name]
(not [(identity ?user_names) [?name …]])

Hi @favila, thanks for your answer, but I want to clarify:

since at least one of them matches your data, the not doesn’t discard anything

maybe you meant:

since at least one of them DOES NOT match your data, the not doesn’t discard anything

No, I meant what I said, but didn’t express myself well. Let me unpack it.

;; Setup
(def user-data
  [[1 :User/name "User_1"]
   [2 :User/name "User_2"]
   [3 :User/name "User_3"]])

(def user-names ["User_1" "User_2"])

This is the full cross-product of all your variables.

(To make it less confusing, I renamed your variable ?user_names to ?user_name, because it’s actually only one name, as you can see in the results.)

(d/q '[:find ?user ?name ?user_name
       :in $ [?user_name ...]
       :where
       [?user :User/name ?name]]
     user-data user-names)

; ?user ?name ?user_name
#{[1 "User_1" "User_1"]
  [1 "User_1" "User_2"]
  [2 "User_2" "User_1"]
  [2 "User_2" "User_2"]
  [3 "User_3" "User_1"]
  [3 "User_3" "User_2"]}

Any “not” is going to give us some subset of these rows, never add more rows.

Now lets add in your not-join.
The body of the not join is going to discard rows where the user has a :User/name matching ?user_name.

(d/q '[:find ?user ?name ?user_name
       :in $ [?user_name ...]
       :where
       [?user :User/name ?name]
       (not-join [?user ?user_name]
                 [?user :User/name ?user_name])]
     user-data user-names)

; ?user ?name ?user_name
#{
  ; [1 "User_1" "User_1"] ;; not-join body matched, so row discarded.
  [1 "User_1" "User_2"]
  [2 "User_2" "User_1"]
  ; [2 "User_2" "User_2"] ;; not-join body matched, so row discarded.
  [3 "User_3" "User_1"]
  [3 "User_3" "User_2"]}

You’ll note this doesn’t discard rows where ?name and ?user_name did not match, which is one row involving user 1, one row involving user 2, and both rows involving user 3.

When we ask for only ?user ?name, it’s just going to collapse the results to a set:

(d/q '[:find ?user ?name ?user_name
       :in $ [?user_name ...]
       :where
       [?user :User/name ?name]
       (not-join [?user ?user_name]
                 [?user :User/name ?user_name])]
     user-data user-names)

; ?user ?name
#{[1 "User_1"]
  [2 "User_2"]
  [3 "User_3"]}

These are the results you said you didn’t expect for your query 3; hopefully now you see why they make sense.

Your intuition here is incorrect:

To inverse the result, I used not-join which corresponds to the SQL queries
name NOT IN (“User_1”, “User_2”) or name != “User_1” AND name != “User_2”

It’s actually the same as sql NOT (name = "User_1" OR name = "User_2"), note “OR” not “AND”.

Suppose :User/name were cardinality-many and you add a fourth user that has both User_1 and User_2 as names; then it wouldn’t be in the result of the same query:

(d/q '[:find ?user ?name
       :in $ [?user_name ...]
       :where
       [?user :User/name ?name]
       (not-join [?user ?user_name]
                 [?user :User/name ?user_name])]
     user-data (conj user-names [4 :User/name "User_1"] [4 :User/name "User_2"]))


#{[1 "User_1"] [2 "User_2"] [3 "User_3"]}

Destructuring user_names inside the not-join works because then for a given row it is trying to unify all names against that row and discarding the row if anything unifies.
Lets see what that looks like.

First, the full result set:

(d/q '[:find ?user ?name ?user_names
       :in $ ?user_names ;; no destructuring
       :where
       [?user :User/name ?name]]
     user-data user-names)

#{[1 "User_1" ["User_1" "User_2"]]
  [2 "User_2" ["User_1" "User_2"]]
  [3 "User_3" ["User_1" "User_2"]]}

Then lets add our not

(d/q '[:find ?user ?name ?user_names
       :in $ ?user_names
       :where
       [?user :User/name ?name]
       (not-join [?user ?user_names] ; or just `not` is fine here
         [(identity ?user_names) [?name ...]])]
     user-data user-names)

#{[3 "User_3" ["User_1" "User_2"]]}

Then lets select only the columns we want

(d/q '[:find ?user ?name
       :in $ ?user_names
       :where
       [?user :User/name ?name]
       (not [(identity ?user_names) [?name ...]])]
     user-data user-names)
=> #{[3 "User_3"]}

Hopefully it is clearer now why you didn’t get the results you expected for query 3.

As a bonus, note you can also do this, and in my experience it’s sometimes faster when when user-names is large and there’s no way to use an index.

(d/q '[:find ?user ?name
       :in $ ?user_names_set
       :where
       [?user :User/name ?name]
       (not [(contains? ?user_names_set ?name)])]
     user-data (set user-names))
=> #{[3 "User_3"]}
1 Like

Hi @favila. I appreciate the help you provide me. I need time to investigate your answer to understand your example better, some things in your solution are new to me

Hi @favila . I want to say thank you again, it is the second time I have used your answer as a good explanation to solve a new issue I am working on fixing it now