What are the benefits and drawbacks of using the :db.type/uri value type?
Would you store email addresses as URIs (like mailto:x@y.z)?
I can see the following pros:
validation and encoding of the characters allowed in each part of a URI (though I’m not sure what a good realistic example of this would be)
and the following drawbacks:
lack of an official literal syntax, so URI values don’t appear as “just data”
getting the “meat” of an email address (the bare address, which is what most external systems require) would take something like (.getSchemeSpecificPart (URI. "mailto:x@y.z"))
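As a possible realistic example of the validation and encoding point, here is a minimal sketch using only java.net.URI. The address with a space in it is made up for illustration. The single-argument constructor rejects characters that are not legal in a URI, while the multi-argument constructor percent-encodes them.

```clojure
(import 'java.net.URI)

;; Parsing a string is strict: a raw space is not legal in a URI,
;; so the single-argument constructor throws URISyntaxException.
(def rejected?
  (try
    (URI. "mailto:first last@example.com")
    false
    (catch java.net.URISyntaxException _ true)))

;; The (scheme, scheme-specific-part, fragment) constructor quotes
;; illegal characters instead of rejecting them.
(def encoded (URI. "mailto" "first last@example.com" nil))

(str encoded)                        ; => "mailto:first%20last@example.com"
(.getRawSchemeSpecificPart encoded)  ; => "first%20last@example.com"
(.getSchemeSpecificPart encoded)     ; => "first last@example.com" (decoded)
```

The decoded scheme-specific part is the value most external systems would want, so the extra accessor call is needed at every boundary where the bare address leaves the application.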
Are there any runtime costs or savings using URIs in queries?
I tried and I can define a URI attribute as :db.unique/identity:
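For readers following along, an attribute definition of that shape might look roughly like the sketch below. The attribute name :user/email-uri is hypothetical and is not the schema from the original post.

```clojure
;; Hypothetical schema: a URI-valued attribute usable as an identity.
(def email-uri-schema
  [{:db/ident       :user/email-uri        ; assumed name, for illustration only
    :db/valueType   :db.type/uri
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity}])
```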
I think you have laid out the tradeoffs correctly.
Internally, Datomic (really Fressian) stores a URI as a tagged value: the “uri” tag plus the string representation of the URI. So it is only a few bytes larger than the string itself.
On the Java heap, a URI object has many more string fields (one for each URI part plus one for the whole URI), so it definitely has more memory overhead than a raw string, but this probably doesn’t matter in practice.
I think this comes down to which of three approaches you want: an actual URI type flowing through your whole application stack (which may mean adding a tag reader and printer for URIs, plus edn/transit/nippy/whatever handlers); validating and encoding at the edges of the Clojure process and keeping the value a URI internally; or not caring about URIs at all and keeping everything “stringly-typed”.
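To give a feel for the first option, here is a minimal sketch of a printer and an edn tag reader for URIs. The tag name #example/uri is an assumption made up for this example; there is no official URI literal, so any tag you pick is a private convention that every reader of the data has to know about.

```clojure
(require '[clojure.edn :as edn])
(import 'java.net.URI 'java.io.Writer)

;; Print URIs as a tagged string literal instead of an opaque object.
(defmethod print-method URI [^URI uri ^Writer w]
  (.write w "#example/uri ")
  (print-method (str uri) w))

;; Teach the edn reader to turn the tag back into a java.net.URI.
(def edn-opts {:readers {'example/uri #(URI. ^String %)}})

(def printed (pr-str {:contact (URI. "mailto:x@y.z")}))
;; printed => "{:contact #example/uri \"mailto:x@y.z\"}"

(def round-tripped (edn/read-string edn-opts printed))
(= round-tripped {:contact (URI. "mailto:x@y.z")}) ; => true
```

Transit, nippy and similar libraries would each need their own equivalent handler, which is the ongoing cost of carrying a real URI type end to end.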
You can use attribute predicates to validate URIs while still storing them as strings. URI validation IME always ends up being use-case-specific, because it’s common not to follow the spec exactly, or you only want some subset of legal URIs (e.g. specific schemes, length limits, valid email domains, rejecting easy-to-abuse characters, etc).
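A minimal sketch of that approach follows. The namespace, the predicate name, the attribute name and the specific rules (mailto scheme only, a 254-character cap, an "@" in the address) are all assumptions chosen for illustration; real rules would depend on your use case. The predicate leans on java.net.URI for syntax checking but the stored value stays a string.

```clojure
(ns example.preds
  (:require [clojure.string :as str])
  (:import (java.net URI URISyntaxException)))

(defn mailto-uri-string?
  "True if s parses as a URI and meets this example's made-up rules."
  [s]
  (try
    (let [uri (URI. s)]
      (boolean
        (and (= "mailto" (.getScheme uri))          ; only one scheme allowed
             (<= (count s) 254)                     ; arbitrary length limit
             (str/includes? (.getSchemeSpecificPart uri) "@"))))
    (catch URISyntaxException _ false)))

;; Hypothetical string attribute guarded by the predicate via :db.attr/preds.
;; The predicate's namespace must be on the transactor/peer classpath.
(def email-schema
  [{:db/ident       :user/email
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db.attr/preds  'example.preds/mailto-uri-string?}])
```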