Attribute naming conventions

tomc · February 11, 2019, 3:15pm

There doesn’t seem to be much available online regarding datomic attribute naming conventions. I started thinking about this because I realized I was using - and . inconsistently in my attribute names. I’m hoping to find out what people think is the best convention.

I find that in real-world databases you end up with more nested names than those found in the example schemas. For example, in an application for making surveys, we have this attribute: :question.contact-info.channel/validation. “Contact Info” is a type of question that allows specifying different contact channels (like email, phone, etc.) and input validation for each channel. So this naming convention is basically class.sub-class.inner-class/attribute. I’m using “subclass” and “inner class” to mean roughly what they do in Java. This convention is imperfect because the separator for subclasses is the same as the one for inner classes. The ambiguity can be removed by introducing a different separator character for subclasses: class#sub-class.inner-class/attribute.

To more thoroughly specify this convention:

“-” is a word separator, as in Clojure var names.
“.” indicates nesting (“inner classes”).
“#” indicates subtypes.
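
As a rough illustration of the three rules above, here is a minimal Clojure sketch that splits a name written in this convention back into its parts. The helper name parse-attr and the shape of the map it returns are hypothetical, and whether Datomic accepts “#” inside an ident is an assumption that would need checking before adopting the convention.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: splits an attribute keyword according to the
;; proposed convention ("." = nesting, "#" = subtype, "-" = word separator).
(defn parse-attr
  [attr]
  {:path      (mapv (fn [segment]
                      ;; The first piece is the class; any further pieces are subtypes.
                      (let [[base & subtypes] (str/split segment #"#")]
                        {:type base :subtypes (vec subtypes)}))
                    ;; Nesting levels are separated by "." in the namespace.
                    (str/split (namespace attr) #"\."))
   :attribute (name attr)})

;; The keyword is built with the keyword function so the example does not
;; depend on how the reader treats "#" inside a keyword literal.
(parse-attr (keyword "question#contact-info.channel" "validation"))
;; => {:path [{:type "question", :subtypes ["contact-info"]}
;;            {:type "channel", :subtypes []}]
;;     :attribute "validation"}
```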

I haven’t actually started using this convention yet, but it seems like an improvement over the ad hoc approach I’ve been using. Thoughts?

dustingetz · February 11, 2019, 4:05pm

Hi Tom
I’ve been thinking a lot about this recently. I think about this very differently than you so maybe you will find it interesting.

My current reasoning is:

meaning is given by names, not namespaces
namespaces are for solving name collisions
namespaces are not for classifying things into hierarchies

So is :question.contact-info.channel/validation really any more meaningful than :surveymonkey.survey/contact-channel-validation? (Or something)

The clojure.core namespace is a great example. We have clojure.core/assoc, clojure.core/cond, clojure.core/defonce, etc. If we were to classify these things, they would be at very different places in the taxonomy tree. But as a shared vocabulary, they don’t collide and we just refer to them as “clojure.core” and it all works out.

The datomic.api namespace is another example: d/q, d/transact, d/datoms, and d/remove-tx-report-queue all share a namespace.

Much more important, I think, is a name’s provenance: its origin, which project made it first, which team maintains it. But it’s okay to use an attribute in many projects or many contexts, without aliasing it into some new naming convention.

Even if we limit to one or two segment depth (say :organization.provenance for large multi-org projects that process data from many APIs) there is an infinite space of namespace names to choose from.

tomc · February 11, 2019, 4:34pm

I haven’t thought about datomic attributes in this way because I’ve never needed to combine my db with another one; naming collisions haven’t been a problem for me yet. If we assume your three bullet points, the part where a convention is helpful moves to the right of the /. Instead of :question#contact-info.channel/validation, the attribute could be something like :my.org/question#contact-info.channel.validation. In terms of conveying information, that’s better because now the attribute tells us where it comes from, but in terms of ergonomics it’s worse because the name is huge.

dustingetz · February 11, 2019, 6:35pm

Well for example :db.cardinality/one could flatten to just :db/one, without (in my opinion) confusing anyone. Eliding the hierarchy is not going to cause anybody to accidentally write {:db/unique :db/one}. And you can still validate it with a spec.
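
A minimal sketch of the spec point, assuming clojure.spec.alpha is available. The flattened values :db/one, :db/many, :db/identity and :db/value are hypothetical (Datomic itself uses :db.cardinality/one, :db.unique/identity, and so on), and :example/attribute is a made-up spec name.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical flattened enum values, one set per schema key.
(s/def :db/cardinality #{:db/one :db/many})
(s/def :db/unique #{:db/identity :db/value})

;; s/keys checks each listed key against the spec registered under its name.
(s/def :example/attribute (s/keys :opt [:db/cardinality :db/unique]))

(s/valid? :example/attribute {:db/cardinality :db/one}) ; => true
(s/valid? :example/attribute {:db/unique :db/one})      ; => false
```

The second call fails because :db/one is not in the set registered for :db/unique, so the mistake is caught even though the value name no longer spells out its hierarchy.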

tomc · February 13, 2019, 3:21pm

Shortening :db.cardinality/one to :db/one might not confuse people, but it does leave more room for future naming conflicts and it does convey less information. My goals with defining a naming convention are to prevent naming conflicts, have an obvious route from concept to keyword, and make it obvious later what existing names mean and where they should be used. I’m less concerned with where the convention lives (in the namespace, name, or both) and more concerned with achieving those goals.

How do you use attribute namespaces in your Datomic applications? For hyperfiddle, are they all just :hyperfiddle/whatever-attribute? Do you put any hierarchy information into the attribute name? If not, how do you avoid conflicts and how do you make sure the meaning of a name is obvious?

dustingetz · February 14, 2019, 12:13am

We always write docstrings which helps document the intended semantics of shortnames of the form :db/one.

I have no good answer for picking good names.

Our core schema is pretty small but we kind of screwed this up. We have a lot of :hyperfiddle.x.y.z/foo and it is hard to remember. Also things get refactored and then namespaces stop making sense. I didn’t know about :db/ident aliases until recently so that will help.
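
For reference, a sketch of what such a rename looks like as Datomic transaction data, following the form used in Datomic’s schema-change documentation. :hyperfiddle.x.y.z/foo is the placeholder name from the paragraph above and :hyperfiddle/foo is a hypothetical shorter replacement. The documentation describes the old ident as continuing to refer to the same attribute after the rename, which is what makes the old name behave like an alias; treat that as something to confirm against your Datomic version.

```clojure
;; Transaction data only: identify the attribute by its current ident
;; and assert the new :db/ident for it.
[{:db/id    :hyperfiddle.x.y.z/foo
  :db/ident :hyperfiddle/foo}]
```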

In the future I’d like to move towards

:hyperfiddle/ core things
:hfnet/ cloud only services
:hfinc/ business data

Like, why is :hfnet.user/primary-email better than :hfnet/primary-email? If it has a primary-email it’s gotta be an upsert to a user, right? No clear answer here.

A pro of the tradeoff is we often use the same attribute in many “classes”, e.g.

:hyperfiddle/owner
:hyperfiddle/markdown

We specialize things like :hyperfiddle/fiddle-ident (:db.unique/identity) (over just :db/ident) because it takes a role similar to “entity types”.

This is all just my opinion and definitely out of sync with examples I’ve seen from Cognitect.

benfle · February 14, 2019, 11:31pm

I would be careful not to map a Datomic schema too directly onto the idea of a class (type). One of the interesting aspects of the Datomic information model is that it lets you think in terms of composable sets of attributes rather than types. An entity is not restricted to one type but can have attributes from different sets. One example is the Codeq schema: https://github.com/downloads/Datomic/codeq/codeq.pdf See how the namespaces group a set of related attributes together and how entities mix those attributes from different namespaces?

dustingetz · February 20, 2019, 9:02pm

Here is a log from #datomic this morning

pvillegas12: Looking at schema design https://docs.datomic.com/cloud/best.html#group-related-attributes, it looks like the best practice is to namespace attributes. However, is it not better to leverage the universal schema (especially with spec in mind)? Why would you prefer :release/name, :artist/name, and :movie/name over something generic like :model/name, where you would define one spec instead of 3 different ones? Want to hear the tradeoffs between both approaches: (1) namespace all attributes pertaining to an entity in your business domain (explosion of attributes), or (2) leverage the universal schema to have shared attributes across your entities.

Bart: I don’t know spec yet but I do try to use generic attributes where I can, but curious how would you use spec here?
Would spec check the entity you are referring to? Like, this model is a release, an artist, etc.

dustingetz: :facebook/person-name has different validation rules than :linkedin/person-name. If they have the same semantics and validation rules, then use the same attribute. That way they can also share code that implements some semantics. Another example is :commonmark/markdown vs :github/markdown

dustingetz: I think attributes that originate from the same system (which is pretty much most attributes) generally benefit from sharing semantics. For example, :amazon/product-title is probably useful on both books and electronics, but :amazon/isbn is not. If you’re just tagging stuff with a string for display to the user, I see benefit in having the same attribute, and pointless cost in splitting it. In the future you can and will change this stuff anyway.

pvillegas12: A downside I see from going universal schema with all attributes is as follows: I created a company entity which had address information and business information. All of these attributes are shared so they don’t have the :company/ prefix. When I’m going to query for all companies in datalog I have no way of specifying an attribute which is exclusively company based.
Another area of interest is that of refs. If I have an attribute that references let’s say an invoice, why would I not always want an :invoiceable/invoice attribute vs a :company/invoice or :inventory/invoice?

benoit: @pvillegas12 It’s a hard question and very much related to what computation your system makes, in my opinion. It’s about the abstractions in your system and how you decompose the computation. Choosing whether you share a set of attributes across entity types is a bit like choosing to define a Clojure protocol that will be implemented by different types. To take the invoice example, there is no reason to have multiple attributes like :company/invoice, :person/invoice … Or if you have the inverse relationship, you would have :invoice/company and :invoice/person. But the recipient of an invoice should be abstracted (maybe with something like :invoice/recipient) and then you can refer to any type of entity with it.
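
A minimal sketch of the abstracted attribute suggested above, written as Datomic schema transaction data. The cardinality and the docstring are assumptions made for illustration; only the name :invoice/recipient comes from the message.

```clojure
;; One ref attribute on the invoice replaces :invoice/company,
;; :invoice/person, and so on. A ref can point at any entity.
[{:db/ident       :invoice/recipient
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one ; assumed: one recipient per invoice
  :db/doc         "The entity (company, person, ...) this invoice is addressed to."}]
```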

ro6: I thought about this tension a lot when I came on board with Datomic and Spec. I think the wise choice is to use small, specific, sufficiently namespaced names for stuff in durable storage. There are mechanisms available for adding and changing abstractions over time, but un-abstracting (ie separating things you once tried to treat as functionally equal) is much harder, especially when your abstractions are reified in the data model. I think this topic deserves a book. The stuff that helped me sort it out was Zach Tellman’s writing about abstraction in “Elements of Clojure” and “Data and Reality” by William Kent. Also, Rich’s talks about Spec and namespaces make it pretty clear that the idea is to nail down enduring semantics and meaning at the attribute level. By being overly broad, something like :entity/name actually gives you less useful information.

benoit: I think it helps to think in terms of relationships and protocols rather than types.

ncg: I would also recommend staying specific. Rules can help with implementing something like a generic notion of user, sourced from various specific user attributes in different namespaces.
Attributes are the granularity at which you can define all your important semantics. And as ro6 says, generalizing later is easier than specializing later (an advantage of the universal relation over tables in the first place).
(there are also indexing benefits to using specific namespaces, but that’s an implementation detail)
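
A minimal sketch of the rules idea, using the two specific attributes mentioned earlier in this log (:facebook/person-name and :linkedin/person-name). The rule name user-name and the var names are hypothetical, and the final call assumes datomic.api is required as d and that db is a database value.

```clojure
;; Rule set: an entity has a user-name if it carries either specific attribute.
;; Two rules with the same head act as a logical OR.
(def user-rules
  '[[(user-name ?e ?name) [?e :facebook/person-name ?name]]
    [(user-name ?e ?name) [?e :linkedin/person-name ?name]]])

;; Query that relies only on the generic rule; % binds the rule set.
(def users-query
  '[:find ?e ?name
    :in $ %
    :where (user-name ?e ?name)])

;; Usage (not run here):
;; (d/q users-query db user-rules)
```

The stored attributes stay specific, and the generic notion lives in the rule, so it can be changed later without touching the data.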

dustingetz: @pvillegas12 a company-specific entity needs only one company-specific attribute, it doesn’t have to be all of them

pvillegas12: @dustingetz in this case all attributes are shared!

lilactown: some things might have similar attributes but semantically they are different

benoit: I don’t think it makes sense to go all shared or all type-specific. If you identified abstractions in your system then share attributes for those. If you haven’t then don’t share attributes. If you have 10 entity types sharing all the same attributes but you still need to be able to distinguish each type, nothing prevents you from having an attribute like :entity/type to indicate the type. But most often some entity types will be involved in relationships and others won’t, so you often have different attribute namespaces.

lilactown: e.g. I would differentiate between a :person/name and a :company/name, because they are semantically different things

dustingetz: Consider also :company/address vs :natzip4/address (http://www.zipinfo.com/products/natzip4/natzip4.htm) – natzip4 is far more semantic and yet flexible enough to be decoupled from company

benoit: @lilactown It all depends on your system. If you manage invoices and you don’t care whether your invoice is for a person or a company then they might share the same attribute. It is hard to have discussions like this without actual system requirements.

lilactown: right. I think a safe default is to be as specific as possible
it’s easy to assoc a new attribute to an entity that’s more general. it’s harder to sort your data and make it more specific after the fact

benoit: I hear that advice often but that’s not my experience. I have seen more systems fail because of an explosion of complexity due to special cases than because of a bad design. I still think it’s better to have a bad plan than no plan at all.

benoit: That said I have also seen systems where a rigid type system was put on top of datomic and prevented this kind of mix and match of namespaced attributes. Each entity could be of only one type. That made me sad. So you can definitely shoot yourself completely in the foot with bad abstractions. Especially if you artificially restrict power for no good reasons.

tomc · February 26, 2019, 10:55pm

Thanks for sharing that log, Dustin.

I lean more toward the “more specific” side than the “more general” side with my attribute naming choices, though I do have plenty of attributes in an entity namespace meant for use on any entity.

Placing a rigid type system on top of Datomic is very different from considering the type of entity an attribute will generally be used with and including that information in the attribute’s name - this is supported by the best practices post about grouping related attributes. And if there comes a time that an attribute name indicates a narrower scope than its actual usage, renaming to something more general is easy, as mentioned in the chat log. At this point I still think my imperfect convention suggested in the first post is better than no convention at all. I’ll report back later if my thinking evolves on this.

dustingetz · February 27, 2019, 1:37am

Yeah I also thought the idea of narrow gradually becoming more general was compelling.

dustingetz · March 1, 2019, 5:40pm

This is an interesting example from RDF: http://xmlns.com/foaf/spec/

So one thing that I see is that foaf:topic_interest has a docstring that says “Domain: having this property implies being a foaf:Agent” whereas foaf:name says “Domain: having this property implies being a owl:Thing” (making the attribute applicable anywhere). Not sure of the implications of inheritance here or how important it is. Clojure has (derive ::Cat ::Feline) but I have never seen that in the wild.
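
For anyone who has not used it, a minimal sketch of what derive does with namespaced keywords. This only shows the Clojure mechanism; it is a loose analogy to the FOAF domain statements, not a claim that Datomic attributes use it.

```clojure
;; Declare that ::Cat is a kind of ::Feline in the global hierarchy.
(derive ::Cat ::Feline)

(isa? ::Cat ::Feline) ; => true
(isa? ::Feline ::Cat) ; => false, the relationship is one-directional
(parents ::Cat)       ; => a set containing only ::Feline
```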
