Here is a log from #datomic this morning
pvillegas12: Looking at schema design https://docs.datomic.com/cloud/best.html#group-related-attributes, it looks like the best practice is to namespace attributes. However, is it not better to leverage the universal schema (especially with spec in mind)? Why would you prefer :release/name, :artist/name, and :movie/name over something generic like :model/name, where you would define one spec instead of 3 different ones? I want to hear the tradeoffs between both approaches: (1) namespace all attributes pertaining to an entity in your business domain (explosion of attributes), or (2) leverage the universal schema to have shared attributes across your entities.
Bart: I don’t know spec yet, but I do try to use generic attributes where I can. Curious, though: how would you use spec here?
Would spec check which entity you are referring to? Like, this model is a release, an artist, etc.?
dustingetz: :facebook/person-name has different validation rules than :linkedin/person-name. If they have the same semantics and validation rules, then use the same attribute. That way they can also share code that implements some semantics. Another example is :commonmark/markdown vs :github/markdown
dustingetz: I think attributes that originate from the same system (which is pretty much most attributes) generally benefit from sharing semantics. For example, :amazon/product-title is probably useful on both books and electronics, but :amazon/isbn is not. If you’re just tagging stuff with a string for display to the user, I see benefit in having the same attribute, and pointless cost in splitting it. In the future you can and will change this stuff anyway.
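To make the spec angle concrete, here is a minimal sketch of one spec per attribute being reused across entity shapes. The validation rules (non-blank title, 10- or 13-digit ISBN) and the :example/book and :example/electronics names are assumptions made up for illustration; they are not from the discussion.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Assumed rule: a product title is any non-blank string.
(s/def :amazon/product-title (s/and string? (complement str/blank?)))

;; Assumed rule: an ISBN is a string of exactly 10 or 13 digits.
(s/def :amazon/isbn (s/and string? #(re-matches #"\d{10}|\d{13}" %)))

;; Hypothetical entity shapes: both reuse the title spec,
;; only books require an ISBN.
(s/def :example/book (s/keys :req [:amazon/product-title :amazon/isbn]))
(s/def :example/electronics (s/keys :req [:amazon/product-title]))

(s/valid? :example/book
          {:amazon/product-title "SICP" :amazon/isbn "9780262510871"}) ;; => true
(s/valid? :example/electronics
          {:amazon/product-title "USB cable"})                         ;; => true
(s/valid? :example/book
          {:amazon/product-title "USB cable"})                         ;; => false, no ISBN
```

The spec lives on the attribute, so sharing the attribute means sharing the validation; two attributes with different rules would need two `s/def`s.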
pvillegas12: A downside I see from going universal schema with all attributes is as follows: I created a company entity which had address information and business information. All of these attributes are shared so they don’t have the :company/ prefix. When I’m going to query for all companies in datalog I have no way of specifying an attribute which is exclusively company based.
Another area of interest is that of refs. If I have an attribute that references let’s say an invoice, why would I not always want an :invoiceable/invoice attribute vs a :company/invoice or :inventory/invoice?
benoit: @pvillegas12 It’s a hard question and, in my opinion, very much related to what computation your system performs. It’s about the abstractions in your system and how you decompose the computation. Choosing whether you share a set of attributes across entity types is a bit like choosing to define a Clojure protocol that will be implemented by different types. To take the invoice example, there is no reason to have multiple attributes like :company/invoice, :person/invoice… Or, if you have the inverse relationship, you would have :invoice/company and :invoice/person. But the recipient of an invoice should be abstracted (maybe with something like :invoice/recipient) and then you can refer to any type of entity with it.
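A rough sketch of what a single abstracted ref attribute could look like as Datomic schema and transaction data. The :invoice/number, :company/name, and :person/name attributes and the sample values are assumed for illustration, and their own schema is omitted.

```clojure
;; Schema sketch: one ref attribute for whoever receives the invoice.
[{:db/ident       :invoice/recipient
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one
  :db/doc         "The entity being invoiced (a company, a person, ...)"}]

;; Transaction data sketch: the same attribute points at either kind
;; of entity. "acme" and "jane" are string tempids.
[{:db/id "acme" :company/name "Acme"}
 {:db/id "jane" :person/name "Jane"}
 {:invoice/number "INV-1" :invoice/recipient "acme"}
 {:invoice/number "INV-2" :invoice/recipient "jane"}]
```

Because a ref does not constrain the type of its target, nothing in the schema has to change when a new kind of recipient appears.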
ro6: I thought about this tension a lot when I came on board with Datomic and Spec. I think the wise choice is to use small, specific, sufficiently namespaced names for stuff in durable storage. There are mechanisms available for adding and changing abstractions over time, but un-abstracting (i.e. separating things you once tried to treat as functionally equal) is much harder, especially when your abstractions are reified in the data model. I think this topic deserves a book. The stuff that helped me sort it out was Zach Tellman’s writing about abstraction in “Elements of Clojure” and “Data and Reality” by William Kent. Also, Rich’s talks about Spec and namespaces make it pretty clear that the idea is to nail down enduring semantics and meaning at the attribute level. By being overly broad, something like :entity/name actually gives you less useful information.
benoit: I think it helps to think in terms of relationships and protocols rather than types.
ncg: I would also recommend staying specific. Rules can help with implementing something like a generic notion of user, sourced from various specific user attributes in different namespaces.
Attributes are the granularity at which you can define all your important semantics. And as ro6 says, generalizing later is easier than specializing later (an advantage of the universal relation over tables in the first place).
(there are also indexing benefits to using specific namespaces, but that’s an implementation detail)
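As a sketch of the rules idea, a Datalog rule can present several specific attributes as one generic notion at query time. The rule name `person-name` is made up here; the two attributes are borrowed from the earlier :facebook/ and :linkedin/ example.

```clojure
;; Rule set sketch: an entity has a generic "person name" if it
;; carries either of the specific name attributes. Two rule heads
;; with the same name act as a logical OR.
[[(person-name ?e ?name) [?e :facebook/person-name ?name]]
 [(person-name ?e ?name) [?e :linkedin/person-name ?name]]]

;; Query sketch: the rule set is passed in as the % input.
[:find ?e ?name
 :in $ %
 :where (person-name ?e ?name)]
```

The stored attributes stay specific; the generalization lives in the rule and can be changed without migrating data.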
dustingetz: @pvillegas12 a company-specific entity needs only one company-specific attribute; it doesn’t have to be all of them
pvillegas12: @dustingetz in this case all attributes are shared!
lilactown: some things might have similar attributes but semantically they are different
benoit: I don’t think it makes sense to go all shared or all type-specific. If you have identified abstractions in your system, then share attributes for those. If you haven’t, then don’t share attributes. If you have 10 entity types sharing all the same attributes but you still need to be able to distinguish each type, nothing prevents you from having an attribute like :entity/type to indicate the type. But most often some entity types will be involved in relationships and others won’t, so you often have different attribute namespaces.
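A small sketch of the :entity/type idea, assuming a keyword-valued attribute and a made-up :company value. It also gives the "find all companies" query something to match on when every other attribute is shared.

```clojure
;; Schema sketch: a marker attribute that names the entity's type.
[{:db/ident       :entity/type
  :db/valueType   :db.type/keyword
  :db/cardinality :db.cardinality/one}]

;; Query sketch: all entities marked as companies.
[:find ?e
 :where [?e :entity/type :company]]
```

Cardinality one is only an assumption here; making it cardinality many would let an entity carry several types at once.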
lilactown: e.g. I would differentiate between a :person/name and a :company/name, because they are semantically different things
dustingetz: Consider also :company/address vs :natzip4/address (http://www.zipinfo.com/products/natzip4/natzip4.htm) – natzip4 is far more semantic and yet flexible enough to be decoupled from company
benoit: @lilactown It all depends on your system. If you manage invoices and you don’t care whether an invoice is for a person or a company, then they might share the same attribute. It is hard to have discussions like this without actual system requirements
lilactown: right. I think a safe default is to be as specific as possible
it’s easy to assoc a new attribute to an entity that’s more general. it’s harder to sort your data and make it more specific after the fact
benoit: I hear that advice often but that’s not my experience. I have seen more systems fail because of an explosion of complexity due to special cases than because of a bad design. I still think it’s better to have a bad plan than no plan at all.
benoit: That said, I have also seen systems where a rigid type system was put on top of Datomic and prevented this kind of mix and match of namespaced attributes. Each entity could be of only one type. That made me sad. So you can definitely shoot yourself in the foot with bad abstractions, especially if you artificially restrict power for no good reason.