Enums vs Keywords

I’m weighing the value of enums vs keywords, and during my search I found a discussion on the previous forum [1] asking the same question. It’s been years since that discussion, so I’m suggesting an updated discussion happen here. Has there been any consensus on this? Would anyone like to take a stab at answering Anton’s last question [2]? I’m leaning towards keywords simply because they let me normalize my data so that moving from one db system to another is easier [3]. However, I’d like to make sure I’m not missing out on tangible benefits.

Any advice would be great.
Thanks.

  1. https://groups.google.com/forum/#!topic/datomic/KI-kbOsQQbU

  2. Are entity ids actually smaller than keywords? Has anyone done any benchmarking of the in-memory footprint or speed?

  3. My app is designed to allow customers to pick their preferred database.


Hi Tim,

Here is how I see it:

  • A keyword is a value;
  • An enum is used to reference a Datomic entity by name, as a semantic identifier.

So the pros and cons are essentially the same as when comparing a value like an integer or a string with a Datomic entity. A value is simple, but can’t be evolved into a complex data structure. A Datomic entity is more complex, but can be evolved later.
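To make the two options concrete, here is a minimal schema sketch. The attribute name :order/status and the status idents are borrowed from later in this thread purely for illustration; they are assumptions, not anyone's actual schema.

```clojure
;; Option 1: the attribute holds the keyword itself as a plain value.
(def keyword-schema
  [{:db/ident       :order/status
    :db/valueType   :db.type/keyword
    :db/cardinality :db.cardinality/one}])

;; Option 2: the attribute holds a ref to an entity named by :db/ident.
(def enum-schema
  [{:db/ident       :order/status
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one}
   ;; Each enumerated value is its own entity and has to be transacted
   ;; before it can be referenced.
   {:db/ident :order.status/ordered}
   {:db/ident :order.status/received}])
```

With option 1 a new status is just a new keyword in the data; with option 2 it is a new entity, which can later carry extra attributes of its own.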

We use enums only to reference a Datomic entity easily from code. We could just as well structure our data a bit more and use proper queries to find those entities (without using the name). The datalog engine is nice because it resolves enums for us, but at the end of the day those are Datomic entities; keywords are not.

Best,
Cam

:db/ident gives you hardcoded constants. The docs have a caveat:

Idents should not be used as unique names or ids on ordinary domain entities. Such entity names should be implemented with a domain-specific attribute that is a unique identity.

I had been using :db/ident for constants available in code, but now I need to rethink whether that is okay given the above.

We almost exclusively use pull now, so the entity syntax sugar (to resolve as a keyword) isn’t in play anymore, which has been for the better for us. But our confusion here may be because we put db/ident on too many userland entities.

Well yes, but an entity id is also a value which is stored when using that enum mechanism.

The docs suggest this is the best way to handle enums:

Many databases need to represent a set of enumerated values. In Datomic, it’s idiomatic to represent enumerated values as entities with a :db/ident attribute.

Yet it may not actually be more efficient to do so, and if it’s not, then we are only increasing complexity:

  • Adding new enums. Using enum aliasing you need to update the schema for each new entry.
  • While you can now plop the enum keyword into a query, you still need special handling to query that value vs. a standard value. And when you resolve ‘ref’ entities (map refs, not enum refs) you need special handling to resolve any contained enum entities before you can make use of them. This can be cumbersome when you write generic handlers.
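As an illustration of the "special handling" in the second point, here is a minimal sketch of a generic post-processing step. It assumes the pull pattern asked for :db/ident on the enum ref (for example {:order/status [:db/ident]}), so enum refs come back as single-key maps; flatten-enums is a hypothetical helper name, not part of Datomic.

```clojure
(require '[clojure.walk :as walk])

(defn flatten-enums
  "Replace every map whose only key is :db/ident with the ident keyword.
   Intended for pull results where enum refs were pulled as [:db/ident]."
  [pulled]
  (walk/postwalk
    (fn [x]
      (if (and (map? x) (= [:db/ident] (keys x)))
        (:db/ident x)
        x))
    pulled))

;; Example with a hand-written map shaped like a pull result:
(flatten-enums {:order/id     1
                :order/status {:db/ident :order.status/ordered}})
;; => {:order/id 1, :order/status :order.status/ordered}
```

With a keyword-typed attribute this step is unnecessary, which is the extra complexity being weighed here.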

So the question at hand is: is it really better? Is storing an entity id in each value position better than storing just a keyword there? If it’s not, then maybe the guides should not position it as idiomatic. It would seem strange to direct every new user to that mechanism if most are not using it for enumerated values.

P.S. Thanks for your response, it’s helping me work through it. For now I’m going to ignore the guide’s suggestion, unless there’s a good enough reason to follow it (which I haven’t found).

If it helps, I don’t think attribute :db/idents are anything special. They just let you write the keyword in your queries, [?e :community/name ?n], instead of the entity id, [?e 63 ?n]. So maybe think of :db/idents as enumerated constants for your queries?

;; $ is bound to a database value
(d/q '[:find (pull ?e [:community/name])
       :where
       [?e :community/type :community.type/wiki]]
     $)
=>
[[#:community{:name "Capitol Hill Community Council"}] ...
;; the attribute itself is just an entity with a :db/ident
(d/touch (d/entity $ :community/name))
=> #:db{:id 63, :ident :community/name, :valueType :db.type/string, :cardinality :db.cardinality/one}

Sorry to resurrect an old thread, but I have found a shortcoming of enumerated types, or else I just can’t find the right trick to get it to work.
In our system we use enumerated types to build state machine statuses, and we like to use :db.fn/cas to properly enforce valid transitions.
Well, this code doesn’t work:

`(d/transact conn [[:db.fn/cas [:order/id 1] :order/status :order.status/ordered :order.status/received]])`

It seems that the next status can be left in keyword form, but the old status must be the actual :db/id of the enumerated type. It’s kind of a bummer to have to read or cache the ref ids once a connection is acquired in order to deal with this kind of idiosyncrasy.

As a workaround, you could write your own cas function on the transactor classpath that coerces the inputs.
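A lighter, client-side variant of the same coercion idea can be sketched as follows. This is only a sketch under assumptions: cas-enum-tx is a hypothetical helper, and the resolver is passed in as a function so that, with the peer library, it could be something like #(d/entid db %).

```clojure
(defn cas-enum-tx
  "Build a :db.fn/cas form, resolving the old enum ident to an entity id.
   resolve-ident is any function from ident keyword to entity id.
   A nil old value is passed through unchanged."
  [resolve-ident eid attr old-ident new-ident]
  [:db.fn/cas eid attr (some-> old-ident resolve-ident) new-ident])

;; Intended usage with a Datomic peer (assumes conn and db are in scope):
;; (d/transact conn
;;   [(cas-enum-tx #(d/entid db %)
;;                 [:order/id 1] :order/status
;;                 :order.status/ordered :order.status/received)])
```

This still needs a database value to resolve against, so it does not remove the asymmetry; it only hides it behind one function.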

Oh, there are plenty of workarounds, but to me this is a bug. Some variant of the following should work:
`(d/transact conn [[:db.fn/cas [:order/id 1] :order/status [:db/ident :order.status/ordered] :order.status/received]])`
A transactor function or calling d/attr before every transact are both somewhat clumsy; I would rather not have this weird asymmetry when dealing with enums.

Thanks for chiming in on such an old issue though ;)