Multi-tenancy & databases


#1

Hi,

Question on multi-tenancy with Datomic: is it viable, and a good idea, to provide a database to each customer of a SaaS app? Or is it better to use partitions? For some specific reasons, we are moving toward the first option. But is it a good direction technically as well as cost-wise?

Any advice is appreciated. Thanks in advance.

-H


#2

Any response? Thanks in advance!


#3

The Datomic transactor is designed and intended to serve a single primary database. It is certainly OK to have one or a few secondary databases for ‘supporting’ use cases, but the indexing behavior of the transactor is based on the majority of the writes being against a single logical db.

You can certainly use partitions to improve the index locality of data on a per-customer (or per-group of customers) basis. This has been discussed previously, in particular using a fixed number of partitions and dividing customer data between them equally (i.e. with a modulo on the # of partitions).
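
As a rough illustration of the modulo scheme mentioned above, here is a minimal sketch in plain Clojure. The partition count of 16, the :customer/part-N naming, and the function name are all assumptions made for this example, not something prescribed by Datomic; the partitions themselves would still have to be installed in the database before use.

```clojure
;; Hypothetical: a fixed number of partitions chosen up front.
(def partition-count 16)

(defn customer-partition
  "Maps a numeric customer id to one of `partition-count` partition idents
   by taking the id modulo the number of partitions."
  [customer-id]
  (keyword "customer" (str "part-" (mod customer-id partition-count))))

;; 42 mod 16 = 10
(customer-partition 42) ; => :customer/part-10
```

With the On-Prem peer API, the resulting keyword would be the partition argument passed to d/tempid when creating entities for that customer, so that one customer's entities stay close together in the indexes.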

If your architecture requirements necessitate multiple logical databases, we suggest running an individual transactor per active database.

-Marshall


#4

Hi Marshall, Thank you for the advice. Happy to see the Datomic Cloud announcement!

We are planning to move to Datomic Cloud. How does this apply to the Cloud model? Is it advisable to create multiple databases (say, 10-20) per Solo Topology subscription (and later, Production)? I.e., (d/create-database client {:db-name "movies-1"}), (d/create-database client {:db-name "movies-2"}), and then (def conn1 (d/connect client {:db-name "movies-1"})) and (def conn2 (d/connect client {:db-name "movies-2"})), and use those connections in the transactions? Or is the only option to use partitions?

Thanks for the response.
-Hari


#5

Hari,

The one-db-per-transactor recommendation specifically applies to Datomic On-Prem.
Datomic Cloud uses a different architecture. Rich described some of those details this morning here: https://www.reddit.com/r/Clojure/comments/7r1748/datomic_cloud/dsx1s9b/

At any point in time one node is preferred to transact (per db, and can differ), and in normal operation all txes for a db will flow to/through that node. If for any reason (e.g. a temporary network partition) that node can’t be reached, any node can and will handle txes. Consistency is ensured by CAS at the DynamoDB level. This situation increases contention for DynamoDB and decreases throughput, so if the condition persists (or in the case where the preferred node disappears) a different node will become preferred. This is all immediate, there are no transfer/recovery intervals etc. Thus it is not like the mastership transfer and failover of Datomic On-Prem (and many other dbs). But neither should it be confused with parallel multi-writer (a la Cassandra).
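
To make the CAS (compare-and-set) idea concrete, here is a minimal sketch in plain Clojure. It is an illustration of optimistic concurrency in general, not Datomic's implementation: a Clojure atom stands in for the storage-level value that DynamoDB would guard, and the map shape and the commit! function are hypothetical.

```clojure
(defn commit!
  "Appends tx on top of the value currently in `root`. The write only
   succeeds if nobody else changed `root` since we read it; otherwise we
   re-read and try again. Returns the new :t."
  [root tx]
  (loop []
    (let [seen     @root
          proposed (-> seen
                       (update :t inc)
                       (update :log conj tx))]
      (if (compare-and-set! root seen proposed)
        (:t proposed)
        ;; Another writer won the race; retry against the fresh value.
        (recur)))))

(def root (atom {:t 0 :log []}))

(commit! root :tx-a) ; => 1
(commit! root :tx-b) ; => 2
```

The retry branch is where the extra contention shows up: when several writers race for the same value, the losers repeat their work, which is consistent with the reduced throughput described above when more than one node handles transactions for a db.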

An additional consequence of this architecture is that if you have a Production system with N nodes, the work of transacting against X databases will be split among those nodes.
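
For a database-per-tenant setup on Cloud, the calls from the question could be wrapped roughly as follows. This is a sketch under stated assumptions: it needs the Datomic client library on the classpath, `client` is an already-configured client created elsewhere with d/client, and the "tenant-" naming scheme and the function names are hypothetical.

```clojure
(require '[datomic.client.api :as d])

(defn tenant-db-name
  "Hypothetical naming scheme: one database per tenant id."
  [tenant-id]
  (str "tenant-" tenant-id))

(defn create-tenant-db!
  "Creates the database for a tenant, e.g. when the tenant signs up."
  [client tenant-id]
  (d/create-database client {:db-name (tenant-db-name tenant-id)}))

(defn tenant-conn
  "Returns a connection to an existing tenant database."
  [client tenant-id]
  (d/connect client {:db-name (tenant-db-name tenant-id)}))

;; Usage, assuming `client` exists:
;; (create-tenant-db! client "acme")
;; (d/transact (tenant-conn client "acme") {:tx-data [...]})
```

Keeping the name derivation in one function means the rest of the application only ever deals with tenant ids, and the mapping to databases can change in one place.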

Edit: We’ve now added this information to the Datomic Cloud docs: https://docs.datomic.com/cloud/operation/ha.html#how-ha-works


#6

Hi Marshall,

Thank you. That was helpful. My take on the last line “An additional consequence…” is that with a production/solo system, creating multiple databases from an app (on EC2), and using them with 1 or N nodes is okay technically. Obviously it might be limited by the cost ($) of the resources, right? Going forward, I can create any number of databases, and the scaling of the storage is carried out through DynamoDB. But the scaling of the computing side is done through addition of Nodes into the system, right?

Regarding ACID: is there no concept of a transactor in the Cloud? Or are the Node and the transactor synonymous, or does the transactor run on a Node?

Just trying to remove the confusion in this area. Thanks again for the response. Looking forward to being on the Cloud!

Thanks
-Hari


#7

Hi Hari,

You’re correct in your assessment with regard to Production - adding more instances will scale compute of the system.

The Solo topology is limited to a single t2.small node, so you can’t scale it by adding instances.

As described in that section of the documentation, nodes in the transacting group are all capable of handling writes (transactions), and the ACID guarantees of Datomic are all fully preserved. We will be releasing Query Groups in the near term, which will allow you to have instances that serve read load only.

-M


#8

Hi Marshall, Thank you!

-Hari