Large Number of Databases

I’ve read the Multi-tenancy & databases thread, but it didn’t quite answer my specific question.

I’ve got an application that needs to scale to millions of users. Each user will create one or more “environments”, each of which could be encapsulated cleanly in a stand-alone database whose overall size and write load will, in isolation, be small (maybe 5-20 small tx/minute at peak). In aggregate the write load would probably be prohibitive (around 70k tx/s) for a single transactor, but trivial if divided into a database per environment. The overall size of these instances might grow to something on the order of 1M datoms over time.
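For a rough sense of scale, the arithmetic behind that aggregate figure can be sketched as below. This is only a back-of-the-envelope helper: the function names are made up for illustration, and it assumes every environment is at its peak rate at the same moment, which is a worst case rather than a measured load.

```python
def aggregate_tx_per_second(environments: int, tx_per_minute_each: float) -> float:
    """Total write rate if every environment transacts at the given rate.

    Assumption: all environments peak at once (worst case).
    """
    return environments * tx_per_minute_each / 60.0


def environments_for_load(target_tx_per_second: float, tx_per_minute_each: float) -> float:
    """How many simultaneously active environments produce the target rate."""
    return target_tx_per_second * 60.0 / tx_per_minute_each


if __name__ == "__main__":
    # 5-20 tx/minute per environment is the range given in the post.
    for rate in (5, 20):
        n = environments_for_load(70_000, rate)
        print(f"{rate} tx/min each -> {n:,.0f} active environments for 70k tx/s")
```

In other words, 70k tx/s corresponds to somewhere between a few hundred thousand and just under a million environments all busy at once, while any single environment stays well below 1 tx/s.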

I understand that each database creates some level of overhead, but I cannot find any specific numbers that give an indication of the relative size of that overhead for cloud deployments.

What is a “realistic” number of these “small databases” that could run in Datomic Cloud on a pair of i3.large instances, for example, without most of the resources being consumed by the overhead itself?

Hi Tony,

The number of individual databases that can be supported on a single system will vary depending on the size/load profile of the databases, the size of the instances, and the size of the primary compute group.

I can definitively say that millions is out of the question. Generally, I would consider a multi-tenancy of 10 or so databases likely to be fine, with as many as fifty or a hundred if they’re quite small and infrequently used.
I would strongly recommend testing your system with realistic load and database counts to determine at what level the overhead of your particular usage pattern becomes an issue.

In September 2022 I got this answer from Datomic support:

Multi-tenancy in Cloud is fully supported and you can have 100s to thousands of separate DBs on a Datomic cloud system. There are still operational impacts to having so many DBs but you can scale compute nodes to optimize performance, utilize query groups to offload read per DB and have the ability to scale.
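To put the per-system figures mentioned in this thread side by side, here is a small sketch of how many separate Datomic Cloud systems a database-per-environment design would imply. The total of one million databases is a placeholder assumption, the helper name is hypothetical, and the per-system capacities are just the rough numbers quoted in the replies (10, 100, and "thousands" taken as 1,000), not measured limits.

```python
def systems_needed(total_databases: int, databases_per_system: int) -> int:
    """Number of systems required if each hosts at most databases_per_system."""
    if databases_per_system <= 0:
        raise ValueError("databases_per_system must be positive")
    # Ceiling division without floating point.
    return -(-total_databases // databases_per_system)


if __name__ == "__main__":
    total = 1_000_000  # assumed: one database per user at one million users
    for per_system in (10, 100, 1_000):
        print(f"{per_system:>5} DBs/system -> {systems_needed(total, per_system):,} systems")
```

Even at the most optimistic figure, a strict database-per-environment layout at this scale means operating on the order of a thousand systems, so the per-system limit matters a great deal.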

It would be awesome if Cognitect ran a study and provided more detailed guidance, with concrete numbers, on using Datomic in SaaS / multi-tenancy setups. Clearly there is not enough information and experience here :cry: