Does query result order of entity IDs affect cache and segment read efficiency?

Hi Datomic team,

When a query returns a large set of entity IDs, I’ve noticed that the IDs are often numerically close to each other, but sometimes there are large gaps.
This made me wonder whether the order of entity IDs returned by a query is already optimized for storage access patterns, or if sorting them before fetching by :db/id could improve performance.

My reasoning is that if entity IDs within a segment are stored in sorted order, then fetching data for sorted IDs might be more efficient.
Reading one segment could populate the cache with multiple adjacent entities, reducing additional storage reads (StorageGetBytes) and improving the ObjectCache hit ratio.

On the other hand, if query results may include IDs that are far apart (spanning multiple segments), unsorted access could lead to less efficient segment loading, especially when the cache size is limited.
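
To make the concern concrete, here is a toy simulation of what I have in mind. It is only a sketch and does not reflect how Datomic actually lays out segments: the fixed-size segments keyed by entity ID range, the small LRU cache, and the names `SEGMENT_SIZE`, `CACHE_SEGMENTS` and `count_segment_loads` are all assumptions made up for illustration.

```python
from collections import OrderedDict
import random

SEGMENT_SIZE = 1000   # hypothetical: number of entity IDs covered by one segment
CACHE_SEGMENTS = 4    # hypothetical: number of segments the cache can hold


def count_segment_loads(entity_ids, segment_size=SEGMENT_SIZE,
                        cache_segments=CACHE_SEGMENTS):
    """Count simulated storage reads when looking up entity_ids in the given order."""
    cache = OrderedDict()  # segment number -> True, kept in LRU order
    loads = 0
    for eid in entity_ids:
        segment = eid // segment_size
        if segment in cache:
            cache.move_to_end(segment)      # cache hit: mark as most recently used
        else:
            loads += 1                      # cache miss: simulated storage read
            cache[segment] = True
            if len(cache) > cache_segments:
                cache.popitem(last=False)   # evict the least recently used segment
    return loads


rng = random.Random(42)
ids = rng.sample(range(100_000), 5_000)     # IDs scattered over 100 toy segments

unsorted_loads = count_segment_loads(ids)
sorted_loads = count_segment_loads(sorted(ids))
print("unsorted:", unsorted_loads, "sorted:", sorted_loads)
```

In this toy model, sorted access loads each touched segment exactly once, while unsorted access can reload the same segment many times once the cache is smaller than the set of segments being touched. Whether the real storage layout behaves anything like this is exactly what I am unsure about.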

So the question is:

Does the query engine already return entity IDs in an order that maximizes segment locality, or could sorting entity IDs before lookups help improve cache and I/O efficiency?

Thanks!
Mykola
