What is Multi Tenant Architecture (“MTA” for short), and how can you use it to improve the security and scalability of your infrastructure?


Multi Tenant Architecture isn’t anything new. In fact, it is a relatively old concept. Think back to the early 90s, before “the cloud” was big. Each customer (AKA “tenant”) would host a standalone copy of the software they purchased or licensed on their own servers, keeping their data isolated from every other customer’s.

Then the mass migration to “the cloud” started, and hosting providers would boot up standalone copies of that same software on their servers, with each customer still having their own database. Some of the bigger customers might even have their own dedicated hardware to run on.

Who should use Multi Tenant Architecture?

Multi Tenant Architecture makes a lot of sense if each customer or “tenant” has their own data that should never cross over with other customers’ data. It makes much less sense for something like a social network, where you want posts from user A to be seen by users B through Z.

Security:

The most obvious advantage of Multi Tenant Architecture is that it keeps each customer’s data separate from other customers’ data, which is a huge security win. You wouldn’t want a junior dev forgetting to check the customer ID in a query and giving customers access to records they shouldn’t see.

Multi Tenant Architecture minimizes the chances of cross-contamination between accounts, as each account has its own hardware or, at a minimum, its own partition.
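
To make the "forgotten customer ID" point concrete, here is a minimal sketch in Python. It assumes one database per tenant and uses in-memory SQLite purely for illustration; the TenantDatabases class, the invoices table, and the tenant names are all hypothetical and not taken from any particular product.

```python
import sqlite3


class TenantDatabases:
    """Hands out one isolated SQLite database per tenant (hypothetical helper)."""

    def __init__(self):
        self._connections = {}

    def for_tenant(self, tenant_id: str) -> sqlite3.Connection:
        # Each tenant gets its own in-memory database, created on first use.
        if tenant_id not in self._connections:
            conn = sqlite3.connect(":memory:")
            conn.execute(
                "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)"
            )
            self._connections[tenant_id] = conn
        return self._connections[tenant_id]


def list_invoices(db: sqlite3.Connection):
    # No "WHERE customer_id = ?" filter to forget: this database only ever
    # contains a single tenant's rows.
    return db.execute("SELECT id, amount FROM invoices ORDER BY id").fetchall()


databases = TenantDatabases()
databases.for_tenant("acme").execute("INSERT INTO invoices (amount) VALUES (100.0)")
databases.for_tenant("globex").execute("INSERT INTO invoices (amount) VALUES (250.0)")

acme_invoices = list_invoices(databases.for_tenant("acme"))
globex_invoices = list_invoices(databases.for_tenant("globex"))
```

The query in list_invoices has no tenant filter at all, yet each tenant only ever sees its own rows. The isolation comes from which database the connection points at, not from every developer remembering a WHERE clause.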

Scalability:

Using Multi Tenant Architecture allows you to have more granular control over the underlying hardware each tenant is assigned. Let's say you have a client who likes running massive, unoptimized queries that bring the system to a grinding halt. If you isolate them to their own hardware, then those queries will slow their system but have no effect on any other tenants.

This is great not only for latency, but also because it gives you more granular control over what you pay for the underlying infrastructure.

You can also partition or shard by tenant, though the devil is in the details when it comes to spreading the load evenly across partition keys.
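
As a rough illustration of that caveat, the sketch below assigns tenants to shards with a stable hash and then adds up how many rows land on each shard. The function names, the shard count, and the row counts are made up for the example; real systems often use consistent hashing or an explicit tenant-to-shard lookup table instead.

```python
import hashlib
from collections import Counter


def shard_for(tenant_id: str, shard_count: int) -> int:
    # Use a stable hash. Python's built-in hash() is randomized per process
    # for strings, so it would route a tenant differently after a restart.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def rows_per_shard(tenant_rows: dict, shard_count: int) -> Counter:
    # Start every shard at zero so empty shards still show up in the result.
    load = Counter({shard: 0 for shard in range(shard_count)})
    for tenant_id, row_count in tenant_rows.items():
        load[shard_for(tenant_id, shard_count)] += row_count
    return load


# Hypothetical tenants: one very large customer and several small ones.
tenant_rows = {
    "acme": 5_000_000,
    "globex": 20_000,
    "initech": 15_000,
    "umbrella": 30_000,
    "hooli": 10_000,
}

shard_load = rows_per_shard(tenant_rows, shard_count=4)
for shard, rows in sorted(shard_load.items()):
    print(f"shard {shard}: {rows} rows")
```

Even if the hash spreads tenant IDs perfectly evenly, whichever shard receives the largest tenant ends up carrying most of the data. That is the kind of skew that usually pushes you toward giving big tenants a dedicated shard.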

Batching:

Let's say you have a batch job that runs at night and compiles a bunch of stats for each customer. If you have Multi Tenant Architecture, you can fire off a job for each tenant in parallel. Yes, this means more computing power running at the same time, but for shorter durations, because each job only has to process the data in the tenant it is assigned to.

This is really powerful when processing ever-growing data sets in which the number of relationships between records grows much faster than the number of records themselves.
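
Here is a minimal sketch of that fan-out, using Python's standard concurrent.futures module. The TENANT_ORDERS dictionary is a hypothetical stand-in for each tenant's own data store, and compile_stats and run_nightly_batch are illustrative names, not part of any real scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for per-tenant data stores.
TENANT_ORDERS = {
    "acme": [120.0, 80.0, 50.0],
    "globex": [10.0, 15.0],
    "initech": [],
}


def compile_stats(tenant_id: str) -> dict:
    # Each job only touches the data belonging to its own tenant.
    orders = TENANT_ORDERS[tenant_id]
    total = sum(orders)
    return {
        "tenant": tenant_id,
        "order_count": len(orders),
        "revenue": total,
        "average_order": total / len(orders) if orders else 0.0,
    }


def run_nightly_batch(tenant_ids, max_workers: int = 4) -> dict:
    # One job per tenant, run concurrently; results are keyed by tenant.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(compile_stats, tenant_ids)
        return {stats["tenant"]: stats for stats in results}


nightly_stats = run_nightly_batch(list(TENANT_ORDERS))
```

Threads suit jobs that mostly wait on a database or network. For CPU-heavy stats you would more likely use separate processes or separate worker machines, one per tenant, but the shape of the fan-out stays the same.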

Data Lake/Warehouse:

We live in the era of AI and big data, so what if you, with your customers' consent, wanted to use all of the tenant data to train a big AI model? Or run a big data query across multiple tenants? That is where data lakes and warehouses come into play. There is nothing stopping you from pumping every event from every tenant into a massive data lake (on AWS, for example, Amazon S3 with AWS Glue cataloging the data) to run your cross-tenant queries. I wouldn’t give your customers access to it, as they could see each other's data, but for internal use it can be quite useful.
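
A rough sketch of the idea, in Python: every event is stamped with the tenant it came from before it lands in the shared store, so cross-tenant queries remain possible internally while the origin of each record stays traceable. The plain list stands in for real lake storage, and the function names and event fields are assumptions made for the example.

```python
from collections import defaultdict


def to_lake_record(tenant_id: str, event: dict) -> dict:
    # Copy the event and stamp it with its tenant of origin, leaving the
    # tenant's own copy untouched.
    return {**event, "tenant_id": tenant_id}


def export_to_lake(tenant_events: dict) -> list:
    # Flatten every tenant's events into one shared collection.
    lake = []
    for tenant_id, events in tenant_events.items():
        lake.extend(to_lake_record(tenant_id, event) for event in events)
    return lake


def event_counts_by_type(lake: list) -> dict:
    # An internal-only, cross-tenant aggregate.
    counts = defaultdict(int)
    for record in lake:
        counts[record["type"]] += 1
    return dict(counts)


# Hypothetical events pulled from each tenant's isolated store.
tenant_events = {
    "acme": [{"type": "login"}, {"type": "purchase"}],
    "globex": [{"type": "login"}],
}

lake = export_to_lake(tenant_events)
cross_tenant_counts = event_counts_by_type(lake)
```

Because every record carries a tenant_id, you can still filter or exclude a tenant later, for example one that has not consented to having its data used for training.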

Long story short, Multi Tenant Architecture can be a powerful tool if your use case is the right fit.

If you are interested in learning more about real-world, battle-tested strategies that can have a profound effect on your ability to cost-effectively scale your cloud infrastructure, then check out my free e-book, 20 Things You Can Do Today To Save Money On Your Amazon Web Services Bill.