A little while ago (at work), I ran into a situation where we needed to store around 100 rows of default data, allowing each tenant to add their own custom entries. Initially, it was set up with two tables:
- Default Data Table (
default_data
): This stored all default data without any tenant constraints. - Tenant-Specific Table (
tenant_data
): This stored tenant-specific data, including a copy of the default data, constrained by each tenant.
While this solution worked in theory, it caused several issues. The default data was duplicated across tenants, leading to data sprawl. Anytime a new tenant was added, the system copied all the default data into their dataset. This meant that if we ever needed to update or remove default data, we’d have to touch every tenant’s copy, which created unnecessary complexity and potential inconsistencies.
Problem: Management Overhead
The original design, which copied default data for each tenant, introduced a maintenance headaches. Although tenants couldn’t modify or delete default data, duplicating the same information for every tenant made managing updates cumbersome. Every time default data needed to be modified, it was spread across the system, making it harder to maintain consistency and scalability.
The Solution: Nullable Tenant Constraint
To solve this, I introduced a more efficient approach: removing the default_data
table and consolidating all data into the tenant_data
table. Instead of copying default data for every new tenant, we made the tenant constraint nullable. Here’s how the system works now:
- Single
tenant_data
Table: All data—whether it’s default or tenant-specific—is stored in thetenant_data
table. Default data is marked withis_default = 1
and has a nullable tenant constraint (tenant_id IS NULL
), while tenant-specific data hasis_default = 0
and is linked to a specific tenant viatenant_id
. - Querying Default and Tenant Data: When querying the table, we simply check for both tenant-specific data (using the logged-in tenant’s ID) and default data (where
tenant_id IS NULL
). For example:SELECT * FROM tenant_data WHERE tenant_id = <logged_in_tenant_id> OR tenant_id IS NULL
This ensures that each tenant sees both their own data and the shared default data without needing to copy or duplicate the default entries across tenants.
Why This Approach is More Efficient
By introducing a nullable tenant constraint, we’ve eliminated the need to duplicate default data across tenants. Now, default data exists only once, stored centrally in the tenant_data
table with tenant_id IS NULL
. This brings several key benefits:
- Simplified Management: Managing default data is much easier now. When new default data needs to be added or old data removed, we only need to update the centralized records. There’s no longer a need to worry about individual tenant copies of the default data.
- Future-Proofing: By eliminating data duplication and maintaining clear separation between default and tenant-specific data, we’re setting up the system for long-term scalability. As more tenants are added, there’s no additional burden of managing redundant copies of default data.
What Are Your Thoughts?
Have you ever faced similar challenges when managing default data in a multi-tenant app? How did you handle it? I’d love to hear about your thoughts on using a nullable tenant constraint and the is_default
flag. What solutions have worked for you in the past? Let’s discuss!
Leave a Reply