A Quiet Tax for the Secure Row
Investigating how Postgres Row-Level Security (RLS) policies can silently sabotage the query planner and force expensive sequential scans on otherwise indexed tables.
You’ve likely been told that Row-Level Security (RLS) is the ultimate "set it and forget it" tool for multi-tenant isolation. The pitch is seductive: instead of trusting your application code to append WHERE tenant_id = X to every single query, you move that logic into the database. You turn it on, define your policies, and sleep better knowing your data is siloed at the engine level.
This is a dangerous half-truth. While RLS provides a robust layer of defense-in-depth, it is not a free abstraction. In fact, if you aren't careful, RLS acts as a silent tax on your database performance, often sabotaging the Postgres query planner and forcing expensive sequential scans on tables with millions of rows—even when you have perfectly good indexes sitting right there.
The Invisible Rewrite
To understand why RLS can be slow, you have to understand how Postgres actually implements it. RLS isn't a secondary check that happens after the data is fetched. Instead, Postgres performs a query rewrite.
When you execute a query against a table with RLS enabled, the engine takes your SQL and physically merges it with the expression defined in your policy. If your query is SELECT * FROM orders, and your RLS policy is tenant_id = 'abc', Postgres effectively executes SELECT * FROM orders WHERE (tenant_id = 'abc').
On the surface, this sounds efficient. If you have an index on tenant_id, the planner should use it, right? Usually, yes. But the "Quiet Tax" appears when your policies move beyond simple column-to-value comparisons.
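To make the happy path concrete, here is a minimal sketch (table, index, and setting names are illustrative) of a column-to-value policy that stays index-friendly:

```sql
-- Hypothetical tenant-scoped table; names are illustrative.
CREATE TABLE orders (
    id        SERIAL PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    total     NUMERIC
);
CREATE INDEX idx_orders_tenant ON orders (tenant_id);

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- A plain column comparison the planner can merge into the query
-- and satisfy with idx_orders_tenant.
CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.tenant_id'));

-- The rewritten form is roughly:
--   SELECT * FROM orders WHERE tenant_id = '<current setting>';
-- which is a straightforward candidate for an Index Scan.
```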
The Subquery Trap
The most common way developers kill their database performance is by putting subqueries inside RLS policies. Consider a standard multi-tenant application where users belong to organizations, and you want to ensure users only see rows belonging to their organization.
-- The Setup
CREATE TABLE organizations (
id UUID PRIMARY KEY,
name TEXT
);
CREATE TABLE users (
id UUID PRIMARY KEY,
org_id UUID REFERENCES organizations(id),
email TEXT,
is_admin BOOLEAN DEFAULT false
);
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
org_id UUID REFERENCES organizations(id),
content TEXT
);
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
-- The "Simple" Policy
CREATE POLICY document_access_policy ON documents
FOR SELECT
USING (
org_id IN (
SELECT org_id FROM users WHERE id = auth.uid()
)
);
In this scenario, auth.uid() is a function that returns the current user's ID (common in environments like Supabase or PostgREST).
On paper, this looks fine. But look at what happens when you query documents. Postgres has to evaluate that subquery for *every single candidate row* unless it can optimize the join. If you have 10,000 users and 1,000,000 documents, the planner might decide that the overhead of running that subquery repeatedly is too high, or worse, it might struggle to estimate the cost correctly and fall back to a Sequential Scan.
When I first encountered this in production, I had a table with 5 million rows and a composite index on (org_id, created_at). A simple query that should have taken 5ms was taking 2.5 seconds. The EXPLAIN ANALYZE showed the planner was ignoring the index entirely because it couldn't guarantee that the org_id from the subquery would be a constant it could use for an index scan.
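One commonly cited mitigation (it appears in Supabase's own performance guidance) is to wrap the function call in a scalar subquery, which the planner can evaluate once per statement as an InitPlan instead of once per candidate row. A sketch against the schema above:

```sql
-- Same policy shape, but auth.uid() is wrapped in (SELECT ...),
-- so it is evaluated once as an InitPlan, not per candidate row.
CREATE POLICY document_access_policy_v2 ON documents
FOR SELECT
USING (
    org_id IN (
        SELECT org_id FROM users WHERE id = (SELECT auth.uid())
    )
);
```

This doesn't eliminate the lookup against users, but it removes the per-row function evaluation, which is often the larger cost.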
Why the Planner Flinches: Leakproof Functions
Postgres is deeply paranoid about security—as it should be. There is a concept in Postgres called Leakproof Functions.
A function is leakproof if it does not reveal any information about its arguments except via its return value. Most built-in operators (like = or <) are leakproof. However, most user-defined functions are not.
If your RLS policy uses a function that isn't marked as LEAKPROOF, the Postgres optimizer is severely restricted. It cannot push your WHERE clauses down below the RLS filter if there's any risk that an error message (like a division by zero or a data type conversion error) could reveal data from a row you aren't supposed to see.
This is a subtle point, but it’s critical: Security barriers override optimization. If the planner thinks an index scan might "leak" information about the existence of a row that fails the RLS check, it will choose a slower, safer path.
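You can inspect whether the functions your policies lean on are flagged leakproof, and cautiously mark your own. A sketch, where my_tenant_check is a hypothetical function you own:

```sql
-- See which functions are flagged leakproof in the catalog
-- (texteq is the built-in function behind = for text).
SELECT proname, proleakproof
FROM pg_proc
WHERE proname IN ('my_tenant_check', 'texteq');

-- Marking requires superuser, and you must be certain the function
-- cannot leak argument data via errors or messages.
ALTER FUNCTION my_tenant_check(uuid) LEAKPROOF;
```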
The "Session Variable" Escape Hatch
If subqueries are the problem, what's the solution? You need to turn the dynamic security check into a constant that the planner can understand.
Instead of hitting the users table inside the policy, you can set a session-level variable (a GUC, or "Grand Unified Configuration" setting) when the connection is established.
-- In your application code/middleware:
-- SET LOCAL app.current_org_id = '8888-9999-...';
CREATE POLICY document_access_fast_policy ON documents
FOR SELECT
USING (
org_id = current_setting('app.current_org_id')::uuid
);
By using current_setting, you've removed the subquery. The planner now sees org_id = <some_constant_value>. It can look at the statistics for that index and say, "Oh, I know exactly how to find those rows."
The Gotcha: current_setting throws an error if the variable isn't set. Use the second argument to return NULL instead: current_setting('app.current_org_id', true).
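Putting the gotcha into practice, a NULL-safe sketch of the same policy (a connection with no tenant context then simply sees zero rows, since org_id = NULL is never true):

```sql
CREATE POLICY document_access_null_safe ON documents
FOR SELECT
USING (
    -- Second argument true = missing_ok: returns NULL instead of
    -- erroring when app.current_org_id was never set.
    org_id = current_setting('app.current_org_id', true)::uuid
);
```

One caveat: on some setups an unset custom GUC can come back as an empty string rather than NULL, which would make the ::uuid cast fail; wrapping it as NULLIF(current_setting('app.current_org_id', true), '') guards against that.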
Analyzing the Damage
To see if you're paying the RLS tax, you can't just run a standard EXPLAIN. Superusers, roles with the BYPASSRLS attribute, and table owners (unless the table has FORCE ROW LEVEL SECURITY) skip RLS entirely, so a plan captured as a privileged user tells you nothing. You need to test as a restricted role.
-- Create a test role
CREATE ROLE test_user;
GRANT SELECT ON documents TO test_user;
-- Switch to that role
SET ROLE test_user;
-- Set the security context if needed
SET app.current_org_id = '00000000-0000-0000-0000-000000000001';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM documents WHERE id = 500;
Look closely at the output. Are you seeing a Filter that includes your RLS logic? If you see Parallel Seq Scan on a query that should be a simple Index Scan, your RLS policy is likely the culprit.
Redesigning for Speed: The Denormalization Trade-off
Sometimes, the session variable isn't enough, especially in complex RBAC (Role-Based Access Control) systems where a user has access to many organizations or projects.
I've seen systems where a user has access to 500 different org_ids. Putting all 500 into a session variable and using an IN clause works, but it's clunky.
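For completeness, here is the clunky-but-workable version: pass a comma-separated list in one GUC and match with = ANY. The setting name (app.org_ids) is illustrative:

```sql
-- Application side: SET LOCAL app.org_ids = '<uuid1>,<uuid2>,...';
CREATE POLICY document_multi_org_policy ON documents
FOR SELECT
USING (
    org_id = ANY (
        string_to_array(current_setting('app.org_ids', true), ',')::uuid[]
    )
);
```

The planner can still satisfy an = ANY(array) predicate with a bitmap scan on the org_id index, but with hundreds of entries plan quality tends to degrade, which is what motivates the denormalized approach.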
A more performant (though higher maintenance) approach is to move the security metadata directly onto the table you are protecting. If access is determined by a "Project" but you're querying "Tasks," don't join through Projects in your RLS policy. Duplicate the allowed_users_ids array or a security_group_id onto the Tasks table itself.
-- Using an array index for RLS
CREATE TABLE tasks (
id SERIAL PRIMARY KEY,
project_id UUID,
content TEXT,
viewer_ids UUID[] -- Denormalized for RLS
);
CREATE INDEX idx_tasks_viewer_ids ON tasks USING GIN (viewer_ids);
CREATE POLICY task_access_policy ON tasks
FOR SELECT
USING (
viewer_ids @> ARRAY[current_setting('app.current_user_id')::uuid]
);
Using a GIN index on a denormalized array of allowed IDs is often orders of magnitude faster than a policy that joins against a permissions table. Yes, you have to keep the array in sync, but that's the price of performance at scale.
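The sync cost deserves its own sketch. Assuming a project_members(project_id, user_id) table (a hypothetical name) is the source of truth, a trigger can rebuild viewer_ids whenever membership changes:

```sql
-- Rebuild tasks.viewer_ids for the affected project whenever
-- project membership changes.
CREATE OR REPLACE FUNCTION sync_task_viewers() RETURNS trigger AS $$
DECLARE
    pid UUID;
BEGIN
    pid := CASE WHEN TG_OP = 'DELETE' THEN OLD.project_id
                ELSE NEW.project_id END;
    UPDATE tasks t
    SET viewer_ids = (
        SELECT COALESCE(array_agg(m.user_id), '{}')
        FROM project_members m
        WHERE m.project_id = t.project_id
    )
    WHERE t.project_id = pid;
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER project_members_sync
AFTER INSERT OR UPDATE OR DELETE ON project_members
FOR EACH ROW EXECUTE FUNCTION sync_task_viewers();
```

Kept deliberately minimal: an UPDATE that moves a member between projects would need both the OLD and NEW project refreshed, and high-churn membership tables may warrant batching instead of a per-row trigger.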
The Pitfall of OR in Policies
Postgres RLS allows multiple policies on the same table. If you have three permissive SELECT policies, Postgres combines them with an OR. (Restrictive policies, created with AS RESTRICTIVE, are ANDed instead.)
OR is the natural enemy of the query planner.
If you have:
1. A policy for is_admin = true
2. A policy for owner_id = current_user
3. A policy for is_public = true
Postgres essentially queries: WHERE (is_admin OR owner_id = X OR is_public).
While it *can* use indexes for this via a "Bitmap Or," it often gets confused if the statistics for those three conditions vary wildly. If you find your queries slowing down as you add more security rules, consider consolidating your logic into a single policy using a CASE statement or a more index-friendly structure.
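Consolidation can look like this sketch: one policy, one predicate, with the cheapest checks first. Table and setting names (posts, app.*) are illustrative:

```sql
-- One permissive policy instead of three OR'd ones; the planner
-- sees a single expression it can reason about as a whole.
CREATE POLICY post_read_policy ON posts
FOR SELECT
USING (
    is_public
    OR owner_id = current_setting('app.current_user_id', true)::uuid
    OR current_setting('app.is_admin', true) = 'true'
);
```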
Summary Checklist for High-Performance RLS
If your Postgres instance is sweating and you suspect RLS, walk through this list:
1. Avoid Subqueries: They are the #1 cause of Seq Scans in RLS. Use session variables (GUCs) or JWT claims passed from your app.
2. Check Your Roles: Ensure you are testing performance with a non-superuser, non-owner role. Superusers, roles with BYPASSRLS, and table owners bypass RLS, making your tests useless.
3. Index Your Policy Columns: Every column used in a USING clause should likely be indexed, often as the leading column in a composite index.
4. Consider `LEAKPROOF`: If you use custom functions in policies, and you are *absolutely sure* they can't be tricked into leaking data through error messages, mark them as LEAKPROOF.
5. Watch the Joins: If you must use a join in a policy, ensure the join order in the EXPLAIN plan makes sense. Sometimes you need to nudge the planner by adjusting statistics on the relevant columns.
6. Use GIN Indexes for Complex Access: If a row can be seen by many users, a denormalized array with a GIN index is your best friend.
RLS is a powerful feature that brings architectural elegance to multi-tenant apps. But remember: the database doesn't know your policy is "security." To the database, it's just more SQL to optimize. If you write bad SQL in your policy, you get a slow database. Security shouldn't have to be slow; you just have to stop treating RLS like a magic wand and start treating it like the query-writer it actually is.


