loke.dev

The Indecisive Commit

Atomicity is easy on a single node, but guaranteeing it across a distributed cluster requires a protocol that most developers fear to touch.

· 7 min read

Imagine you’re running a fintech application. A user initiates a $500 transfer from their savings account to their investment portfolio. In a monolithic Postgres setup, this is a breeze. You wrap two UPDATE statements in a BEGIN and COMMIT block. If the power goes out halfway through, the Write-Ahead Log (WAL) ensures that when the database wakes back up, either both updates happened or neither did.

But scale changes the physics of the problem. Your savings accounts are now on Shard A (London) and the investment portfolios are on Shard B (New York). You issue the first UPDATE to Shard A, it succeeds. You issue the second to Shard B, and the network cable at the New York data center gets chewed through by a confused rodent.

Shard A thinks the money is gone. Shard B never knew it was supposed to arrive. This is the "Indecisive Commit" problem. In a distributed environment, atomicity isn't a property you get for free; it’s a consensus problem that requires a protocol most developers treat like a radioactive isotope.

The Illusion of Atomicity

When we talk about ACID (Atomicity, Consistency, Isolation, Durability), Atomicity is the "all-or-nothing" guarantee. On a single node, it is implemented with the write-ahead log: the commit point boils down to a single atomic write — effectively one bit flip in the transaction log's record of that transaction. Once that bit is flipped, the transaction is "real."

In a distributed cluster, there is no single bit. There are multiple bits on multiple machines. If you tell three nodes to commit, and two say "okay" while the third says "I'm busy dying," your system is now in an inconsistent state. To solve this, Postgres exposes the building blocks of Two-Phase Commit (2PC), the protocol almost every distributed database relies on.

Two-Phase Commit: The Digital Marriage

2PC is often described using the analogy of a wedding. The officiant (Coordinator) asks both parties (Participants) if they take this transaction to be their lawfully wedded state change. Only if both say "I do" does the marriage become official.

Phase 1: The Prepare Phase

Instead of sending a COMMIT, the coordinator sends a PREPARE TRANSACTION command.

-- On Shard A
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE id = 'savings_123';
PREPARE TRANSACTION 'tx_id_9999';

When Shard A receives PREPARE TRANSACTION, it does something very specific: it persists everything needed to commit the transaction to disk (the WAL records, the locks, the state), but it does not commit — the locks stay held and the changes stay invisible to other sessions. It then replies to the coordinator: "I am ready. I promise that even if I crash and reboot right now, I can still finish this transaction if you tell me to."
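For completeness, Shard B runs the mirrored half of the transfer under the same global transaction id — that shared id is how the coordinator ties the two halves together (the id 'portfolio_456' is illustrative):

```sql
-- On Shard B
BEGIN;
UPDATE accounts SET balance = balance + 500 WHERE id = 'portfolio_456';
PREPARE TRANSACTION 'tx_id_9999';
```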

Phase 2: The Commit/Rollback Phase

If the coordinator receives a "Ready" from all shards, it sends the final command:

-- On Shard A and Shard B
COMMIT PREPARED 'tx_id_9999';

If even one shard fails to respond or sends an error during the prepare phase, the coordinator sends:

ROLLBACK PREPARED 'tx_id_9999';

Where It Gets Ugly: The Indecisive State

The reason 2PC has such a bad reputation is the "uncertainty window." This is the time between when a shard says "Ready" and when it receives the "Commit" or "Rollback" command.

During this window, the shard is indecisive. It holds locks on the rows being updated, and because it promised the coordinator it would be able to commit, it cannot release them. If the coordinator crashes after the prepare phase but before the commit phase, those rows on Shard A stay locked indefinitely.

I’ve seen production systems grind to a halt because a "zombie" prepared transaction held a lock on a heavily used table for hours. In standard Postgres, these prepared transactions survive restarts. You can’t just "reboot the problem away."

Querying the Dead

If your application starts timing out on simple updates, your first stop should be checking for these orphaned prepared transactions:

SELECT gid, prepared, owner, database 
FROM pg_prepared_xacts;

If you find an entry there that’s been sitting for longer than a few seconds, you’ve found the source of your indecision. You have to manually resolve it:

ROLLBACK PREPARED 'tx_id_9999';
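If zombies recur, the manual step can be automated. Below is a rough sketch of a cleanup sweeper — the names find_stale_gids and sweep are mine, not a Postgres API — that reads pg_prepared_xacts and rolls back anything older than a threshold. Only run something like this if you are certain the coordinator is dead: rolling back a transaction the coordinator later tries to commit breaks atomicity.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(seconds=60)

def find_stale_gids(rows, now=None):
    """Given (gid, prepared_at) rows from pg_prepared_xacts,
    return the gids prepared longer ago than STALE_AFTER."""
    now = now or datetime.now(timezone.utc)
    return [gid for gid, prepared_at in rows if now - prepared_at > STALE_AFTER]

def sweep(conn):
    """Roll back every stale prepared transaction on one node."""
    cur = conn.cursor()
    cur.execute("SELECT gid, prepared FROM pg_prepared_xacts;")
    for gid in find_stale_gids(cur.fetchall()):
        # ROLLBACK PREPARED takes a literal, not a bind parameter;
        # this is only safe because our gids are application-generated.
        cur.execute(f"ROLLBACK PREPARED '{gid}';")
```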

Implementing 2PC in Your Application

If you're building a distributed system on top of vanilla Postgres instances, you are the Coordinator. You are responsible for managing the state of the transaction. Here is a simplified look at how you might handle this in Python using psycopg2.

import psycopg2
from uuid import uuid4

def distributed_transfer(shard_a_conn, shard_b_conn, amount):
    # We need a unique Global Transaction ID
    gtid = f'trans_{uuid4()}'

    # COMMIT PREPARED and ROLLBACK PREPARED cannot run inside a
    # transaction block, so we manage transactions explicitly rather
    # than letting psycopg2 open them implicitly.
    shard_a_conn.autocommit = True
    shard_b_conn.autocommit = True

    try:
        # Step 1: Execute work on Shard A
        cursor_a = shard_a_conn.cursor()
        cursor_a.execute("BEGIN;")
        cursor_a.execute("UPDATE accounts SET balance = balance - %s WHERE id = 1;", (amount,))
        cursor_a.execute(f"PREPARE TRANSACTION '{gtid}';")

        # Step 2: Execute work on Shard B
        cursor_b = shard_b_conn.cursor()
        cursor_b.execute("BEGIN;")
        cursor_b.execute("UPDATE accounts SET balance = balance + %s WHERE id = 2;", (amount,))
        cursor_b.execute(f"PREPARE TRANSACTION '{gtid}';")

        # Both said they are ready. Now we commit.
        cursor_a.execute(f"COMMIT PREPARED '{gtid}';")
        cursor_b.execute(f"COMMIT PREPARED '{gtid}';")
        print("Success!")

    except Exception as e:
        # If anything failed, attempt to roll back both shards.
        # ROLLBACK PREPARED only works if that shard actually prepared;
        # otherwise fall back to a plain ROLLBACK of the open transaction.
        print(f"Error: {e}. Attempting cleanup...")
        for conn in (shard_a_conn, shard_b_conn):
            try:
                conn.cursor().execute(f"ROLLBACK PREPARED '{gtid}';")
            except Exception:
                try:
                    conn.cursor().execute("ROLLBACK;")
                except Exception:
                    pass
        raise

The Gotcha: What happens if the Python script crashes after cursor_a.execute(f"COMMIT PREPARED '{gtid}';") but before the second one? You now have a partially committed transaction. To be truly safe, you need a Transaction Manager—a piece of infrastructure that logs the status of the gtid to a persistent store so it can resume the commit process if the worker dies.
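What that Transaction Manager boils down to is surprisingly small: a durable, append-only decision log that is fsynced before the first COMMIT PREPARED goes out. The class below is a minimal sketch under that assumption — TxnLog is a hypothetical name, not an existing library:

```python
import json
import os

class TxnLog:
    """Minimal durable coordinator log: record the global decision
    before sending any COMMIT PREPARED, so a restarted worker knows
    whether each in-doubt gtid should be committed or rolled back."""

    def __init__(self, path):
        self.path = path

    def record(self, gtid, decision):
        # Append-only write, fsynced so the decision survives a crash.
        with open(self.path, "a") as f:
            f.write(json.dumps({"gtid": gtid, "decision": decision}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def recover(self):
        # Replay the log; the last decision per gtid wins. Any prepared
        # transaction with no logged decision was never decided and is
        # safe to roll back.
        decisions = {}
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    rec = json.loads(line)
                    decisions[rec["gtid"]] = rec["decision"]
        return decisions
```

The crucial ordering: call record(gtid, "commit") only after every shard has prepared, and before the first COMMIT PREPARED. On restart, recover() tells the worker which in-doubt transactions to finish committing.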

Postgres Configuration Pitfalls

By default, Postgres allows exactly zero prepared transactions. If you try to run the code above on a fresh install, it will fail at the PREPARE step. You need to adjust postgresql.conf and restart the server:

# postgresql.conf
max_prepared_transactions = 100 

A common mistake is setting this value too low. If you hit the limit, your distributed transactions will start failing during the PREPARE phase, which is actually the "safe" time for them to fail.
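A quick way to check your headroom against that ceiling, using the same catalog view as before:

```sql
-- Prepared transactions currently pending vs. the configured limit
SELECT count(*) AS prepared_now,
       current_setting('max_prepared_transactions')::int AS max_allowed
FROM pg_prepared_xacts;
```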

Another setting to watch is idle_in_transaction_session_timeout. However, this usually doesn't affect prepared transactions because, by definition, a prepared transaction is no longer associated with a session. Once a transaction is prepared, the connection can be closed, and the transaction lives on in the engine. This is both a feature (resilience) and a bug (easy to forget about).

Is 2PC Dead?

Many modern "Distributed SQL" databases like CockroachDB, YugabyteDB, or TiDB move away from traditional 2PC in favor of consensus algorithms like Raft or Paxos.

In a Raft-based system, you don't ask every node to prepare. You write your transaction to a "Leader" node, which replicates the intent to a majority of "Followers." As long as a quorum (2 out of 3, or 3 out of 5) agrees, the transaction is committed.
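The quorum rule is plain majority arithmetic, which is worth spelling out because it explains why these clusters come in odd sizes:

```python
def quorum(cluster_size):
    # Smallest strict majority: more than half the nodes must acknowledge.
    return cluster_size // 2 + 1

# A 3-node cluster commits with 2 acks and tolerates 1 failure;
# a 5-node cluster commits with 3 acks and tolerates 2.
# A 4-node cluster needs 3 acks but still only tolerates 1 failure,
# which is why even-sized clusters buy you nothing.
for n in (3, 4, 5):
    print(n, quorum(n), n - quorum(n))
```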

However, even these databases often use a 2PC-like flow internally when a transaction spans multiple Raft groups (shards). They've just automated the "Transaction Manager" part so you don't have to manually ROLLBACK PREPARED when a node disappears.

Performance: The Hidden Cost of Decision-Making

The real reason developers fear the indecisive commit isn't just the complexity; it's the latency.

1. Network Round Trips: In a single-node transaction, you have one COMMIT call. In a 2PC, you have PREPARE (round trip) + COMMIT (round trip). If your shards are in different regions, you’re looking at hundreds of milliseconds of latency just for the protocol overhead.
2. Lock Contention: In a standard transaction, locks are held for milliseconds. In 2PC, locks are held for the duration of the network round trips. If Shard B is slow to respond to the PREPARE request, Shard A is sitting there holding locks on its rows, blocking other transactions.
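A back-of-envelope model makes those two points concrete. The 70 ms London–New York round trip below is illustrative, and lock_hold_ms is a made-up helper, not a real API:

```python
def lock_hold_ms(slowest_shard_rtt_ms, local_work_ms):
    # Locks are held for the local work plus one round trip for PREPARE
    # and one for COMMIT PREPARED; fan-out is parallel, so the slowest
    # shard sets the pace for everyone.
    return local_work_ms + 2 * slowest_shard_rtt_ms

# Roughly 145 ms of lock-holding for 5 ms of actual work.
print(lock_hold_ms(slowest_shard_rtt_ms=70, local_work_ms=5))
```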

If you find yourself needing 2PC across high-traffic tables, you should first ask if you can re-shard your data. Can you move the "Savings" and "Investment" accounts for a single user onto the same node? If you can colocate the data, you can go back to local transactions and avoid the indecisive commit entirely.

When to Reach for the Protocol

Despite the pain, 2PC is necessary when you cannot compromise on consistency.

- Financial Ledgering: When the books must balance to the penny.
- Cross-Service Orchestration: When you need to sync a database change with external state such as a message queue (Postgres's PREPARE TRANSACTION is the hook that XA transaction managers use for this).
- Hard State Transitions: Where "eventual consistency" (Sagas) would result in a broken business state that is too expensive to fix manually.

If you must use it, invest in observability. Build a dashboard that monitors pg_prepared_xacts. Set up an alert for any transaction older than 60 seconds. Because in the world of distributed systems, indecision is more than a psychological hurdle—it's a performance killer that can take your entire cluster down with it.
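A starting point for that alert, against the same catalog view used earlier:

```sql
-- Fire an alert if this returns any rows
SELECT gid, owner, database, now() - prepared AS age
FROM pg_prepared_xacts
WHERE prepared < now() - interval '60 seconds';
```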