Deadlock Detection & Prevention

The Circular Wait Problem šŸŒ€

Deadlock occurs when two or more transactions wait for each other indefinitely, creating a circular dependency that prevents any of them from proceeding.

Classic Deadlock Example

T1: LOCK_X(Account_A)
T2: LOCK_X(Account_B)
T1: LOCK_X(Account_B)  ← Waits for T2 to release B
T2: LOCK_X(Account_A)  ← Waits for T1 to release A

Result: Both transactions are stuck forever! šŸ’€


Deadlock Detection: Wait-for Graphs šŸ“Š

The Graph Approach

A wait-for graph is a directed graph where:

Building a Wait-for Graph

Scenario:

  • T1 holds LOCK_X(A), wants LOCK_X(B)
  • T2 holds LOCK_X(B), wants LOCK_X(C)
  • T3 holds LOCK_X(C), wants LOCK_X(A)

Wait-for Graph:

T1 → T2 → T3 → T1

Cycle detected! This indicates a deadlock.

Deadlock Detection Algorithm

def detect_deadlock(wait_for_graph):
    """
    Use DFS to detect cycles IN wait-for graph
    Returns: List of transactions IN deadlock (if any)
    """

    def dfs(node, visited, rec_stack, path):
        visited.add(node)
        rec_stack.add(node)
        path.append(node)

        for neighbor IN wait_for_graph[node]:
            if neighbor IN rec_stack:
                # Found cycle - extract deadlock
                cycle_start = path.index(neighbor)
                return path[cycle_start:]

            if neighbor NOT IN visited:
                result = dfs(neighbor, visited, rec_stack, path)
                if result:
                    return result

        rec_stack.remove(node)
        path.pop()
        return None

    for TRANSACTION IN wait_for_graph:
        if TRANSACTION NOT IN visited:
            cycle = dfs(TRANSACTION, SET(), SET(), [])
            if cycle:
                return cycle

    return None  # No deadlock

Deadlock Prevention Strategies šŸ›”ļø

1. Lock Ordering (Conservative Approach)

šŸŽÆ Global Lock Ordering

Rule: All transactions must acquire locks in the same predefined order

Example: Alphabetical Ordering

āŒ Can Deadlock
T1: LOCK(Account_B) → LOCK(Account_A)
T2: LOCK(Account_A) → LOCK(Account_B)
āœ… Deadlock-Free
T1: LOCK(Account_A) → LOCK(Account_B)
T2: LOCK(Account_A) → LOCK(Account_B)

T2 waits for T1 to finish, no circular wait!

āœ… Advantages

  • Completely prevents deadlocks
  • Simple to understand and implement
  • No deadlock detection overhead

āŒ Disadvantages

  • Reduces concurrency significantly
  • Requires knowing all locks in advance
  • May force unnecessary lock ordering

2. Timeout-Based Prevention

ā° Wait Timeout

Rule: If a transaction waits for a lock longer than a threshold, abort it

```sql -- PostgreSQL example BEGIN; SET lock_timeout = '5s'; -- Wait max 5 seconds for locks SELECT * FROM accounts WHERE id = 1 FOR UPDATE; -- If can't get lock in 5s, transaction aborted ```

āš–ļø Trade-offs

  • Short timeout: Prevents long waits, but may abort valid transactions
  • Long timeout: Reduces unnecessary aborts, but deadlocks persist longer
  • Adaptive: Adjust timeout based on system load and transaction history

3. Timestamp-Based Protocols

šŸ• Wait-Die and Wound-Wait

āš°ļø Wait-Die Protocol

Rule: Older transactions wait, younger transactions die

  • If Ti (older) wants lock held by Tj (younger): Ti waits
  • If Ti (younger) wants lock held by Tj (older): Ti dies (aborts)
Example:

T1(timestamp=10) wants lock held by T2(timestamp=20)
→ T1 is older, so T1 waits

T3(timestamp=30) wants lock held by T1(timestamp=10)
→ T3 is younger, so T3 aborts

šŸ¤• Wound-Wait Protocol

Rule: Older transactions wound younger, younger wait for older

  • If Ti (older) wants lock held by Tj (younger): Tj dies (wounded)
  • If Ti (younger) wants lock held by Tj (older): Ti waits
Example:

T1(timestamp=10) wants lock held by T2(timestamp=20)
→ T1 is older, so T2 aborts (wounded)

T3(timestamp=30) wants lock held by T1(timestamp=10)
→ T3 is younger, so T3 waits


Deadlock Resolution: Victim Selection šŸŽÆ

When a deadlock is detected, one transaction must be aborted to break the cycle.

Victim Selection Criteria

šŸ“Š Transaction Cost

  • Choose transaction with least work done
  • Minimize wasted computation
  • Consider CPU time, I/O operations, locks held

šŸ• Transaction Age

  • Abort youngest transaction (most recently started)
  • Older transactions likely closer to completion
  • Prevents starvation of long-running transactions

šŸ”„ Abort Count

  • Avoid repeatedly aborting same transaction
  • Track previous abort history
  • Give priority to previously aborted transactions

šŸ” Lock Count

  • Abort transaction holding fewest locks
  • Releasing fewer locks reduces system disruption
  • Other transactions can proceed faster

Victim Selection Algorithm

def select_victim(deadlock_cycle):
    """
    SELECT best victim FROM deadlock cycle
    Returns: TRANSACTION to abort
    """
    best_victim = None
    min_cost = float('inf')

    for TRANSACTION IN deadlock_cycle:
        # Calculate composite cost score
        cost = (
            TRANSACTION.work_done * 0.4 +           # 40% weight
            TRANSACTION.locks_held * 0.3 +          # 30% weight  
            TRANSACTION.abort_count * (-0.2) +      # 20% weight (negative = priority)
            TRANSACTION.age * (-0.1)                # 10% weight (negative = priority)
        )

        if cost < min_cost:
            min_cost = cost
            best_victim = TRANSACTION

    return best_victim

Real-World Deadlock Handling šŸ­

PostgreSQL Approach

-- Configure deadlock detection
SET deadlock_timeout = '1s';  -- Check for deadlocks after 1 second

-- PostgreSQL automatically:
-- 1. Detects deadlocks using wait-for graphs
-- 2. Selects victim (usually newest transaction)
-- 3. Aborts victim with ERROR: deadlock detected

MySQL InnoDB Strategy

-- InnoDB deadlock detection is automatic
-- Configuration options:
SET GLOBAL innodb_deadlock_detect = ON;    -- Enable detection
SET GLOBAL innodb_lock_wait_timeout = 50;  -- Max wait time

-- InnoDB automatically:
-- 1. Detects deadlocks immediately 
-- 2. Selects victim with least cost (estimated by undo work)
-- 3. Returns ERROR 1213: Deadlock found when trying to get lock

Performance Implications ⚔

šŸ” Detection Overhead

  • Wait-for graphs: O(n²) space, O(n²) time per check
  • Check frequency: Every few seconds (configurable)
  • Trade-off: Frequent checks vs. longer deadlock duration

šŸ’” Abort Cost

  • Rollback work: Undo all changes made by victim
  • Lock release: Free all locks held by victim
  • Retry cost: Re-execute aborted transaction

šŸš€ Prevention Cost

  • Lock ordering: Reduced concurrency, complex planning
  • Timeouts: False positives, unnecessary aborts
  • Timestamps: More frequent aborts, retry overhead

Best Practices šŸŽÆ

šŸŽÆ Application Design

  • Short transactions: Reduce lock holding time
  • Consistent access patterns: Use same table ordering
  • Avoid user interaction: Don't wait for user input while holding locks

āš™ļø Database Configuration

  • Tune timeouts: Balance detection speed vs. false positives
  • Monitor deadlocks: Track frequency and patterns
  • Isolation levels: Lower levels may reduce deadlock chance

šŸ”„ Error Handling

  • Retry logic: Automatically retry deadlock victims
  • Exponential backoff: Avoid retry storms
  • User feedback: Gracefully handle deadlock errors

Key Takeaways 🧠

šŸŽÆ Essential Points

  • Deadlocks are inevitable: With 2PL, deadlocks can occur despite correctness
  • Detection vs. Prevention: Trade-off between runtime overhead and concurrency
  • Victim selection matters: Good algorithms minimize total system cost
  • Application awareness: Smart design patterns can reduce deadlock frequency
  • Modern systems handle it: Database engines automatically detect and resolve deadlocks

What's Next? šŸ”œ

Now you understand how to handle deadlocks. Continue learning:

  1. Lock Examples - See deadlocks and resolution in action

  2. Two-Phase Locking - Review how 2PL creates deadlock potential

  3. Lock Foundations - Back to the basics