Involutions: The License to Update in O(1)
Why Zobrist hashing updates a chessboard in constant time, and the property that makes incremental computation possible.
For the best reading experience, consider reading this on my GitHub.
Here’s a puzzle from competitive chess programming:
A chess engine evaluates millions of positions per second. Many are duplicates—reached via different move orders. How does it recognize positions it’s already seen?
The naive answer: hash the board and check a table.
But “hash the board” hides a cost. A chessboard has 64 squares, each potentially holding one of 12 piece types. Hashing from scratch is O(64) per position.
At millions of positions per second, that adds up.
The actual answer: update the hash incrementally in O(1).
Move a knight from b1 to c3? Two operations. Not 64.
The technique is called Zobrist hashing. And it works because of a single mathematical property.
The Property
An involution is a function that is its own inverse:
f(f(x)) = x
Apply it twice, you’re back where you started.
The simplest example: XOR.
a ⊕ a = 0
a ⊕ 0 = a
XOR any value with itself, you get zero. XOR with zero, you get the original back. This means XOR is reversible—and the forward and reverse operations are identical.
This unlocks three capabilities:
Add = Remove: Same operation to insert and delete
No history needed: Don’t track what was added when
O(1) updates: Cost independent of total size
To modify a hash, you don’t need to know how it was built—just what changed.
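A two-line sanity check makes the round trip concrete (a toy snippet using the bit patterns from later in the article):

```python
x = 0b10110011
mask = 0b11001010

once = x ^ mask       # "add" the mask
twice = once ^ mask   # "remove" it with the exact same operation
assert twice == x     # f(f(x)) == x
```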
Zobrist Hashing
Albert Zobrist invented this technique in 1970 for computer Go. It’s now standard in every serious chess engine.
Setup
Generate a table of random 64-bit numbers:
```python
import random

# One random number for each (piece, square) combination
zobrist_table = {}
for piece in ['P', 'N', 'B', 'R', 'Q', 'K', 'p', 'n', 'b', 'r', 'q', 'k']:
    for square in range(64):
        zobrist_table[(piece, square)] = random.getrandbits(64)

# Additional random numbers for game state
zobrist_black_to_move = random.getrandbits(64)
zobrist_castling = [random.getrandbits(64) for _ in range(16)]
zobrist_en_passant = [random.getrandbits(64) for _ in range(8)]
```
These numbers are generated once at startup and never change.
Initial Hash
To hash a position from scratch:
```python
def hash_position(board):
    h = 0
    for square in range(64):
        piece = board[square]
        if piece:
            h ^= zobrist_table[(piece, square)]
    if board.black_to_move:
        h ^= zobrist_black_to_move
    # ... castling rights, en passant, etc.
    return h
```
XOR together the random numbers for every piece on its square. O(n) where n is the number of pieces.
Incremental Update
Here’s where involutions pay off.
When a knight moves from b1 (square 1) to c3 (square 18):
```python
def update_hash_for_move(h, piece, from_sq, to_sq):
    h ^= zobrist_table[(piece, from_sq)]  # Remove piece from old square
    h ^= zobrist_table[(piece, to_sq)]    # Add piece to new square
    h ^= zobrist_black_to_move            # Toggle side to move
    return h
```
Three XOR operations. Constant time. The hash after the move is exactly what you’d get from hashing the new position from scratch.
Why does this work?
XOR is an involution. “Removing” a piece means XORing with the same value that “added” it. No separate undo operation needed.
```python
hash = ... ^ zobrist[N][b1] ^ ...  # Knight on b1 contributes to hash
hash ^= zobrist[N][b1]             # XOR same value → removes contribution
hash ^= zobrist[N][c3]             # XOR new value → adds new contribution
```
The knight’s contribution to the hash is completely replaced. The 62 other squares are unaffected.
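To see that the shortcut is sound, here is a minimal self-contained sketch (a toy board represented as a square-to-piece dict, side-to-move toggling omitted for brevity) that checks an incremental update after a capture against a from-scratch rehash:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
zobrist = {(p, sq): random.getrandbits(64)
           for p in 'PNBRQKpnbrqk' for sq in range(64)}

def full_hash(board):
    # board: toy representation, dict mapping square index -> piece letter
    h = 0
    for sq, piece in board.items():
        h ^= zobrist[(piece, sq)]
    return h

board = {1: 'N', 18: 'p', 4: 'K', 60: 'k'}  # knight b1, black pawn c3, kings
h = full_hash(board)

# Knight captures the pawn on c3: three XORs, no board rescan.
h ^= zobrist[('N', 1)]   # knight leaves b1
h ^= zobrist[('p', 18)]  # captured pawn leaves c3
h ^= zobrist[('N', 18)]  # knight arrives on c3

after = {18: 'N', 4: 'K', 60: 'k'}
assert h == full_hash(after)  # identical to rehashing from scratch
```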
Try It Yourself
I built an interactive demo where you can move pieces and watch the XOR operations happen in real-time:
→ Interactive Zobrist Hashing Demo
Notice: whether you move a pawn (1 XOR out, 1 XOR in) or capture a queen (2 XOR out, 1 XOR in), the operation count stays constant. The other 62 squares are never touched.
Why XOR Specifically?
XOR isn’t the only involution. Why not use subtraction? After all:
a − a = 0
Three reasons make XOR ideal for hashing:
1. Bitwise Independence
XOR operates on each bit independently. No carries, no overflow.
```
  10110011
^ 11001010
----------
  01111001
```
This means the random numbers can be generated for each bit position separately, maximizing entropy distribution.
2. Order Independence (Commutativity)
a ⊕ b = b ⊕ a
The hash doesn’t depend on the order pieces were added. Two positions with the same pieces produce the same hash regardless of move history.
3. No Accumulation Bias
With addition/subtraction, values can grow unboundedly or cluster around certain ranges. XOR keeps the hash uniformly distributed across the full 64-bit range.
| Operation | Self-inverse? | Commutative? | Bounded? | Bit-independent? |
|---|---|---|---|---|
| XOR | ✓ (a ⊕ a = 0) | ✓ | ✓ | ✓ |
| Addition/Subtraction | ✗ (need inverse op) | ✓ | ✗ | ✗ |
| Multiplication | ✗ | ✓ | ✗ | ✗ |
XOR is unique: adding and removing use the same operation. With addition, you must subtract to undo—a different operation that requires knowing what was added.
The Transposition Table
Zobrist hashing enables transposition tables—caches of previously evaluated positions.
1. e4 e5 2. Nf3 Nc6 → position P
1. Nf3 Nc6 2. e4 e5 → same position P
Different move orders, identical position. Same Zobrist hash.
When the engine encounters a position:
```python
def evaluate_with_cache(position, depth):
    h = position.zobrist_hash
    if h in transposition_table:
        cached = transposition_table[h]
        if cached.depth >= depth:
            return cached.score  # Skip expensive search
    score = expensive_minimax_search(position, depth)
    transposition_table[h] = Entry(score, depth)
    return score
```
At depth 20, a chess engine might explore 10⁹ positions. With a good transposition table, it might only evaluate 10⁷ unique positions. The rest are cache hits.
O(1) hash updates make this practical. If updating the hash cost O(64) per move, the overhead would dwarf the savings.
Collision Reality
Zobrist hashing isn’t cryptographically secure. Collisions happen.
With a 64-bit hash and 2³² positions, the birthday paradox predicts roughly one collision. In practice, chess engines search far more positions than that.
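The estimate follows from the birthday approximation, which puts the expected number of colliding pairs among n random b-bit hashes at roughly n²/2^(b+1). A quick sketch:

```python
def expected_collisions(n, bits=64):
    # Birthday approximation: about n^2 / (2 * 2^bits) colliding pairs.
    return n * n / (2 * 2**bits)

assert abs(expected_collisions(2**32) - 0.5) < 1e-12  # ~one collision at 2^32
print(expected_collisions(10**9))  # still well below 1 for a billion positions
```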
Does it matter?
Usually not. A collision means two different positions share a hash. The engine might:
Return the wrong cached evaluation
Prune a branch incorrectly
These are rare and the impact is statistical, not catastrophic. The engine plays slightly worse on rare occasions. Acceptable for a 100x speedup.
For applications where correctness matters absolutely, you’d verify the full position on cache hits—but that adds overhead.
Beyond Chess: Rolling XOR
The same principle applies anywhere you need incremental hash updates.
Sliding window over a byte stream:
```python
from collections import deque

def rolling_xor_hash(window_size):
    h = 0
    buffer = deque()

    def add_byte(b):
        nonlocal h
        h ^= b
        buffer.append(b)
        if len(buffer) > window_size:
            old = buffer.popleft()
            h ^= old  # Remove old byte's contribution
        return h

    return add_byte
```
Each byte added, one byte removed. O(1) regardless of window size.
This is simpler than polynomial rolling hashes (Rabin-Karp), though it has weaker collision properties. For applications where speed matters more than collision resistance—like deduplication with verification—it’s often sufficient.
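The weaker collision properties are easy to demonstrate: because XOR is commutative, the hash is blind to byte order, and repeated bytes cancel each other out entirely. A toy illustration:

```python
def xor_hash(window):
    # Fold the window into a single byte-wise XOR.
    h = 0
    for b in window:
        h ^= b
    return h

# Commutativity makes the hash blind to byte order...
assert xor_hash(b"abc") == xor_hash(b"cba")
# ...and repeated bytes cancel out entirely.
assert xor_hash(b"aa") == 0
```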
Beyond Chess: CRDTs
Conflict-free Replicated Data Types (CRDTs) enable distributed systems to merge state without coordination.
XOR-based sets exploit involutions:
```python
class XORSet:
    def __init__(self):
        self.hash = 0
        self.elements = set()

    def add(self, x):
        if x not in self.elements:
            self.hash ^= hash(x)
            self.elements.add(x)

    def remove(self, x):
        if x in self.elements:
            self.hash ^= hash(x)  # Same operation as add!
            self.elements.discard(x)
Two replicas can independently add and remove elements. Merging is deterministic because XOR is commutative and associative.
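A minimal sketch of why the merge is order-independent, using Python's built-in `hash` as a stand-in digest (an assumption for illustration; a real CRDT would use a proper hash function):

```python
def digest(state_hash, ops):
    # Add and remove are the same XOR; ops is just a list of touched elements.
    for x in ops:
        state_hash ^= hash(x)
    return state_hash

h = digest(0, [1, 2, 3])  # add 1, 2, 3
h = digest(h, [2])        # remove 2: the identical operation
assert h == digest(0, [1, 3])

# Commutativity + associativity: replicas may apply operations in any order.
assert digest(digest(0, [1, 2]), [3]) == digest(digest(0, [3]), [2, 1])
```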
Beyond Chess: The rsync Algorithm
Here’s a different flavor of incremental hashing—one that synchronizes files across a network.
The problem: You have a 1GB file locally. The server has a slightly modified version. Sending the entire file wastes bandwidth. But you don’t know which bytes changed.
rsync’s insight: Use rolling checksums to find matching blocks without transferring them.
The Protocol
Receiver (has old file): Split into blocks, compute weak checksum (Adler-32) + strong checksum (MD5) for each block, send list to sender
Sender (has new file): Compute rolling checksum at every byte offset, match against list
Result: Send only “match block B2” references + the changed bytes
The magic is computing checksums at every byte offset efficiently.
The Rolling Checksum
rsync uses a variant of Adler-32. For a window of bytes a₁, a₂, …, aₙ:

A = (Σᵢ₌₁ⁿ aᵢ) mod M

B = (Σᵢ₌₁ⁿ (n − i + 1) · aᵢ) mod M

When the window slides one byte (drop a₁, add aₙ₊₁):

A′ = (A − a₁ + aₙ₊₁) mod M

B′ = (B − n·a₁ + A′) mod M
O(1) per byte, regardless of window size. No need to re-sum all n bytes.
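A self-contained sketch of the sliding update, with simplifying assumptions (M = 2¹⁶, a tiny window, no block matching), that checks each rolled checksum against from-scratch recomputation:

```python
M = 1 << 16  # toy modulus; real Adler-32 uses 65521

def checksums(window):
    # From-scratch: A sums bytes, B weights early bytes more heavily.
    A = sum(window) % M
    B = sum((len(window) - i) * b for i, b in enumerate(window)) % M
    return A, B

def roll(A, B, old, new, n):
    # Slide the window one byte: O(1), no re-summation.
    A2 = (A - old + new) % M
    B2 = (B - n * old + A2) % M
    return A2, B2

data = b"the quick brown fox"
n = 4
A, B = checksums(data[:n])
for i in range(1, len(data) - n + 1):
    A, B = roll(A, B, data[i - 1], data[i + n - 1], n)
    assert (A, B) == checksums(data[i:i + n])  # matches from-scratch
```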
Involutions vs. Rolling Updates
Zobrist and rsync both achieve O(1) updates, but through different mechanisms:
| Technique | Mathematical Basis | Update Operation |
|---|---|---|
| Zobrist | XOR involution: a ⊕ a = 0 | Toggle element membership |
| rsync | Additive sliding: A′ = A − a_old + a_new | Slide window by one position |
Zobrist uses XOR’s self-inverse property—add and remove are the same operation.
rsync uses modular arithmetic’s linearity—you can subtract the old contribution and add the new one separately.
Both exploit structure to avoid recomputation. The algebra differs, but the principle is identical: incremental updates beat full recalculation.
The Counter-Example: Cryptographic Hashes
SHA-256 is deliberately not invertible.
```python
h1 = sha256(data)
# There is no sha256_remove(h1, partial_data)
```
If you change one byte, you must rehash everything. That’s the point—cryptographic hashes are designed so you can’t deduce anything about the input from the output.
| Hash Type | Incremental? | Collision-resistant? | Use Case |
|---|---|---|---|
| Zobrist (XOR) | ✓ O(1) | Weak | Game trees, caches |
| Rolling polynomial | ✓ O(1) | Moderate | String matching |
| SHA-256 | ✗ O(n) | Strong | Security, integrity |
| Merkle tree | ✓ O(log n) | Strong | Blockchain, git |
Merkle trees are an interesting middle ground: O(log n) updates with cryptographic security. You sacrifice the O(1) of XOR but gain collision resistance.
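A minimal four-leaf sketch of an incremental Merkle update (a toy tree, not a production implementation): changing one leaf requires recomputing only the hashes on its path to the root, two here instead of all three internal nodes.

```python
import hashlib

def H(x):
    return hashlib.sha256(x).digest()

leaves = [H(bytes([i])) for i in range(4)]

def root(ls):
    # Full rebuild: hash pairs level by level until one node remains.
    level = ls
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

r1 = root(leaves)
leaves[2] = H(b"changed")

# Incremental: only leaf 2's parent and the root change.
left_parent = H(leaves[0] + leaves[1])   # unchanged, could be cached
right_parent = H(leaves[2] + leaves[3])  # recomputed
r2 = H(left_parent + right_parent)
assert r2 == root(leaves)  # matches a full rebuild
```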
Designing for Involutions
When you need incremental hash updates, ask:
1. What’s changing?
Single element: Zobrist-style XOR
Sliding window: Rolling hash
Tree structure: Merkle tree
2. How bad are collisions?
Recoverable (cache miss is fine): XOR
Needs verification: XOR + full comparison
Catastrophic (security): Merkle or rehash
3. What operations do you need?
Add only: Many options
Add and remove: Need involution
Order matters: Can’t use commutative XOR
The involution property isn’t just an optimization. It determines what operations are even possible in O(1).
The Takeaway
Zobrist hashing achieves O(1) updates because XOR is an involution—its own inverse.
Adding and removing are the same operation. Move a piece? XOR out the old square, XOR in the new square. The hash updates without knowing anything about the other 62 squares.
This single property, f(f(x)) = x, enables:
Chess engines to cache millions of positions
Distributed systems to merge state without coordination
Streaming algorithms to maintain hashes over sliding windows
The involution came first. The algorithms followed.
When you need incremental updates, don’t reach for clever data structures.
Reach for the algebra.
See also: Associativity: The One Property That Makes FlashAttention Possible
Full article with interactive demo: ttsugriy.github.io/perf-bits/involutions.html
Further Reading
Zobrist, “A New Hashing Method with Application for Game Playing” (1970) — The original paper
Chessprogramming Wiki: Zobrist Hashing — Comprehensive reference
Tridgell & Mackerras, “The rsync Algorithm” (1996) — Efficient file synchronization
Shapiro et al., “Conflict-free Replicated Data Types” (2011) — CRDTs and commutative operations
