When Secure Handshakes Collide: Key Rotation Conflicts in Modern Protocols


Secure communication protocols—like Transport Layer Security (TLS), QUIC, Secure Shell (SSH), and IP Security (IPsec)—all rely on an initial handshake phase to establish encryption keys and authenticate peers. But what happens when you try to rotate keys in the middle of an active session?

You can end up with:

  • Desynchronized states (one side on new keys, the other side on old keys)
  • Security risks (downgrade or injection attacks)
  • Performance hiccups (extra latency or connection resets)

In this post, we’ll explore why these conflicts happen, how widely used protocols mitigate them, and a “theory-into-practice” look at per-message key ratcheting with the Noise protocol. We’ll also touch on whether protocols can push out expiration times to avoid mid-flow rekeys, and why that’s usually discouraged.

Random aside: If you’re into leadership or team management, you might enjoy this review of Managing Humans by Michael Lopp. Because well-run teams = fewer handshake fiascos in production.

Why Do Handshake/Session Rotation Conflicts Occur?

A secure handshake typically negotiates cryptographic parameters (ciphers, keys, etc.). Once established, data flows under those session keys. Problems arise if you try to re-run a handshake or switch keys mid-session:

  1. Concurrent Data & Control: The handshake messages interleave with application data. If the protocol logic isn’t careful, you get confusion about which key is in use.
  2. Renegotiation vs. New Session: Older designs (TLS 1.2 and earlier) allowed a “renegotiation” inside an existing channel, which later turned out to be exploitable (see CVE-2009-3555).
  3. Timing Collisions: In protocols like SSH or IPsec, both sides might attempt to rekey simultaneously, causing a race condition.
  4. State Machine Complexity: The more complicated the handshake, the more corner cases can crop up—like partial upgrades or leftover states from old keys.
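
One common way to tame the first problem is to tag every record with the key generation (epoch) it was protected under, so the receiver never has to guess which key applies. DTLS records carry an epoch field and QUIC packets carry a key phase bit for this reason. The sketch below is illustrative only: `KeyRing`, `install`, and `key_for` are hypothetical names, and the choice to retain exactly one previous epoch is an assumption, not a rule from any specification.

```python
class KeyRing:
    """Tracks receive keys by epoch so records sealed under the previous
    key still decrypt briefly after a rotation (hypothetical helper)."""

    def __init__(self, initial_key: bytes):
        self.current_epoch = 0
        self._keys = {0: initial_key}

    def install(self, new_key: bytes) -> int:
        """Move to the next epoch, keeping only the immediately previous key."""
        self.current_epoch += 1
        self._keys[self.current_epoch] = new_key
        # Drop everything older than the previous epoch.
        for epoch in list(self._keys):
            if epoch < self.current_epoch - 1:
                del self._keys[epoch]
        return self.current_epoch

    def key_for(self, epoch: int) -> bytes:
        """Return the key for a record's epoch, or fail loudly."""
        try:
            return self._keys[epoch]
        except KeyError:
            raise ValueError(f"no key for epoch {epoch}") from None
```

Keeping the old key for one epoch lets in-flight data land safely; dropping it after that keeps the window for confusion small.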

The TLS 1.2 Renegotiation Vulnerability (CVE-2009-3555)

TLS 1.2 and earlier versions (including SSL 3.0) allowed a new handshake mid-connection. This was neat for features like requesting client certificates after some initial data, but it opened the door to session-splicing attacks. In the CVE-2009-3555 exploit, an attacker could prepend attacker-chosen data to a legitimate client’s session during renegotiation, because the new handshake wasn’t cryptographically tied to the original one.

The fix came in RFC 5746, which forced “secure renegotiation”: the new handshake carries a renegotiation_info extension containing the Finished verify_data from the previous handshake, which binds it to the old session. Even with the fix, renegotiation remained complicated, prompting TLS 1.3 to remove it entirely.
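
To make the binding concrete, here is a minimal sketch of the comparison each side performs under RFC 5746. On a renegotiation, the client's renegotiation_info must equal the client verify_data saved from the previous handshake, and the server's must equal the saved client and server verify_data concatenated. The function names are hypothetical, and extension parsing and the rest of the handshake are omitted.

```python
import hmac


def client_binding_ok(received: bytes, prev_client_verify_data: bytes) -> bool:
    """Server-side check of the ClientHello renegotiation_info payload.

    On an initial handshake both values are empty; on a renegotiation
    they must match the verify_data saved from the previous handshake.
    """
    return hmac.compare_digest(received, prev_client_verify_data)


def server_binding_ok(
    received: bytes,
    prev_client_verify_data: bytes,
    prev_server_verify_data: bytes,
) -> bool:
    """Client-side check of the ServerHello renegotiation_info payload."""
    expected = prev_client_verify_data + prev_server_verify_data
    return hmac.compare_digest(received, expected)
```

An attacker who splices their own handshake in front of the victim's cannot produce matching values, so the mismatch aborts the connection.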

TLS 1.3 Key Update Flow

To illustrate how TLS 1.3 avoids full renegotiation, here’s a Mermaid diagram for the KeyUpdate process, which rotates symmetric keys without a second handshake.

sequenceDiagram
    autonumber

    participant Client
    participant Server

    rect rgba(255, 116, 230, 0.3)
    Note over Client,Server: Initial TLS 1.3 handshake is complete, keys are established.
    Client->>Server: (Encrypted Data) using old key
    Server->>Client: (Encrypted Data) using old key
    end

    Note over Client: Client decides to rotate keys (e.g., after X MB or time)
    Client->>Server: KeyUpdate message (still encrypted w/ old key)

    rect rgb(173, 100, 47)
    Note over Client,Server: Client immediately switches to its new sending key.
    Server->>Server: Derive next client traffic secret from the current one
    Server->>Client: KeyUpdate (if the client requested one), then data under the server's new sending key
    end

    Note over Client,Server: Each direction now uses its new key, old keys are discarded.

Quick Observations

  • No Full Renegotiation: Just a lightweight control message.
  • Bound to the Current Session: The KeyUpdate is encrypted under existing keys, so it can’t be forged or inserted by an outsider.
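  • Limited Forward Secrecy: Each new traffic secret is derived from the previous one with a one-way function, so old keys can’t be recovered from new ones, but an attacker who steals the current secret can compute every later one. If you need a new Diffie-Hellman for each rekey, that’s not built-in by default. (Some proposals exist for “post-handshake PFS” in TLS 1.3.)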
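
For the curious, the derivation behind KeyUpdate is small enough to sketch with the standard library. RFC 8446 defines the next secret as HKDF-Expand-Label(current_secret, "traffic upd", "", Hash.length). The code below assumes SHA-256 as the cipher suite hash, and the helper names are my own. It stops at the traffic secret and does not derive the record key and IV that TLS then expands from it.

```python
import hashlib
import hmac
import struct


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with HMAC-SHA256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label from RFC 8446: builds the HkdfLabel structure."""
    full_label = b"tls13 " + label
    hkdf_label = (
        struct.pack("!H", length)
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, hkdf_label, length)


def next_traffic_secret(current_secret: bytes) -> bytes:
    """application_traffic_secret_N+1 derived from application_traffic_secret_N."""
    return hkdf_expand_label(current_secret, b"traffic upd", b"", len(current_secret))
```

Because the only input is the current secret, both peers can advance independently without exchanging any new key material, which is exactly why no fresh Diffie-Hellman is involved.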

Idle vs. Active Key Rotation: Can We Just Push Out Expiration?

You might wonder: what if we only rekey when the session is idle, thereby avoiding mid-flow disruptions? Some custom setups or application layers do exactly that. But at the cryptographic-protocol level (TLS, SSH, IPsec, etc.):

  • Hard Key Lifetimes are the norm. Protocols typically enforce a maximum time or data limit for a key’s usage (the cryptoperiod) to bound how much traffic any single key protects. Letting a single key run too long, even on a session that is still busy, can be risky if an attacker is collecting encrypted data hoping to break it.
  • Bidirectional Requirements: In a 2-way session, both sides must agree when to rekey. If one side tries to push out the lifetime, but the other side hits its timer, you still get a rekey event.
  • Security Over Convenience: Mid-stream rekey might be inconvenient, but reusing the same key for a huge data volume can degrade security.

Hence, in practice, popular protocols do not typically keep punting the session’s expiration. They rotate keys on schedule, whether the session is active or idle.
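
A hard-limit policy of this kind can be sketched in a few lines. Everything here is hypothetical: the `RekeyPolicy` name, its fields, and the default limits are illustrative values rather than numbers taken from any one protocol. The point is that the decision looks only at bytes and elapsed time, never at whether the session happens to be idle.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RekeyPolicy:
    max_bytes: int = 1 << 30       # illustrative data limit (1 GiB)
    max_seconds: float = 3600.0    # illustrative time limit (1 hour)
    bytes_used: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def record(self, nbytes: int) -> None:
        """Account for data protected under the current key."""
        self.bytes_used += nbytes

    def should_rekey(self, now: Optional[float] = None) -> bool:
        """True once either limit is hit, regardless of session activity."""
        now = time.monotonic() if now is None else now
        too_much_data = self.bytes_used >= self.max_bytes
        too_old = (now - self.started_at) >= self.max_seconds
        return too_much_data or too_old

    def reset(self, now: Optional[float] = None) -> None:
        """Call after a successful rekey to start a new cryptoperiod."""
        self.bytes_used = 0
        self.started_at = time.monotonic() if now is None else now
```

The optional `now` argument exists only to make the policy easy to test with fixed timestamps.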

SSH Rekey in Action

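Secure Shell (SSH), standardized in RFC 4253, rekeys after a certain amount of data or time; the RFC recommends changing keys after each gigabyte of transmitted data or each hour of connection time, whichever comes sooner. Below is a Mermaid diagram showing how SSH typically does an in-band key exchange mid-session: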

sequenceDiagram
    participant Client
    participant Server

    Note over Client,Server: SSH session established (keys set, user authenticated).

    loop Data exchange
        Client->>Server: Application data
        Server->>Client: Application data
    end

    Note over Client,Server: Check rekey threshold periodically
    alt Threshold reached
        rect rgba(255, 223, 186, 0.3)
        Client->>Server: SSH_MSG_KEXINIT (propose new key exchange)
        Server->>Client: SSH_MSG_KEXINIT (agree on new KEX algorithms)
        Client->>Server: SSH_MSG_KEXDH_INIT (client ephemeral public key)
        Server->>Client: SSH_MSG_KEXDH_REPLY (server ephemeral public key, host key, signature)
        Note over Client,Server: Both derive fresh keys
        Client->>Server: SSH_MSG_NEWKEYS
        Server->>Client: SSH_MSG_NEWKEYS
        Note over Client,Server: Switch to new keys. Continue data flow.
        end
    else Continue without rekeying
        Note over Client,Server: No rekeying needed yet
    end

Why This Matters

  • Continuous Security: An attacker who breaks one key can’t decrypt the entire session. Once a rekey happens, the old key is worthless going forward.
  • Limited Data per Key: Prevents large-scale cryptanalysis on a single key.
  • Implementation Quirks: Some older SSH servers or devices might hang on rekey. Modern implementations (like OpenSSH) handle it seamlessly.
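
The “both derive fresh keys” step in the diagram follows a simple recipe from RFC 4253: each key is HASH(K || H || letter || session_id), where K is the new shared secret encoded as an mpint, H is the new exchange hash, and the letter ("A" through "F") selects which IV or key is being derived. The session_id is the exchange hash from the very first key exchange and does not change on rekey. The sketch below assumes SHA-256 as the exchange hash, uses function names of my own, and leaves out the extension step the RFC defines for keys longer than one hash output.

```python
import hashlib
import struct


def mpint(n: int) -> bytes:
    """Encode a non-negative integer as an SSH mpint (RFC 4251)."""
    if n == 0:
        return struct.pack(">I", 0)
    # The extra bit guarantees a leading zero byte when the top bit is set.
    raw = n.to_bytes((n.bit_length() + 8) // 8, "big")
    return struct.pack(">I", len(raw)) + raw


def derive_ssh_key(K: int, H: bytes, letter: bytes, session_id: bytes) -> bytes:
    """First hash block of an SSH key: HASH(K || H || letter || session_id)."""
    return hashlib.sha256(mpint(K) + H + letter + session_id).digest()
```

For example, `derive_ssh_key(K, H, b"C", session_id)` yields the client-to-server encryption key and `b"D"` the server-to-client one. After a rekey, K and H are new while session_id stays fixed, so every derived key changes.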

IPsec (Internet Protocol Security) Briefly

IPsec also rekeys, but at the IP layer. Through IKEv2 (Internet Key Exchange version 2), gateways negotiate fresh Security Associations (SAs) overlapping the old ones, so traffic rarely sees a glitch. If both sides rekey simultaneously, you can get collisions or duplicate SAs—hence the recommendation to stagger timers.
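
Two small helpers illustrate how collisions are handled in practice. The first adds random jitter to a configured lifetime so that two peers with identical settings rarely fire at the same instant. The second applies the tie-break from RFC 7296 for simultaneous rekeys: the SA created by the exchange containing the lowest of the four nonces, compared octet by octet, is the redundant one to close. Both function names and the 10% default jitter are assumptions for illustration.

```python
import random
from typing import List, Tuple


def jittered_lifetime(base_seconds: float, jitter_fraction: float = 0.1) -> float:
    """Shorten the configured lifetime by a random amount of up to jitter_fraction."""
    return base_seconds * (1.0 - random.uniform(0.0, jitter_fraction))


def redundant_exchange(exchanges: List[Tuple[bytes, bytes]]) -> int:
    """Given (initiator_nonce, responder_nonce) for each colliding rekey
    exchange, return the index of the exchange whose SA should be closed.

    Python compares bytes lexicographically, octet by octet, and treats a
    shorter prefix as lower, which matches the comparison RFC 7296 describes.
    """
    return min(range(len(exchanges)), key=lambda i: min(exchanges[i]))
```

Because both peers see the same four nonces, they reach the same verdict without any extra round trip.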

Theory into Practice: Per-Message Ratcheting with Noise Protocol

Beyond standard rekeys, some protocols rotate keys far more often, even on a per-message basis. The Noise Protocol Framework is a popular toolkit for building secure channels and underpins systems like WireGuard. Signal’s Double Ratchet is a separate design, not part of Noise, that takes the same idea furthest by deriving a new key for every message.

Why Per-Message Ratcheting?

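  • Continuous Forward Secrecy: If an attacker compromises the chain key at time T, messages sent before T stay protected, because the ratchet is one-way and earlier keys can’t be recomputed. Protecting messages sent after T additionally requires mixing in fresh Diffie-Hellman output.
  • Minimal Attack Surface: A “ratchet” means each message (or each small batch) uses a fresh key derived from the current chain key. If one message key is compromised, only that small chunk is at risk.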

Pseudocode Example

Below is simplified pseudocode for how you might implement a per-message key ratchet in a Noise-based protocol. We’ll assume an existing one-way “send” chain for the sender, and a matching “receive” chain for the receiver.

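import hashlib
import hmac
import os

# Third-party dependency (pip install cryptography), used here for the AEAD.
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

# We have two primary states for each party:
#   1) send_chain_key
#   2) receive_chain_key
#
# Each message advances the ratchet, deriving a new one-time message key.

NONCE_LEN = 12  # ChaCha20-Poly1305 nonce size in bytes


def HMAC(key, label):
    """HMAC-SHA256 used as a simple KDF; returns 32 bytes."""
    return hmac.new(key, label, hashlib.sha256).digest()


def ratchet_send(chain_key, plaintext_message):
    """
    1) Derive ephemeral key from chain key
    2) Encrypt message
    3) Return ciphertext + new chain key
    """
    # Derive ephemeral (one-time message) key
    ephemeral_key = HMAC(chain_key, b"derive_ephemeral")
    # Derive next chain key
    next_chain_key = HMAC(chain_key, b"ratchet_step")

    # Encrypt the plaintext; the nonce travels in front of the ciphertext
    nonce = os.urandom(NONCE_LEN)
    sealed = ChaCha20Poly1305(ephemeral_key).encrypt(nonce, plaintext_message, None)
    ciphertext = nonce + sealed

    return ciphertext, next_chain_key


def ratchet_receive(chain_key, ciphertext):
    """
    1) Derive ephemeral key from chain key
    2) Decrypt message
    3) Return plaintext + new chain key
    """
    # Derive ephemeral (one-time message) key
    ephemeral_key = HMAC(chain_key, b"derive_ephemeral")
    # Derive next chain key
    next_chain_key = HMAC(chain_key, b"ratchet_step")

    # Decrypt the ciphertext; raises InvalidTag if it was tampered with.
    # In a real protocol, you'd also handle reordering, lost messages, etc.
    nonce, sealed = ciphertext[:NONCE_LEN], ciphertext[NONCE_LEN:]
    plaintext = ChaCha20Poly1305(ephemeral_key).decrypt(nonce, sealed, None)

    return plaintext, next_chain_key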

Key Steps:

  1. Ephemeral Key: Use a hash-based key derivation to get a one-time encryption key for each message.
  2. Next Chain Key: After each message, the chain key is replaced (think of it like iterating a hash function).
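  3. Forward Secrecy: If an attacker obtains the chain key after message N, they still can’t recover the keys for messages 1 through N, because each step is a one-way function. They can derive the keys for later messages, which is why real designs also mix in fresh Diffie-Hellman output.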

Note: Real protocols also mix in ephemeral Diffie-Hellman. Noise does so during each handshake, and the Double Ratchet does so repeatedly as the conversation proceeds. This snippet is a simplified approach to show the ratchet concept. Actual Noise-based protocols handle additional details (like handshake patterns, transport messages, reordering, etc.).
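
Reordering is the detail that bites first, so here is one hedged way to handle it: when a message arrives ahead of sequence, the receiver advances the chain and caches the keys it skipped so that late messages can still be decrypted. The `ReceiveChain` class, the `MAX_SKIP` bound, and the assumption that each message carries its index are all illustrative. The labels match the pseudocode above.

```python
import hashlib
import hmac


def kdf(key: bytes, label: bytes) -> bytes:
    """HMAC-SHA256 as a one-way derivation step."""
    return hmac.new(key, label, hashlib.sha256).digest()


class ReceiveChain:
    MAX_SKIP = 32  # illustrative cap on how far ahead a message may jump

    def __init__(self, chain_key: bytes):
        self.chain_key = chain_key
        self.next_index = 0
        self.skipped = {}  # message index -> cached message key

    def _step(self) -> bytes:
        """Derive the message key for next_index and advance the chain."""
        key = kdf(self.chain_key, b"derive_ephemeral")
        self.chain_key = kdf(self.chain_key, b"ratchet_step")
        self.next_index += 1
        return key

    def message_key(self, index: int) -> bytes:
        """Return the one-time key for the message with this index."""
        if index < self.next_index:
            # Late arrival: usable only if its key was cached and not yet used.
            if index not in self.skipped:
                raise ValueError("message key already used or discarded")
            return self.skipped.pop(index)
        if index - self.next_index > self.MAX_SKIP:
            raise ValueError("too many skipped messages")
        while self.next_index < index:
            skipped_index = self.next_index
            self.skipped[skipped_index] = self._step()
        return self._step()
```

Cached keys are a deliberate trade-off: each one is a secret that outlives its chain step, so real designs bound the cache and delete entries promptly.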

Wrapping Up

Handshake or session rotation conflicts are a natural challenge in any secure protocol that needs to refresh keys mid-connection. We’ve seen:

  • TLS 1.2 encountered serious renegotiation flaws (like CVE-2009-3555).
  • TLS 1.3 simplified matters by removing renegotiation and using KeyUpdate.
  • SSH and IPsec do mid-stream rekeys on a schedule, staggering to avoid collisions.
  • Idle-based deferral of rekeys is rare because cryptoperiods keep you safer by not overusing a single key.
  • Cutting-edge approaches (Noise, Double Ratchet) push rekey frequency to the extreme, sometimes on a per-message basis, for even stronger forward secrecy.

Happy hacking—and may your session rotations always be conflict-free!