The Diffie-Hellman key exchange

Try a demo of the protocol

The math and definitions explained

The exponent of a number indicates how many times the number is multiplied by itself. For example:

\( 4^{3} = 4 \cdot 4 \cdot 4 = 64 \)

Here, 3 is the exponent (or power), and 4 is the base. In words, we say "4 to the power of 3."

Here are the laws of exponents:

Law Example
\( x^{1} = x \) \( 3^{1} = 3 \)
\( x^{0} = 1 \) \( 4^{0} = 1 \)
\( x^{-1} = \frac{1}{x} \) \( 5^{-1} = \frac{1}{5} \)
\( x^{m} \cdot x^{n} = x^{m+n} \) \( x^{2} \cdot x^{3} = (x \cdot x) \cdot (x \cdot x \cdot x) = x \cdot x \cdot x \cdot x \cdot x = x^{5} = x^{2+3} \)
\( \frac{x^{m}}{x^{n}} = x^{m-n} \) \( \frac{x^{4}}{x^{2}} = \frac{x \cdot x \cdot x \cdot x}{x \cdot x} = x \cdot x \cdot \frac{x \cdot x}{x \cdot x} = x \cdot x \cdot 1 = x^{2} = x^{4-2} \)
\( (x^{m})^{n} = x^{m \cdot n} \) \( (x^{2})^{3} = (x \cdot x) \cdot (x \cdot x) \cdot (x \cdot x) = x \cdot x \cdot x \cdot x \cdot x \cdot x = x^{6} = x^{2 \cdot 3} \)
\( (x \cdot y)^{n} = x^{n} \cdot y^{n} \) \( (x \cdot y)^{2} = (x \cdot y) \cdot (x \cdot y) = x \cdot x \cdot y \cdot y = x^{2} \cdot y^{2} \)
\( (\frac{x}{y})^{n} = \frac{x^{n}}{y^{n}} \) \( (\frac{x}{y})^{3} = (\frac{x}{y}) \cdot (\frac{x}{y}) \cdot (\frac{x}{y}) = \frac{x \cdot x \cdot x}{y \cdot y \cdot y} = \frac{x^{3}}{y^{3}} \)
\( x^{-n} = \frac{1}{x^{n}} \) \( x^{-3} = (x^{-1})^{3} = (\frac{1}{x})^{3} = \frac{1}{x} \cdot \frac{1}{x} \cdot \frac{1}{x} = \frac{1 \cdot 1 \cdot 1}{x \cdot x \cdot x} = \frac{1}{x^{3}} \)
\( x^{\frac{1}{n}} = \sqrt[n]{x} \) \( 9^{\frac{1}{2}} = \sqrt{9} = 3 \)
\( x^{\frac{m}{n}} = \sqrt[n]{x^m} = (\sqrt[n]{x})^m \) \( 8^{\frac{2}{3}} = \sqrt[3]{8^2} = \sqrt[3]{64} = 4 \)
Exponentiation in Modular Arithmetic

In cryptography, we often compute exponents in modular arithmetic:

\( a^b \mod n \) means raising \(a\) to power \(b\) and then taking the remainder when divided by \(n\)

Example: \( 3^4 \mod 7 = 81 \mod 7 = 4 \)

The same exponent rules apply in modular arithmetic:

\( (a^m \cdot a^n) \mod n = a^{m+n} \mod n \)

\( (a^m)^n \mod p = a^{m \cdot n} \mod p \)

Importance in Cryptography

These exponent rules are fundamental to many cryptographic algorithms:

  • RSA: Uses the property that \((m^e)^d \mod n = m^{e \cdot d} \mod n\)
  • Diffie-Hellman: Relies on \((g^a)^b \mod p = g^{ab} \mod p = (g^b)^a \mod p\)
  • Fast exponentiation: Used for efficient computation of large powers in cryptographic operations
Computing Large Exponents Efficiently

For cryptographic applications, we need to efficiently compute large exponents. The square-and-multiply algorithm lets us calculate \(x^n\) with only \(O(\log n)\) multiplications:

  1. Convert the exponent to binary representation
  2. Start with result = 1
  3. For each bit in the exponent (from left to right):
    • Square the result
    • If the current bit is 1, multiply by the base

Example: To compute \(3^{13}\), we use binary 13 = 1101:
result = 1
bit 1: result = \( 1^{2} \times 3 = 3 \)
bit 1: result = \( 3^{2} \times 3 = 27 \)
bit 0: result = \( 27^{2} = 729 \)
bit 1: result = \( 729^{2} \times 3 = 1,594,323 \)

You already use modulo computation when you look at a clock and, for example, need to figure out what time it will be 3 hours after 11 o'clock, which is 2 o'clock. In mathematics, we write this as:

\( (11 + 3) \: mod \: 12 = 2 \)

Here, 12 is the modulus because we want the time as an integer between 0 and 11 (in this case, 12 o'clock is denoted by 0). In words, we say "11 plus 3 modulo 12 equals 2." The result of a modulo computation is an integer between 0 and the modulus minus 1. For example, with modulus 3, we have:

  • \( 1 \: mod \: 3 = 1 \)
  • \( 2 \: mod \: 3 = 2 \)
  • \( 3 \: mod \: 3 = 0 \)
  • \( 4 \: mod \: 3 = 1 \)
  • \( 5 \: mod \: 3 = 2 \)
  • \( 6 \: mod \: 3 = 0 \)
  • etc.

For example, if we look at \( 27 \: mod \: 5 \), modulo computes how many times 5 divides into 27 and then returns the remainder, which is 2 in this case. That is, \( 27 \: mod \: 5 = 2 \). But how do we get this result?

First, we compute how many times we can multiply 5 by an integer \( x \) so that the result is as close as possible to 27 without exceeding it. In other words, we find the maximum value of \( x \) such that \( 5 \cdot x \leq 27 \). In this case, \( x = 5 \) because \( 5 \cdot 5 = 25 \leq 27 \). Then, by subtracting 25 from 27, we get the answer: \( 27 - 25 = 2 \).

If the integer is negative, for example \( -27 \: mod \: 5 \), we have to do it slightly differently, and the answer is \( -27 \: mod \: 5 = 3 \). In this case, the integer \( x \) is negative and should be the closest integer such that \( 5 \cdot x \) does not exceed -27. That is, we find the minimum value of \( -x \) such that \( 5 \cdot -x \geq -27 \). Here, \( -x = -6 \) because \( 5 \cdot -6 = -30 \geq -27 \). Then, by subtracting -30 from -27, we get the answer: \( -27 - (-30) = -27 + 30 = 3 \).

It is important that \( x \) or \( -x \) is an integer such as \( -14, 3, 17 \), etc., and NOT a fraction or decimal such as \( \frac{1}{4}, \frac{-3}{7}, 2.5, 5.1 \), etc.

If two integers \( a \) and \( b \) modulo the same modulus \( c \) return the same remainder \( r \), then we say that \( a \) and \( b \) are congruent modulo \( c \). That is, if \( a \: mod \: c = r \) and \( b \: mod \: c = r \), then \( a \equiv b \: (mod \: c) \). Also, notice that if the modulus \( c \) is greater than the integer \( a \), i.e., \( c > a \), the result will always be \( a \: mod \: c = a \).

Properties of Modular Arithmetic

Modular arithmetic has several important properties that make it useful in cryptography:

  • Addition: \((a + b) \: \text{mod} \: n = ((a \: \text{mod} \: n) + (b \: \text{mod} \: n)) \: \text{mod} \: n\)
  • Subtraction: \((a - b) \: \text{mod} \: n = ((a \: \text{mod} \: n) - (b \: \text{mod} \: n)) \: \text{mod} \: n\)
  • Multiplication: \((a \cdot b) \: \text{mod} \: n = ((a \: \text{mod} \: n) \cdot (b \: \text{mod} \: n)) \: \text{mod} \: n\)
  • Exponentiation: \(a^b \: \text{mod} \: n = (a \: \text{mod} \: n)^b \: \text{mod} \: n\)

These properties allow us to perform calculations on smaller values rather than potentially very large ones, which is essential in cryptographic algorithms.

Modular Inverse

A modular inverse of an integer \(a\) with respect to modulus \(n\) is an integer \(a^{-1}\) such that:

\(a \cdot a^{-1} \equiv 1 \: (\text{mod} \: n)\)

The modular inverse exists if and only if \(a\) and \(n\) are coprime (i.e., their greatest common divisor is 1). Finding modular inverses is crucial in cryptography, particularly in algorithms like RSA where we need to find \(d\) such that \(e \cdot d \equiv 1 \: (\text{mod} \: \phi(n))\).

Important Theorems in Modular Arithmetic

Fermat's Little Theorem: If \(p\) is a prime number and \(a\) is not divisible by \(p\), then \(a^{p-1} \equiv 1 \: (\text{mod} \: p)\).

Euler's Theorem: If \(a\) and \(n\) are coprime, then \(a^{\phi(n)} \equiv 1 \: (\text{mod} \: n)\), where \(\phi(n)\) is Euler's totient function.

These theorems are fundamental to understanding and implementing many cryptographic algorithms, especially RSA.

Applications in Cryptography

Modular arithmetic is fundamental to many cryptographic algorithms:

  • RSA: Relies on modular exponentiation for encryption and decryption
  • Diffie-Hellman: Uses modular exponentiation for key exchange
  • ElGamal: Employs modular arithmetic for public-key encryption
  • Digital Signatures: Many signature schemes use modular operations
  • Hash Functions: Often incorporate modular operations in their compression functions

A prime number is an integer greater than 1 that can only be divided evenly by 1 and itself. "Divided evenly" means the result is an integer, not a float. For example, if you divide 13 by 3, you get the float \( \frac{13}{3} = 4.333 \). We see that 13 is a prime number because it can only be divided evenly by 1 and itself: \( \frac{13}{1} = 13 \) and \( \frac{13}{13} = 1 \).

If an integer is not a prime number, it is called a composite number. For example, the integer 6 is a composite number because it can be divided evenly by 1, 2, 3, and 6:

\( \frac{6}{1} = 6 \), \( \frac{6}{2} = 3 \), \( \frac{6}{3} = 2 \), and \( \frac{6}{6} = 1 \)

The term "composite" means "something made by combining things." So, a composite number is an integer made by multiplying prime numbers:

  • 2 is a prime
  • 3 is a prime
  • 4 is composite: \( 2 \cdot 2 = 4 \)
  • 5 is a prime
  • 6 is composite: \( 2 \cdot 3 = 6 \)
  • 7 is a prime
  • 8 is composite: \( 2 \cdot 2 \cdot 2 = 8 \)
  • 9 is composite: \( 3 \cdot 3 = 9 \)
  • etc.

Therefore, every integer greater than 1 is either a prime number or a product of prime numbers (a composite number).

The famous Greek mathematician Euclid proved that there are infinitely many prime numbers.

Michael Rabin and Gary Miller developed an algorithm that determines whether an integer is a prime or a composite number by testing the integer with multiple bases, denoted by \( a \). The algorithm is called the Rabin-Miller primality test.

The Fundamental Theorem of Arithmetic

Every integer greater than 1 can be expressed as a unique product of prime numbers (up to the order of the factors). This fundamental property makes prime numbers the "building blocks" of all natural numbers and is crucial for many cryptographic applications.

Importance in Cryptography

Prime numbers form the foundation of many cryptographic systems:

  • RSA encryption relies on the difficulty of factoring the product of two very large primes
  • Diffie-Hellman key exchange uses prime numbers to create secure shared secrets
  • Elliptic curve cryptography often works over finite fields of prime order
  • Digital signature algorithms like DSA and ECDSA use primes in their mathematical operations
Primality Testing

Determining whether a number is prime is crucial in cryptography. Several methods exist:

  • Trial division: Dividing by all integers up to the square root (inefficient for large numbers)
  • Fermat primality test: Based on Fermat's Little Theorem, but can be fooled by Carmichael numbers
  • Miller-Rabin test: A probabilistic algorithm that can determine primality with high confidence
  • AKS primality test: The first deterministic polynomial-time algorithm for primality testing

The Miller-Rabin test is particularly useful because it's efficient and can achieve any desired level of certainty by increasing the number of test rounds.

Special Types of Primes in Cryptography
  • Safe primes: Primes p where (p-1)/2 is also prime. Used in Diffie-Hellman key exchange.
  • Strong primes: Primes with certain properties that make them resistant to specific factoring algorithms, often used in RSA.
  • Sophie Germain primes: Primes p where 2p+1 is also prime, useful for cryptographic applications.
  • Mersenne primes: Primes of the form 2n-1, which allow for efficient arithmetic operations.
Definition

The greatest common divisor (GCD) of two or more integers is the largest positive integer that divides each of the given integers without a remainder. It is denoted as \(\gcd(a, b)\) or sometimes as \((a, b)\).

Formally, for two integers \(a\) and \(b\), not both zero, \(\gcd(a,b)\) is the largest positive integer \(d\) such that \(d|a\) and \(d|b\) (where \(d|a\) means "\(d\) divides \(a\)" or "\(a\) is divisible by \(d\)").

Before we describe what the greatest common divisor of two integers is, we first define what we mean by a divisor. In this context, a divisor of an integer \( x \) is any integer that divides \( x \) evenly, meaning the result is not a fraction. For example, if you divide \( x=12 \) by 5, you get the fraction \( \frac{12}{5} = 2.4 \), so 5 is not a divisor of \( x=12 \). For \( x=12 \), the divisors are 1, 2, 3, 4, 6, and 12 because \( \frac{12}{1} = 12 \), \( \frac{12}{2} = 6 \), \( \frac{12}{3} = 4 \), \( \frac{12}{4} = 3 \), \( \frac{12}{6} = 2 \), and \( \frac{12}{12} = 1 \).

Similarly, the divisors of 16 are 1, 2, 4, 8, and 16 because \( \frac{16}{1} = 16 \), \( \frac{16}{2} = 8 \), \( \frac{16}{4} = 4 \), \( \frac{16}{8} = 2 \), and \( \frac{16}{16}=1 \).

The greatest common divisor of 12 and 16 is therefore 4, because it is the largest integer among their common divisors. In mathematics, we write this as \( \gcd(12, 16) = 4 \).

Two integers whose greatest common divisor is 1 are called relatively prime numbers or co-primes. For example, 15 and 28 are relatively prime because \( \gcd(15, 28) = 1 \) (note that 28 is not a prime number).

If one of the two integers is a prime number, the greatest common divisor will always be 1, i.e., \( \gcd(a, p) = 1 \), where \( a \) is any integer (either prime or composite) and \( p \) is a prime number.

Key Properties
  • \(\gcd(a,b) = \gcd(b,a)\) (Commutativity)
  • \(\gcd(a,b) = \gcd(a-b,b)\) when \(a \geq b\) (Subtraction property)
  • \(\gcd(a,b) = \gcd(a \bmod b, b)\) when \(b \neq 0\) (Basis of the Euclidean algorithm)
  • \(\gcd(a,0) = |a|\) for any \(a\) (Base case)
  • If \(p\) is prime and \(p\) doesn't divide \(a\), then \(\gcd(a,p) = 1\)

One method to compute the greatest common divisor of two integers is by using the Euclidean algorithm, developed by the famous Greek mathematician Euclid. See "The extended Euclidean algorithm" for more information about how to compute the greatest common divisor of two integers.

The Euclidean Algorithm

The Euclidean algorithm is an efficient method to find the GCD of two integers. It works by repeatedly applying the division algorithm:

  1. Start with two positive integers \(a\) and \(b\), where \(a \geq b\)
  2. Divide \(a\) by \(b\) to get quotient \(q\) and remainder \(r\) such that \(a = bq + r\) where \(0 \leq r < b\)
  3. If \(r = 0\), then \(\gcd(a,b) = b\)
  4. Otherwise, set \(a = b\) and \(b = r\), and repeat from step 2

Example: Find \(\gcd(48, 18)\)

  • \(48 = 18 \cdot 2 + 12\)
  • \(18 = 12 \cdot 1 + 6\)
  • \(12 = 6 \cdot 2 + 0\)

Since the remainder is now 0, \(\gcd(48, 18) = 6\).

Applications in Cryptography

The GCD has several important applications in cryptography:

  • Key Generation: In RSA, we need to find two numbers \(e\) and \(d\) such that \(ed \equiv 1 \pmod{\phi(n)}\), which requires \(\gcd(e, \phi(n)) = 1\)
  • Primality Testing: Many primality tests use GCD calculations
  • Modular Arithmetic: Finding modular inverses (via the extended Euclidean algorithm) requires co-prime numbers
  • Factorization Attacks: Some attacks on cryptosystems use GCD calculations to factor large numbers
Definition and Purpose

The extended Euclidean algorithm is an extension of the Euclidean algorithm, which only returns the greatest common divisor of two integers. Given two integers \( a \) and \( b \), the extended Euclidean algorithm returns the integers \( a \), \( b \), \( \lambda \), and \( \mu \) such that:

\( a \cdot \lambda + b \cdot \mu = \gcd(a, b) \)

Here, \( \lambda \) and \( \mu \) are called the Bézout coefficients for \( a \) and \( b \). Only if \( a \) and \( b \) are relatively prime, i.e., \( \gcd(a, b) = 1 \), then:

\( a \cdot \lambda + b \cdot \mu = 1 \)

In this case, \( \lambda \; mod \; b \) is the inverse of \( a \), denoted \( a^{-1} = \lambda \; mod \; b \), and \( \mu \: mod \: a \) is the inverse of \( b \), denoted \( b^{-1} = \mu \: mod \: a \) (see "Modulo computation" for more information about the \( mod \) operator). One useful property of an integer and its inverse is that \( a \cdot a^{-1} \; mod \; b = 1 \) and \( b \cdot b^{-1} \; mod \; a = 1 \).

Example

You can easily compute \( \gcd(a, b) \), \( \lambda \), and \( \mu \) for example with \( a=5 \) and \( b=39 \) using a simple table. First, let us create a table with three columns (we do not yet know how many rows there will be). Let us denote the entry in the first row and first column as [1,1], the entry in the first row and second column as [1,2], the entry in the second row and first column as [2,1], and so on.

Next, we write \( b=39 \) in entry [1,1] and \( a=5 \) in entry [2,1]. Then we try to find the largest integer \( q_{1} \) such that \( q_{1} \cdot a \leq b \). We have \( q_{1}=7 \), which we write in entry [2,2], because \( 7 \cdot 5 = 35 \leq 39 \), and a remainder of \( r_{1}=4 \), which we write in entry [3,1].

Again, we try to find the largest integer \( q_{2} \) such that \( q_{2} \cdot r_{1} \leq a \). We have \( q_{2}=1 \), which we write in entry [3,2], because \( 1 \cdot 4 = 4 \leq 5 \), and a remainder of \( r_{2}=1 \), which we write in entry [4,1]. Notice that we are repeating the same process as before, just with the numbers in the next row.

The next computation returns a remainder of \( r_{3} = 0 \) because \( q_{3} \cdot r_{2} = 4 \cdot 1 = 4 \leq 4 = r_{1} \). We have now computed \( \gcd(5, 39)=r_{2}=1 \) since \( r_{3} = 0 \). Because 5 and 39 are relatively prime, we know that \( \lambda \) and \( \mu \) exist, and we can start using the last column.

First, we write \( x_{1}=0 \) in entry [4,3] and \( x_{2}=1 \) in entry [3,3]. Then we write \( x_{3}=q_{2} \cdot x_{2} + x_{1} = 1 \cdot 1 + 0 = 1 \) in entry [2,3]. For entry [1,3], we compute as before, just with numbers from the row above, i.e., \( x_{4}=q_{1} \cdot x_{3} + x_{2} = 7 \cdot 1 + 1 = 8 \).

Finally, we have that \( a \cdot x_{4} \pm b \cdot x_{3} = r_{2} \), where we need to decide whether it should be plus or minus between the two terms. Because \( a \cdot x_{4} = 5 \cdot 8 = 40 \), \( b \cdot x_{3} = 39 \cdot 1 \), and \( 40 \geq 39 \), we have \( 5 \cdot 8 - 39 \cdot 1 = 1 \) (which is the same as \( 5 \cdot 8 + 39 \cdot (-1) = 1 \)), so the Bézout coefficients are \( \lambda=8 \) and \( \mu=-1 \). Notice that \( a^{-1} = \lambda \; mod \; b = 8 \; mod \; 39 = 8\) and \( b^{-1} = \mu \; mod \; a = -1 \: mod \: 5 = 4\), where \( a \cdot a^{-1} \; mod \; b = 5 \cdot 8 \; mod \; 39 = 1 \) and \( b \cdot b^{-1} \; mod \; a = 39 \cdot 4 \; mod \; 5 = 1 \).

Table for the example

The table for computing \( 5 \cdot \lambda + 39 \cdot \mu = \gcd(5, 39) \) is:

\( b=39 \) \( x_{4}=8 \)
\( a=5 \) \( q_{1}=7 \) \( x_{3}=1 \)
\( r_{1}=4 \) \( q_{2}=1 \) \( x_{2}=1 \)
\( r_{2}=1 \) \( q_{3}=4 \) \( x_{1}=0 \)
\( r_{3}=0 \)
Algorithm in Pseudocode
function extended_gcd(a, b)
    if b = 0 then
        return (a, 1, 0)
    else
        (d, x', y') = extended_gcd(b, a mod b)
        return (d, y', x' - (a div b) * y')

The function returns a tuple (d, x, y) such that d = gcd(a, b) and d = ax + by.

Applications in Cryptography
  • RSA Key Generation: Computing the private key d where d ≡ e-1 (mod φ(n)) requires finding the modular inverse of the public exponent e
  • Chinese Remainder Theorem: Finding solutions to systems of modular congruences, which enables efficient RSA implementations
  • Elliptic Curve Cryptography: Computing point addition and scalar multiplication operations
  • Digital Signatures: Many schemes require modular inverse calculations
  • Key Exchange Protocols: Used in various secure communication protocols
Relation to the Basic Euclidean Algorithm

While the standard Euclidean algorithm computes only the GCD through successive divisions, the extended version keeps track of the coefficients at each step. This "extension" allows us to express the GCD as a linear combination of the original numbers, which is crucial for finding modular inverses.

Computational Complexity

The extended Euclidean algorithm has the same time complexity as the standard Euclidean algorithm: O(log min(a,b)). This makes it highly efficient even for very large integers, which is crucial for its application in cryptographic systems.

In abstract algebra, a ring is a set with two operations (usually addition and multiplication) where addition forms an abelian group and multiplication is associative and distributes over addition. A group is a set with a single operation that satisfies closure, associativity, identity, and inverse properties.

Integers and Modular Arithmetic

The set of integers \( \{ \dots, -2, -1, 0, 1, 2, \dots \} \) is denoted by the symbol \( \mathbb{Z} \), i.e., \( \mathbb{Z} = \{ \dots, -2, -1, 0, 1, 2, \dots \} \), and is called the ring of integers or an additive group. The group of integers modulo \( n \) is a subset of \( \mathbb{Z} \) and is denoted by \( \mathbb{Z}/n\mathbb{Z} \), but we use the shorthand \( \mathbb{Z}_{n} \). This subset contains the following elements (because we compute modulo \( n \)):

\( \mathbb{Z}_{n} = \{ 0, 1, 2, \dots, n - 1 \}\)

Notice that whenever we perform addition or multiplication in \( \mathbb{Z}_{n} \), we always compute the result modulo \( n \) to obtain an integer in \( \mathbb{Z}_{n} \). To illustrate this, let us look at \( n = 5 \):

\( \mathbb{Z}_{5} = \{ 0, 1, 2, 3, 4 \}\)

When adding \( 3 \) and \( 4 \) in \( \mathbb{Z}_{5} \), we do the following: \( (3 + 4) \: mod \: 5 = 7 \: mod \: 5 = 2 \). Likewise, when multiplying \( 3 \) by \( 4 \) in \( \mathbb{Z}_{5} \), we have: \( (3 \cdot 4) \: mod \: 5 = 12 \: mod \: 5 = 2 \).

Units and Multiplicative Groups

An integer \( a \) in \( \mathbb{Z}_{n} \) has an inverse if the greatest common divisor of \( a \) and \( n \) is 1, i.e., \( \gcd(a, n) = 1 \) (see "The extended Euclidean algorithm" for more information). The integer \( a \) is then called a unit. The set of units (all integers with an inverse in \( \mathbb{Z}_{n} \)) is denoted by \( (\mathbb{Z}/n\mathbb{Z})^{*} \). Again, we use the shorthand \( \mathbb{Z}_{n}^{*} \). If \( a_{1} \) and \( a_{2} \) are units, then their product \( (a_{1} \cdot a_{2}) \: mod \: n \) is also always a unit (i.e., \( a_{1} \cdot a_{2} \) is in the group \( \mathbb{Z}_{n}^{*} \)), but the sum \( (a_{1} + a_{2}) \: mod \: n \) may NOT be a unit (i.e., \( a_{1} + a_{2} \) is in \( \mathbb{Z}_{n} \) but may not be in \( \mathbb{Z}_{n}^{*} \)). We see the difference in the two sets \( \mathbb{Z}_{8} \) and \( \mathbb{Z}_{8}^{*} \):

  • \( \mathbb{Z}_{8} = \{ 0, 1, 2, 3, 4, 5, 6, 7 \}\)
  • \( \mathbb{Z}_{8}^{*} = \{ 1, 3, 5, 7 \} \)

If we choose \( n \) to be a prime number \( p \), then for all integers \( a \) except 0 in \( \mathbb{Z}_{p} \), we have \( \gcd(a, p) = 1 \), which means that \( \mathbb{Z}_{p}^{*} \) contains all integers from \( \mathbb{Z}_{p} \) except 0, i.e.:

\( \mathbb{Z}_{p}^{*} = \{ 1, 2, 3, \dots, p - 1 \}\)

For example, for \( p=7 \), the only difference between the two sets \( \mathbb{Z}_{7} \) and \( \mathbb{Z}_{7}^{*} \) is the integer 0:

  • \( \mathbb{Z}_{7} = \{ 0, 1, 2, 3, 4, 5, 6 \}\)
  • \( \mathbb{Z}_{7}^{*} = \{ 1, 2, 3, 4, 5, 6 \} \)
Euler's Phi Function

The number of elements in \( \mathbb{Z}_{n}^{*} \) is denoted by \( \phi(n) \), named after the famous Swiss mathematician Euler, and is called Euler's phi function. As we saw previously, if \( n \) is a prime number \( p \), then \( \phi(p) = p-1 \). The number of elements in a group \( G \) is also called the order of \( G \) and is written as \( \left| G \right| \). The order of:

  • \( \mathbb{Z}_{n} \) is \( \left| \mathbb{Z}_{n} \right| = n \) where \( n \) is a composite number
  • \( \mathbb{Z}_{n}^{*} \) is \( \left| \mathbb{Z}_{n}^{*} \right| = \phi(n) \) where \( n \) is a composite number
  • \( \mathbb{Z}_{n}^{*} \) is \( \left| \mathbb{Z}_{n}^{*} \right| = \phi(n) = (p-1) \cdot (q-1) \) where \( n = p \cdot q \) and \( p \) and \( q \) are prime numbers
  • \( \mathbb{Z}_{p}^{*} \) is \( \left| \mathbb{Z}_{p}^{*} \right| = \phi(p) = p-1 \) where \( p \) is a prime number

Euler's phi function \(\phi(n)\) can be computed efficiently when the prime factorization of \(n\) is known:

For a prime power \(p^k\): \(\phi(p^k) = p^k - p^{k-1} = p^k(1 - \frac{1}{p})\)

For a product of prime powers: \(\phi(n) = n \prod_{p|n} (1 - \frac{1}{p})\)

For example, \(\phi(36) = \phi(2^2 \cdot 3^2) = 36 \cdot (1 - \frac{1}{2}) \cdot (1 - \frac{1}{3}) = 36 \cdot \frac{1}{2} \cdot \frac{2}{3} = 12\)

Modular Exponentiation Rules

If, for example, we choose the elements \( x \), \( a \), and \( b \) from the group \( \mathbb{Z}_{p}^{*} \) and want to compute \( x^{a + b} \: mod \: p \), then in the exponent we actually compute \( a + b \) modulo the order of the group, i.e., \( a + b \: mod \: (p-1) \) because \( \left| \mathbb{Z}_{p}^{*} \right| = \phi(p) = p-1 \). So, what we actually compute is \( x^{a + b \: mod \: (p-1)} \: mod \: p \). The same is true if we had chosen one of the other groups: we always compute modulo the order of the group in the exponent. Therefore, next time you look at an equation from a cryptosystem and wonder why they suddenly compute, for example, modulo \( p-1 \) instead of modulo \( p \), it is because the equation is used in the exponent of some integer.

Units are particularly important in cryptography because they have multiplicative inverses, allowing operations like division in modular arithmetic. The existence of these inverses is crucial for many cryptographic algorithms, including RSA, where we need to find values \(e\) and \(d\) such that \(e \cdot d \equiv 1 \pmod{\phi(n)}\).

Important Theorems

Fermat's Little Theorem: If \(p\) is a prime number and \(a\) is not divisible by \(p\), then \(a^{p-1} \equiv 1 \pmod{p}\).

Euler's Theorem: For any coprime integers \(a\) and \(n\), \(a^{\phi(n)} \equiv 1 \pmod{n}\). This generalizes Fermat's Little Theorem and forms the mathematical foundation of the RSA cryptosystem.

Applications in Cryptography
  • RSA: Uses the fact that \(m^{ed} \equiv m \pmod{n}\) when \(ed \equiv 1 \pmod{\phi(n)}\)
  • Diffie-Hellman: Operates in the multiplicative group \(\mathbb{Z}_p^*\) where \(p\) is prime
  • ElGamal: Relies on modular exponentiation in \(\mathbb{Z}_p^*\)
  • Digital Signatures: Many schemes use properties of these groups for verification
What is a Generator?

A generator (or primitive root) of a group is an element whose powers can produce every element in the group. In other words, if g is a generator of a group G, then by computing all possible powers of g, you will eventually generate all elements in G.

In the following, we use the group \( \mathbb{Z}_{p} \) as an example for simplicity, but any group could be chosen.

There exists an element \( g \) in the group of integers \( \mathbb{Z}_{p} \) (see "The group of integers and units" for more information), where \( p \) is a prime number, whose powers generate every element in the set of units \( \mathbb{Z}_{p}^{*} \). That is, there exists an integer \( g \) in \( \mathbb{Z}_{p} \) such that:

\( \mathbb{Z}_{p}^{*} = \{ g^{1} \: mod \: p, g^{2} \: mod \: p, g^{3} \: mod \: p, \dots, g^{p-1} \: mod \: p \} \)

where \( g^{p-1} \: mod \: p = 1 \) because \( p-1 \: mod \: (p-1) = 0 \) (remember that the order of \( \mathbb{Z}_{p}^{*} \) is \( p-1 \), and we compute modulo the order of the group in the exponent of \( g \) because \( g \) belongs to the group \( \mathbb{Z}_{p}^{*} \)), and \( g^{0} = 1 \). Such a \( g \) is called a generator or a primitive root of the group, and it's denoted as \( \left< g \right> = \mathbb{Z}_{p}^{*} \). We also say that the order of \( g \) is \( ord(g) = p-1 \) because \( g \) generates the group \( \mathbb{Z}_{p}^{*} \) with \( p-1 \) elements.

To clarify, let's look at an example with the prime number 5. Recall that \( \mathbb{Z}_{5}^{*} = \{ 1, 2, 3, 4 \} \) (see "The group of integers and units" for the example). The group \( \mathbb{Z}_{5}^{*} \) has 2 as a generator because the powers of 2 generate every element in the group, i.e., \( \left< 2 \right> = \mathbb{Z}_{5}^{*} \):

  • \( 2^{1} \: mod \: 5 = 2 \)
  • \( 2^{2} \: mod \: 5 = 4 \)
  • \( 2^{3} \: mod \: 5 = 3 \)
  • \( 2^{4} \: mod \: 5 = 1 \)

The last power is 4 because \( p - 1 = 5 - 1 = 4 \). On the other hand, 4 is NOT a generator of \( \mathbb{Z}_{5}^{*} \) because the powers of 4 only generate the integers 1 and 4:

  • \( 4^{1} \: mod \: 5 = 4 \)
  • \( 4^{2} \: mod \: 5 = 1 \)
  • \( 4^{3} \: mod \: 5 = 4 \)
  • \( 4^{4} \: mod \: 5 = 1 \)

If you don't know the factorization of the integer \( p-1 \), then the only way to find a generator is to perform the above computation, i.e., check that it generates every element in the group \( \mathbb{Z}_{p}^{*} \). However, if you know the factorization of \( p-1 \), then for every prime number \( q \) that evenly divides \( p-1 \), we check that \( g^{(p-1)/q} \: mod \: p \neq 1 \) for a random integer \( g \) in the group \( \mathbb{Z}_{p} \). If this is the case, then \( g \) is a generator of \( \mathbb{Z}_{p}^{*} \).

Because the prime factorization problem is hard (the problem of computing the factorization of the integer \( p-1 \)), we use a so-called safe prime \( p \). A safe prime \( p \) is of the form \( p = 2 \cdot q + 1 \) where \( q \) is a prime number. Now, the factorization of \( p-1 \) is always only \( 2 \) and \( q \). For example, if we compute the safe prime \( p = 2 \cdot 5 + 1 = 11 \) with the prime number \( q = 5 \), we see that \( g = 7 \) is a generator of the group \( \mathbb{Z}_{11}^{*} \) because:

  • \( 7^{(11-1)/2} \: mod \: 11 = 10 \neq 1 \)
  • \( 7^{(11-1)/5} \: mod \: 11 = 5 \neq 1 \)

where 2 and \( q=5 \) are the only prime numbers that divide \( p-1=11-1=10 \) evenly. For the same reason, \( g = 5 \) is not a generator of \( \mathbb{Z}_{11}^{*} \) because:

  • \( 5^{(11-1)/2} \: mod \: 11 = 1 \)
  • \( 5^{(11-1)/5} \: mod \: 11 = 3 \neq 1 \)

The group \( \mathbb{Z}_{n}^{*} \), where \( n \) is a composite number, may not have a generator, but if \( n \) is a prime number \( p \), then the group \( \mathbb{Z}_{p}^{*} \) has at least one generator.

Number of Generators

For a prime p, the number of generators (primitive roots) in \(\mathbb{Z}_p^*\) is exactly \(\phi(\phi(p)) = \phi(p-1)\), where \(\phi\) is Euler's totient function. This means that generators are relatively common in prime-order groups.

For example, in \(\mathbb{Z}_{11}^*\), there are \(\phi(10) = 4\) generators: 2, 6, 7, and 8.

Importance in Cryptography

Generators are fundamental to many cryptographic protocols:

  • Diffie-Hellman key exchange relies on working in a group with a known generator
  • ElGamal encryption requires a generator to create public-private key pairs
  • Digital Signature Algorithm (DSA) uses generators for signing and verification
  • The security of these protocols depends on the difficulty of the discrete logarithm problem in the group generated by g
Subgroups and Cyclic Groups

When an element g is not a generator of the entire group, it generates a subgroup. The order of this subgroup will be the smallest positive integer k such that \(g^k = 1\).

For example, in \(\mathbb{Z}_5^*\), the element 4 generates the subgroup \(\{1, 4\}\) of order 2, while the element 2 generates the entire group of order 4.

A group that can be generated by a single element is called a cyclic group. All groups \(\mathbb{Z}_p^*\) where p is prime are cyclic groups.

Let \( g \) be a generator of the group \( G \). Given the values \( g \) and \( H \), the discrete logarithm (DL) problem is to compute the exponent \( a \) in the following equation, which is considered a hard problem:

\( g^{a} = H \)

The exponent \( a \) is also called the discrete logarithm of \( H \) to the base \( g \).

For example, if we use the group \( \mathbb{Z}_{p}^{*} \), where \( p \) is a large prime number and \( g \) is a generator of the group, then given \( g \), \( p \), and \( H \), it is hard to compute \( a \) such that the following equation holds:

\( g^{a} \: mod \: p = H \)

In addition to the DL problem, there are two related problems: the Diffie-Hellman (DH) problem and the Decisional Diffie-Hellman (DDH) problem. Given the values \( g \), \( g^{a} \), and \( g^{b} \), the DH problem is to compute the exponent \( a \cdot b \) in \( g^{a \cdot b} \).

Similarly, given the values \( g \), \( g^{a} \), \( g^{b} \), and \( g^{c} \), the DDH problem is to decide whether \( c = a \cdot b \) or whether \( c \) is a random integer.

You may have noticed that if we can solve the DL problem, that is, compute \( a \) in \( g^{a} = H \), then we can also solve the DH problem: first compute \( a \) from \( g^{a} \), then \( b \) from \( g^{b} \), and finally calculate \( a \cdot b \). This also implies that we can solve the DDH problem: first compute \( a \cdot b \) as described above, then compute \( c \) from \( g^{c} \), and finally check whether \( c = a \cdot b \).

Computational Hardness

The discrete logarithm problem is believed to be computationally hard when proper parameters are chosen. For \( \mathbb{Z}_{p}^{*} \), the problem becomes harder as the size of the prime \( p \) increases, especially when \( p-1 \) has at least one large prime factor. The best known classical algorithms (Number Field Sieve) run in sub-exponential time, making the problem practically intractable for large enough parameters.

Simple Example

Consider a small example in \( \mathbb{Z}_{11}^{*} \) with generator \( g = 2 \):

  • \( 2^1 \mod 11 = 2 \)
  • \( 2^2 \mod 11 = 4 \)
  • \( 2^3 \mod 11 = 8 \)
  • \( 2^4 \mod 11 = 5 \)
  • \( 2^5 \mod 11 = 10 \)
  • \( 2^6 \mod 11 = 9 \)
  • \( 2^7 \mod 11 = 7 \)
  • \( 2^8 \mod 11 = 3 \)
  • \( 2^9 \mod 11 = 6 \)
  • \( 2^{10} \mod 11 = 1 \)

If given \( H = 9 \), finding \( a \) such that \( 2^a \mod 11 = 9 \) means finding \( a = 6 \). For small groups like this, we can compute all powers, but for cryptographically secure groups, this approach becomes computationally infeasible.

Cryptographic Applications

The hardness of these problems forms the security foundation for many cryptographic protocols:

  • Diffie-Hellman Key Exchange: Based on the DH problem, allows two parties to establish a shared secret over an insecure channel
  • ElGamal Encryption: Security relies on the DDH problem
  • Digital Signature Algorithm (DSA): Uses the discrete logarithm problem for signature generation and verification
  • Zero-knowledge proofs: Many constructions rely on the hardness of the discrete logarithm
Beyond \( \mathbb{Z}_{p}^{*} \)

While the examples above use the multiplicative group of integers modulo a prime, these problems are also defined in other groups:

  • Elliptic curve groups: The Elliptic Curve Discrete Logarithm Problem (ECDLP) is the basis for elliptic curve cryptography, offering similar security with smaller key sizes
  • Ideal class groups: Used in some post-quantum cryptographic proposals
Algorithms for Solving DLP

Several algorithms exist for solving the discrete logarithm problem:

  • Baby-step Giant-step: A time-memory tradeoff with \( O(\sqrt{n}) \) complexity
  • Pollard's Rho: A probabilistic algorithm with similar performance but lower memory requirements
  • Index Calculus: Sub-exponential time for certain groups
  • Shor's Algorithm: A quantum algorithm that can solve the DLP in polynomial time, posing a threat to systems based on this problem if large-scale quantum computers become available

The cryptography explained

The Diffie-Hellman key exchange algorithm was first published in 1976 by Whitfield Diffie and Martin Hellman, although the algorithm had been invented a few years earlier by the British government intelligence agency GCHQ but was kept classified. In 2002, Martin Hellman suggested that the algorithm be renamed to "The Diffie-Hellman-Merkle key exchange" in recognition of Ralph Merkle's contribution to public-key cryptography.

The Diffie-Hellman key exchange algorithm solves the following problem: Alice and Bob want to share a secret key for, e.g., a symmetric key algorithm such as DES or AES, but they can only communicate through an insecure channel that is eavesdropped on by their adversary Eve. That is, all messages sent between Alice and Bob are observed by Eve.

The group used in the Diffie-Hellman key exchange can either be \( \mathbb{Z}_{p}^{*} \) where \( p \) is a prime number, a subgroup of \( \mathbb{Z}_{p}^{*} \) of order \( q \) where \( q \) is a prime number, or an elliptic curve group, but in what follows we use the group \( \mathbb{Z}_{p}^{*} \) for simplicity.

The first step in the algorithm is for Alice and Bob to agree on a large prime number \( p \) and a generator \( g \) of the group \( \mathbb{Z}_{p}^{*} \) (in practice it may be a trusted third party that chooses \( p \) and \( g \)). The values of \( p \) and \( g \) are public, i.e., Eve knows them too.

In the next step, Alice and Bob each pick a secret value. Alice picks the secret value \( a \) and Bob picks the secret value \( b \) where both \( a \) and \( b \) are numbers between \( 1 \) and \( p-1 \). Alice and Bob use their secret values to compute the public values \( A \) and \( B \) where Alice computes \( A = g^{a} \: mod \: p\) and Bob computes \( B = g^{b} \: mod \: p\). Alice then sends \( A \) to Bob and Bob sends \( B \) to Alice. Notice that Eve also sees these two values.

Finally, Alice and Bob use the four values \( a \), \( b \), \( A \) and \( B \) to compute their shared secret key whose value is known only to Alice and Bob. Alice computes the shared secret key by \( A' = B^{a} \: mod \: p \) and Bob computes it by \( B' = A^{b} \: mod \: p \). The values of \( A' \) and \( B' \) are the same because:

\( \eqalign{ A' &= B^{a} \: mod \: p &&(B = g^{b}) \\ &= (g^{b})^{a} \: mod \: p &&(\mbox{exponent rule}) \\ &= g^{a \cdot b} \: mod \: p &&(\mbox{exponent rule}) \\ &= (g^{a})^{b} \: mod \: p &&(A = g^{a}) \\ &= A^{b} \: mod \: p \\ &= B' } \)

If Eve wants to know the value of the shared secret key, she has to find the secret value \( a \) or \( b \) by solving the discrete logarithm problem, i.e., computing \( a \) or \( b \) from the equation \( A = g^{a} \: mod \: p \) or \( B = g^{b} \: mod \: p \), which is hard when the prime number \( p \) is large.

Modern Applications

Diffie-Hellman is widely used in modern protocols including:

  • TLS/SSL - Securing web connections
  • SSH - Secure terminal connections
  • IPsec - VPN technologies
  • Signal Protocol - End-to-end encrypted messaging
Elliptic Curve Diffie-Hellman (ECDH)

ECDH is a variant of the Diffie-Hellman key exchange that uses elliptic curve cryptography instead of the multiplicative group \( \mathbb{Z}_{p}^{*} \). In ECDH, Alice and Bob exchange points on an elliptic curve rather than integers in a modular group.

The core advantage of ECDH is that it provides equivalent security to traditional Diffie-Hellman while using significantly smaller key sizes. For example, a 256-bit elliptic curve key provides comparable security to a 3072-bit traditional DH key. This efficiency makes ECDH particularly valuable for mobile and IoT applications with constrained resources.

The mathematical security of ECDH relies on the Elliptic Curve Discrete Logarithm Problem (ECDLP), which is believed to be even harder to solve than the traditional discrete logarithm problem for equivalent security levels.

Perfect Forward Secrecy

Perfect Forward Secrecy (PFS) is a property of secure communication protocols where the compromise of long-term keys does not compromise past session keys. In practical terms, this means that even if an adversary records all encrypted communications and later obtains the private keys, they cannot decrypt previously recorded messages.

Diffie-Hellman naturally provides this property when used with ephemeral (temporary) keys that are generated for each session and then discarded after use. This implementation is known as "Ephemeral Diffie-Hellman" (DHE or ECDHE for the elliptic curve variant).

PFS is now considered essential in modern security protocols like TLS 1.3, where ephemeral Diffie-Hellman key exchanges are mandatory for all connections.

Key Derivation Functions

In practice, the raw shared secret \( g^{ab} \: mod \: p \) produced by Diffie-Hellman is not used directly as a cryptographic key. Instead, it's passed through a Key Derivation Function (KDF) to produce one or more cryptographically strong keys.

A KDF typically takes the shared secret along with other inputs (such as a salt and context-specific information) and applies cryptographic hash functions repeatedly to derive keys of appropriate length and strength for specific purposes like encryption, message authentication, or initialization vectors.

Common KDFs include HKDF (based on HMAC), PBKDF2 (designed for password-based keys), and the TLS PRF (Pseudo-Random Function used in the TLS protocol). Using a KDF adds an additional layer of security by ensuring the derived keys have appropriate cryptographic properties even if there are some weaknesses in the original shared secret.

The protocol

Public parameters
A trusted third party publish a large prime \( p \) and a generator \( g \) of the group \( \mathbb{Z}_{p}^{*} \).
Computations of the public values
Alice:
  1. Chooses the secret value \( 1 \leq a \leq p-1 \).
  2. Computes the public value \( A = g^{a} \: mod \: p\).
Bob:
  1. Chooses the secret value \( 1 \leq b \leq p-1 \).
  2. Computes the public value \( B = g^{b} \: mod \: p\).
Exchange of values
Alice sends \( A \) to Bob and Bob sends \( B \) to Alice.
Computations of the shared secret key
Alice:
  1. Receives \( B \) from Bob.
  2. Computes the shared secret \( A' = B^{a} \: mod \: p \).
Bob:
  1. Receives \( A \) from Alice.
  2. Computes the shared secret \( B' = A^{b} \: mod \: p \).

Try a demo of the protocol here.

Example

A trusted third party chooses and publishes the prime number \( p = 199 \) and the generator \( g = 127 \) of the group \( \mathbb{Z}_{199}^{*} \).

Alice chooses the secret value \( a = 190 \) and computes her public value \( A = 127^{190} \: mod \: 199 = 4 \). Similarly, Bob chooses the secret value \( b = 14 \) and computes his public value \( B = 127^{14} \: mod \: 199 = 51 \). Alice then sends \( A = 4 \) to Bob and Bob sends \( B = 51 \) to Alice.

Alice computes the shared secret key by \( A' = 51^{190} \: mod \: 199 = 177 \) and Bob computes the shared secret key by \( B' = 4^{14} \: mod \: 199 = 177 \). Alice and Bob can now use the key \( 177 \) in a symmetric key algorithm such as DES or AES.

While our example uses small values \( p = 199 \) for clarity, real-world implementations typically use prime moduli of 2048 or 4096 bits to ensure security against modern computational capabilities.