In cryptography, we often encrypt and sign messages that contain characters. However, asymmetric key cryptosystems (where Alice and Bob use different keys), such as RSA and ElGamal, are based on arithmetic operations on integers. Symmetric key cryptosystems (where Alice and Bob use the same key), such as DES and AES, are based on bitwise operations on bits (a bit is either 0 or 1 and is an abbreviation for binary digit). Therefore, we must convert characters into integers or bits before applying the cryptosystem to the message.
Because a computer can only handle bits, there already exists a method for converting characters into integers or bits, which uses the ASCII (American Standard Code for Information Interchange) table. A part of the ASCII table from Wikipedia is shown in the following table (note that the binary representation on the Wikipedia site uses only seven bits, but we use eight bits by prepending an extra 0):
Character | Decimal | Hexadecimal | Binary |
---|---|---|---|
(space) | 32 | 20 | 00100000 |
! | 33 | 21 | 00100001 |
\( \vdots \) | \( \vdots \) | \( \vdots \) | \( \vdots \) |
A | 65 | 41 | 01000001 |
B | 66 | 42 | 01000010 |
\( \vdots \) | \( \vdots \) | \( \vdots \) | \( \vdots \) |
a | 97 | 61 | 01100001 |
b | 98 | 62 | 01100010 |
\( \vdots \) | \( \vdots \) | \( \vdots \) | \( \vdots \) |
For example, if Alice wants to send the message "Hey Bob!" encrypted to Bob using a symmetric key cryptosystem, she first converts it into its integer representation and then into its binary representation:
\( \eqalign{ H &\rightarrow 72 &&\rightarrow 01001000 \\ e &\rightarrow 101 &&\rightarrow 01100101 \\ y &\rightarrow 121 &&\rightarrow 01111001 \\ &\rightarrow 32 &&\rightarrow 00100000 \\ B &\rightarrow 66 &&\rightarrow 01000010 \\ o &\rightarrow 111 &&\rightarrow 01101111 \\ b &\rightarrow 98 &&\rightarrow 01100010 \\ ! &\rightarrow 33 &&\rightarrow 00100001 } \)
That is, the message "Hey Bob!" is represented with integers as "72 101 121 32 66 111 98 33" and in binary as "01001000 01100101 01111001 00100000 01000010 01101111 01100010 00100001".
While ASCII is sufficient for basic English text, modern cryptographic applications often need to handle international characters. Unicode standards like UTF-8 extend ASCII and can encode virtually all characters from all writing systems. In UTF-8, ASCII characters still use a single byte, while other characters may use 2-4 bytes.
Modern cryptographic algorithms typically operate on bytes (8-bit units) or words (groups of bytes). The binary representation shown above organizes characters into bytes, which serve as the fundamental unit for cryptographic operations in algorithms like AES, where data is processed in 128-bit (16-byte) blocks.
Notice the relationship between the different representations:
It's important to note that encoding (converting characters to numbers or binary) is not the same as encryption. Encoding is a standardized, reversible process with no secret key. Anyone knowing the encoding scheme can decode the message. Encryption, on the other hand, requires a key to reverse the process, providing actual security.
Several other encoding schemes are commonly used in cryptographic applications:
The exponent of a number indicates how many times the number is multiplied by itself. For example:
\( 4^{3} = 4 \cdot 4 \cdot 4 = 64 \)
Here, 3 is the exponent (or power), and 4 is the base. In words, we say "4 to the power of 3."
Here are the laws of exponents:
Law | Example |
---|---|
\( x^{1} = x \) | \( 3^{1} = 3 \) |
\( x^{0} = 1 \) | \( 4^{0} = 1 \) |
\( x^{-1} = \frac{1}{x} \) | \( 5^{-1} = \frac{1}{5} \) |
\( x^{m} \cdot x^{n} = x^{m+n} \) | \( x^{2} \cdot x^{3} = (x \cdot x) \cdot (x \cdot x \cdot x) = x \cdot x \cdot x \cdot x \cdot x = x^{5} = x^{2+3} \) |
\( \frac{x^{m}}{x^{n}} = x^{m-n} \) | \( \frac{x^{4}}{x^{2}} = \frac{x \cdot x \cdot x \cdot x}{x \cdot x} = x \cdot x \cdot \frac{x \cdot x}{x \cdot x} = x \cdot x \cdot 1 = x^{2} = x^{4-2} \) |
\( (x^{m})^{n} = x^{m \cdot n} \) | \( (x^{2})^{3} = (x \cdot x) \cdot (x \cdot x) \cdot (x \cdot x) = x \cdot x \cdot x \cdot x \cdot x \cdot x = x^{6} = x^{2 \cdot 3} \) |
\( (x \cdot y)^{n} = x^{n} \cdot y^{n} \) | \( (x \cdot y)^{2} = (x \cdot y) \cdot (x \cdot y) = x \cdot x \cdot y \cdot y = x^{2} \cdot y^{2} \) |
\( (\frac{x}{y})^{n} = \frac{x^{n}}{y^{n}} \) | \( (\frac{x}{y})^{3} = (\frac{x}{y}) \cdot (\frac{x}{y}) \cdot (\frac{x}{y}) = \frac{x \cdot x \cdot x}{y \cdot y \cdot y} = \frac{x^{3}}{y^{3}} \) |
\( x^{-n} = \frac{1}{x^{n}} \) | \( x^{-3} = (x^{-1})^{3} = (\frac{1}{x})^{3} = \frac{1}{x} \cdot \frac{1}{x} \cdot \frac{1}{x} = \frac{1 \cdot 1 \cdot 1}{x \cdot x \cdot x} = \frac{1}{x^{3}} \) |
\( x^{\frac{1}{n}} = \sqrt[n]{x} \) | \( 9^{\frac{1}{2}} = \sqrt{9} = 3 \) |
\( x^{\frac{m}{n}} = \sqrt[n]{x^m} = (\sqrt[n]{x})^m \) | \( 8^{\frac{2}{3}} = \sqrt[3]{8^2} = \sqrt[3]{64} = 4 \) |
In cryptography, we often compute exponents in modular arithmetic:
\( a^b \mod n \) means raising \(a\) to power \(b\) and then taking the remainder when divided by \(n\)
Example: \( 3^4 \mod 7 = 81 \mod 7 = 4 \)
The same exponent rules apply in modular arithmetic:
\( (a^m \cdot a^n) \mod n = a^{m+n} \mod n \)
\( (a^m)^n \mod p = a^{m \cdot n} \mod p \)
These exponent rules are fundamental to many cryptographic algorithms:
For cryptographic applications, we need to efficiently compute large exponents. The square-and-multiply algorithm lets us calculate \(x^n\) with only \(O(\log n)\) multiplications:
Example: To compute \(3^{13}\), we use binary 13 = 1101:
result = 1
bit 1: result = \( 1^{2} \times 3 = 3 \)
bit 1: result = \( 3^{2} \times 3 = 27 \)
bit 0: result = \( 27^{2} = 729 \)
bit 1: result = \( 729^{2} \times 3 = 1,594,323 \)
You already use modulo computation when you look at a clock and, for example, need to figure out what time it will be 3 hours after 11 o'clock, which is 2 o'clock. In mathematics, we write this as:
\( (11 + 3) \: mod \: 12 = 2 \)
Here, 12 is the modulus because we want the time as an integer between 0 and 11 (in this case, 12 o'clock is denoted by 0). In words, we say "11 plus 3 modulo 12 equals 2." The result of a modulo computation is an integer between 0 and the modulus minus 1. For example, with modulus 3, we have:
For example, if we look at \( 27 \: mod \: 5 \), modulo computes how many times 5 divides into 27 and then returns the remainder, which is 2 in this case. That is, \( 27 \: mod \: 5 = 2 \). But how do we get this result?
First, we compute how many times we can multiply 5 by an integer \( x \) so that the result is as close as possible to 27 without exceeding it. In other words, we find the maximum value of \( x \) such that \( 5 \cdot x \leq 27 \). In this case, \( x = 5 \) because \( 5 \cdot 5 = 25 \leq 27 \). Then, by subtracting 25 from 27, we get the answer: \( 27 - 25 = 2 \).
If the integer is negative, for example \( -27 \: mod \: 5 \), we have to do it slightly differently, and the answer is \( -27 \: mod \: 5 = 3 \). In this case, the integer \( x \) is negative and should be the closest integer such that \( 5 \cdot x \) does not exceed -27. That is, we find the minimum value of \( -x \) such that \( 5 \cdot -x \geq -27 \). Here, \( -x = -6 \) because \( 5 \cdot -6 = -30 \geq -27 \). Then, by subtracting -30 from -27, we get the answer: \( -27 - (-30) = -27 + 30 = 3 \).
It is important that \( x \) or \( -x \) is an integer such as \( -14, 3, 17 \), etc., and NOT a fraction or decimal such as \( \frac{1}{4}, \frac{-3}{7}, 2.5, 5.1 \), etc.
If two integers \( a \) and \( b \) modulo the same modulus \( c \) return the same remainder \( r \), then we say that \( a \) and \( b \) are congruent modulo \( c \). That is, if \( a \: mod \: c = r \) and \( b \: mod \: c = r \), then \( a \equiv b \: (mod \: c) \). Also, notice that if the modulus \( c \) is greater than the integer \( a \), i.e., \( c > a \), the result will always be \( a \: mod \: c = a \).
Modular arithmetic has several important properties that make it useful in cryptography:
These properties allow us to perform calculations on smaller values rather than potentially very large ones, which is essential in cryptographic algorithms.
A modular inverse of an integer \(a\) with respect to modulus \(n\) is an integer \(a^{-1}\) such that:
\(a \cdot a^{-1} \equiv 1 \: (\text{mod} \: n)\)
The modular inverse exists if and only if \(a\) and \(n\) are coprime (i.e., their greatest common divisor is 1). Finding modular inverses is crucial in cryptography, particularly in algorithms like RSA where we need to find \(d\) such that \(e \cdot d \equiv 1 \: (\text{mod} \: \phi(n))\).
Fermat's Little Theorem: If \(p\) is a prime number and \(a\) is not divisible by \(p\), then \(a^{p-1} \equiv 1 \: (\text{mod} \: p)\).
Euler's Theorem: If \(a\) and \(n\) are coprime, then \(a^{\phi(n)} \equiv 1 \: (\text{mod} \: n)\), where \(\phi(n)\) is Euler's totient function.
These theorems are fundamental to understanding and implementing many cryptographic algorithms, especially RSA.
Modular arithmetic is fundamental to many cryptographic algorithms:
A prime number is an integer greater than 1 that can only be divided evenly by 1 and itself. "Divided evenly" means the result is an integer, not a float. For example, if you divide 13 by 3, you get the float \( \frac{13}{3} = 4.333 \). We see that 13 is a prime number because it can only be divided evenly by 1 and itself: \( \frac{13}{1} = 13 \) and \( \frac{13}{13} = 1 \).
If an integer is not a prime number, it is called a composite number. For example, the integer 6 is a composite number because it can be divided evenly by 1, 2, 3, and 6:
\( \frac{6}{1} = 6 \), \( \frac{6}{2} = 3 \), \( \frac{6}{3} = 2 \), and \( \frac{6}{6} = 1 \)
The term "composite" means "something made by combining things." So, a composite number is an integer made by multiplying prime numbers:
Therefore, every integer greater than 1 is either a prime number or a product of prime numbers (a composite number).
The famous Greek mathematician Euclid proved that there are infinitely many prime numbers.
Michael Rabin and Gary Miller developed an algorithm that determines whether an integer is a prime or a composite number by testing the integer with multiple bases, denoted by \( a \). The algorithm is called the Rabin-Miller primality test.
Every integer greater than 1 can be expressed as a unique product of prime numbers (up to the order of the factors). This fundamental property makes prime numbers the "building blocks" of all natural numbers and is crucial for many cryptographic applications.
Prime numbers form the foundation of many cryptographic systems:
Determining whether a number is prime is crucial in cryptography. Several methods exist:
The Miller-Rabin test is particularly useful because it's efficient and can achieve any desired level of certainty by increasing the number of test rounds.
The greatest common divisor (GCD) of two or more integers is the largest positive integer that divides each of the given integers without a remainder. It is denoted as \(\gcd(a, b)\) or sometimes as \((a, b)\).
Formally, for two integers \(a\) and \(b\), not both zero, \(\gcd(a,b)\) is the largest positive integer \(d\) such that \(d|a\) and \(d|b\) (where \(d|a\) means "\(d\) divides \(a\)" or "\(a\) is divisible by \(d\)").
Before we describe what the greatest common divisor of two integers is, we first define what we mean by a divisor. In this context, a divisor of an integer \( x \) is any integer that divides \( x \) evenly, meaning the result is not a fraction. For example, if you divide \( x=12 \) by 5, you get the fraction \( \frac{12}{5} = 2.4 \), so 5 is not a divisor of \( x=12 \). For \( x=12 \), the divisors are 1, 2, 3, 4, 6, and 12 because \( \frac{12}{1} = 12 \), \( \frac{12}{2} = 6 \), \( \frac{12}{3} = 4 \), \( \frac{12}{4} = 3 \), \( \frac{12}{6} = 2 \), and \( \frac{12}{12} = 1 \).
Similarly, the divisors of 16 are 1, 2, 4, 8, and 16 because \( \frac{16}{1} = 16 \), \( \frac{16}{2} = 8 \), \( \frac{16}{4} = 4 \), \( \frac{16}{8} = 2 \), and \( \frac{16}{16}=1 \).
The greatest common divisor of 12 and 16 is therefore 4, because it is the largest integer among their common divisors. In mathematics, we write this as \( \gcd(12, 16) = 4 \).
Two integers whose greatest common divisor is 1 are called relatively prime numbers or co-primes. For example, 15 and 28 are relatively prime because \( \gcd(15, 28) = 1 \) (note that 28 is not a prime number).
If one of the two integers is a prime number, the greatest common divisor will always be 1, i.e., \( \gcd(a, p) = 1 \), where \( a \) is any integer (either prime or composite) and \( p \) is a prime number.
One method to compute the greatest common divisor of two integers is by using the Euclidean algorithm, developed by the famous Greek mathematician Euclid. See "The extended Euclidean algorithm" for more information about how to compute the greatest common divisor of two integers.
The Euclidean algorithm is an efficient method to find the GCD of two integers. It works by repeatedly applying the division algorithm:
Example: Find \(\gcd(48, 18)\)
Since the remainder is now 0, \(\gcd(48, 18) = 6\).
The GCD has several important applications in cryptography:
The extended Euclidean algorithm is an extension of the Euclidean algorithm, which only returns the greatest common divisor of two integers. Given two integers \( a \) and \( b \), the extended Euclidean algorithm returns the integers \( a \), \( b \), \( \lambda \), and \( \mu \) such that:
\( a \cdot \lambda + b \cdot \mu = \gcd(a, b) \)
Here, \( \lambda \) and \( \mu \) are called the Bézout coefficients for \( a \) and \( b \). Only if \( a \) and \( b \) are relatively prime, i.e., \( \gcd(a, b) = 1 \), then:
\( a \cdot \lambda + b \cdot \mu = 1 \)
In this case, \( \lambda \; mod \; b \) is the inverse of \( a \), denoted \( a^{-1} = \lambda \; mod \; b \), and \( \mu \: mod \: a \) is the inverse of \( b \), denoted \( b^{-1} = \mu \: mod \: a \) (see "Modulo computation" for more information about the \( mod \) operator). One useful property of an integer and its inverse is that \( a \cdot a^{-1} \; mod \; b = 1 \) and \( b \cdot b^{-1} \; mod \; a = 1 \).
You can easily compute \( \gcd(a, b) \), \( \lambda \), and \( \mu \) for example with \( a=5 \) and \( b=39 \) using a simple table. First, let us create a table with three columns (we do not yet know how many rows there will be). Let us denote the entry in the first row and first column as [1,1], the entry in the first row and second column as [1,2], the entry in the second row and first column as [2,1], and so on.
Next, we write \( b=39 \) in entry [1,1] and \( a=5 \) in entry [2,1]. Then we try to find the largest integer \( q_{1} \) such that \( q_{1} \cdot a \leq b \). We have \( q_{1}=7 \), which we write in entry [2,2], because \( 7 \cdot 5 = 35 \leq 39 \), and a remainder of \( r_{1}=4 \), which we write in entry [3,1].
Again, we try to find the largest integer \( q_{2} \) such that \( q_{2} \cdot r_{1} \leq a \). We have \( q_{2}=1 \), which we write in entry [3,2], because \( 1 \cdot 4 = 4 \leq 5 \), and a remainder of \( r_{2}=1 \), which we write in entry [4,1]. Notice that we are repeating the same process as before, just with the numbers in the next row.
The next computation returns a remainder of \( r_{3} = 0 \) because \( q_{3} \cdot r_{2} = 4 \cdot 1 = 4 \leq 4 = r_{1} \). We have now computed \( \gcd(5, 39)=r_{2}=1 \) since \( r_{3} = 0 \). Because 5 and 39 are relatively prime, we know that \( \lambda \) and \( \mu \) exist, and we can start using the last column.
First, we write \( x_{1}=0 \) in entry [4,3] and \( x_{2}=1 \) in entry [3,3]. Then we write \( x_{3}=q_{2} \cdot x_{2} + x_{1} = 1 \cdot 1 + 0 = 1 \) in entry [2,3]. For entry [1,3], we compute as before, just with numbers from the row above, i.e., \( x_{4}=q_{1} \cdot x_{3} + x_{2} = 7 \cdot 1 + 1 = 8 \).
Finally, we have that \( a \cdot x_{4} \pm b \cdot x_{3} = r_{2} \), where we need to decide whether it should be plus or minus between the two terms. Because \( a \cdot x_{4} = 5 \cdot 8 = 40 \), \( b \cdot x_{3} = 39 \cdot 1 \), and \( 40 \geq 39 \), we have \( 5 \cdot 8 - 39 \cdot 1 = 1 \) (which is the same as \( 5 \cdot 8 + 39 \cdot (-1) = 1 \)), so the Bézout coefficients are \( \lambda=8 \) and \( \mu=-1 \). Notice that \( a^{-1} = \lambda \; mod \; b = 8 \; mod \; 39 = 8\) and \( b^{-1} = \mu \; mod \; a = -1 \: mod \: 5 = 4\), where \( a \cdot a^{-1} \; mod \; b = 5 \cdot 8 \; mod \; 39 = 1 \) and \( b \cdot b^{-1} \; mod \; a = 39 \cdot 4 \; mod \; 5 = 1 \).
The table for computing \( 5 \cdot \lambda + 39 \cdot \mu = \gcd(5, 39) \) is:
\( b=39 \) | \( x_{4}=8 \) | |
\( a=5 \) | \( q_{1}=7 \) | \( x_{3}=1 \) |
\( r_{1}=4 \) | \( q_{2}=1 \) | \( x_{2}=1 \) |
\( r_{2}=1 \) | \( q_{3}=4 \) | \( x_{1}=0 \) |
\( r_{3}=0 \) |
function extended_gcd(a, b) if b = 0 then return (a, 1, 0) else (d, x', y') = extended_gcd(b, a mod b) return (d, y', x' - (a div b) * y')
The function returns a tuple (d, x, y) such that d = gcd(a, b) and d = ax + by.
While the standard Euclidean algorithm computes only the GCD through successive divisions, the extended version keeps track of the coefficients at each step. This "extension" allows us to express the GCD as a linear combination of the original numbers, which is crucial for finding modular inverses.
The extended Euclidean algorithm has the same time complexity as the standard Euclidean algorithm: O(log min(a,b)). This makes it highly efficient even for very large integers, which is crucial for its application in cryptographic systems.
In abstract algebra, a ring is a set with two operations (usually addition and multiplication) where addition forms an abelian group and multiplication is associative and distributes over addition. A group is a set with a single operation that satisfies closure, associativity, identity, and inverse properties.
The set of integers \( \{ \dots, -2, -1, 0, 1, 2, \dots \} \) is denoted by the symbol \( \mathbb{Z} \), i.e., \( \mathbb{Z} = \{ \dots, -2, -1, 0, 1, 2, \dots \} \), and is called the ring of integers or an additive group. The group of integers modulo \( n \) is a subset of \( \mathbb{Z} \) and is denoted by \( \mathbb{Z}/n\mathbb{Z} \), but we use the shorthand \( \mathbb{Z}_{n} \). This subset contains the following elements (because we compute modulo \( n \)):
\( \mathbb{Z}_{n} = \{ 0, 1, 2, \dots, n - 1 \}\)
Notice that whenever we perform addition or multiplication in \( \mathbb{Z}_{n} \), we always compute the result modulo \( n \) to obtain an integer in \( \mathbb{Z}_{n} \). To illustrate this, let us look at \( n = 5 \):
\( \mathbb{Z}_{5} = \{ 0, 1, 2, 3, 4 \}\)
When adding \( 3 \) and \( 4 \) in \( \mathbb{Z}_{5} \), we do the following: \( (3 + 4) \: mod \: 5 = 7 \: mod \: 5 = 2 \). Likewise, when multiplying \( 3 \) by \( 4 \) in \( \mathbb{Z}_{5} \), we have: \( (3 \cdot 4) \: mod \: 5 = 12 \: mod \: 5 = 2 \).
An integer \( a \) in \( \mathbb{Z}_{n} \) has an inverse if the greatest common divisor of \( a \) and \( n \) is 1, i.e., \( \gcd(a, n) = 1 \) (see "The extended Euclidean algorithm" for more information). The integer \( a \) is then called a unit. The set of units (all integers with an inverse in \( \mathbb{Z}_{n} \)) is denoted by \( (\mathbb{Z}/n\mathbb{Z})^{*} \). Again, we use the shorthand \( \mathbb{Z}_{n}^{*} \). If \( a_{1} \) and \( a_{2} \) are units, then their product \( (a_{1} \cdot a_{2}) \: mod \: n \) is also always a unit (i.e., \( a_{1} \cdot a_{2} \) is in the group \( \mathbb{Z}_{n}^{*} \)), but the sum \( (a_{1} + a_{2}) \: mod \: n \) may NOT be a unit (i.e., \( a_{1} + a_{2} \) is in \( \mathbb{Z}_{n} \) but may not be in \( \mathbb{Z}_{n}^{*} \)). We see the difference in the two sets \( \mathbb{Z}_{8} \) and \( \mathbb{Z}_{8}^{*} \):
If we choose \( n \) to be a prime number \( p \), then for all integers \( a \) except 0 in \( \mathbb{Z}_{p} \), we have \( \gcd(a, p) = 1 \), which means that \( \mathbb{Z}_{p}^{*} \) contains all integers from \( \mathbb{Z}_{p} \) except 0, i.e.:
\( \mathbb{Z}_{p}^{*} = \{ 1, 2, 3, \dots, p - 1 \}\)
For example, for \( p=7 \), the only difference between the two sets \( \mathbb{Z}_{7} \) and \( \mathbb{Z}_{7}^{*} \) is the integer 0:
The number of elements in \( \mathbb{Z}_{n}^{*} \) is denoted by \( \phi(n) \), named after the famous Swiss mathematician Euler, and is called Euler's phi function. As we saw previously, if \( n \) is a prime number \( p \), then \( \phi(p) = p-1 \). The number of elements in a group \( G \) is also called the order of \( G \) and is written as \( \left| G \right| \). The order of:
Euler's phi function \(\phi(n)\) can be computed efficiently when the prime factorization of \(n\) is known:
For a prime power \(p^k\): \(\phi(p^k) = p^k - p^{k-1} = p^k(1 - \frac{1}{p})\)
For a product of prime powers: \(\phi(n) = n \prod_{p|n} (1 - \frac{1}{p})\)
For example, \(\phi(36) = \phi(2^2 \cdot 3^2) = 36 \cdot (1 - \frac{1}{2}) \cdot (1 - \frac{1}{3}) = 36 \cdot \frac{1}{2} \cdot \frac{2}{3} = 12\)
If, for example, we choose the elements \( x \), \( a \), and \( b \) from the group \( \mathbb{Z}_{p}^{*} \) and want to compute \( x^{a + b} \: mod \: p \), then in the exponent we actually compute \( a + b \) modulo the order of the group, i.e., \( a + b \: mod \: (p-1) \) because \( \left| \mathbb{Z}_{p}^{*} \right| = \phi(p) = p-1 \). So, what we actually compute is \( x^{a + b \: mod \: (p-1)} \: mod \: p \). The same is true if we had chosen one of the other groups: we always compute modulo the order of the group in the exponent. Therefore, next time you look at an equation from a cryptosystem and wonder why they suddenly compute, for example, modulo \( p-1 \) instead of modulo \( p \), it is because the equation is used in the exponent of some integer.
Units are particularly important in cryptography because they have multiplicative inverses, allowing operations like division in modular arithmetic. The existence of these inverses is crucial for many cryptographic algorithms, including RSA, where we need to find values \(e\) and \(d\) such that \(e \cdot d \equiv 1 \pmod{\phi(n)}\).
Fermat's Little Theorem: If \(p\) is a prime number and \(a\) is not divisible by \(p\), then \(a^{p-1} \equiv 1 \pmod{p}\).
Euler's Theorem: For any coprime integers \(a\) and \(n\), \(a^{\phi(n)} \equiv 1 \pmod{n}\). This generalizes Fermat's Little Theorem and forms the mathematical foundation of the RSA cryptosystem.
Let \(m_1, m_2, \ldots, m_k\) be pairwise coprime positive integers (meaning \(\gcd(m_i, m_j) = 1\) for all \(i \neq j\)). Then for any given integers \(a_1, a_2, \ldots, a_k\), there exists a unique solution \(x\) modulo \(M = m_1 \cdot m_2 \cdot \ldots \cdot m_k\) to the system of congruences:
\begin{align} x &\equiv a_1 \pmod{m_1} \ x &\equiv a_2 \pmod{m_2} \ &\vdots \ x &\equiv a_k \pmod{m_k} \end{align}
Consider the system:
To solve this using CRT:
Therefore, \(x = 23\) is the solution to the system.
The Chinese Remainder Theorem has several important applications in cryptography:
Let \( n \) be the product of two large prime numbers \( p \) and \( q \), i.e. \( n = p \cdot q \). Then, if you know the value of \( n \) it's hard to compute \( p \) and \( q \) from \( n \).
The problem of computing \( p \) and \( q \) is called the prime factorization problem.
What makes the prime factorization problem difficult is that while multiplication is easy (computing \(n = p \cdot q\) is straightforward), the reverse operation of factoring has no known efficient classical algorithm. The computational complexity grows exponentially with the number of bits in \(n\).
For small numbers, factorization is trivial: If \(n = 15\), we can easily find \(p = 3\) and \(q = 5\).
However, when the primes are large (hundreds of digits), factoring becomes practically impossible with current classical computing methods. For example, factoring a 2048-bit RSA modulus would take billions of years using the best-known algorithms.
The security of the RSA cryptosystem fundamentally depends on the hardness of the prime factorization problem:
In 1994, Peter Shor developed a quantum algorithm that can factor integers in polynomial time. If large-scale quantum computers become available, they could break RSA encryption by efficiently solving the prime factorization problem. This threat has motivated research into post-quantum cryptography algorithms that aren't vulnerable to quantum attacks.
As of 2023, the largest RSA number factored using general methods is RSA-250 (829 bits), which was factored in February 2020. Current cryptographic recommendations suggest using RSA keys of at least 2048 bits for security through the next decade.
The security of an asymmetric (public-key) cryptosystem, such as RSA or ElGamal, is measured with respect to chosen plaintext attacks (CPA) and chosen ciphertext attacks (CCA).
In a chosen plaintext attack (sometimes called a semantic attack), Alice and Bob's adversary, Eve, is passive; she only observes the ciphertexts sent between Alice and Bob and tries to guess the plaintexts. An asymmetric cryptosystem is considered CPA-secure if Eve's advantage in correctly guessing the plaintext in the following game is negligible:
Eve sends a message \( m \) to an oracle, which returns a ciphertext \( c \) that is either an encryption of \( m \) or a random message of the same length as \( m \). Eve then tries to guess what \( c \) is an encryption of.
Therefore, if an asymmetric cryptosystem is CPA-secure, the only information that a ciphertext reveals is the length of the encrypted message.
Formally, Eve's "advantage" in these security games is defined as the absolute difference between 1/2 and the probability that Eve correctly guesses whether she received an encryption of her chosen message: |Pr[Eve wins] - 1/2|. We say this advantage is "negligible" if it decreases faster than any polynomial in the security parameter (typically the key length).
In a chosen ciphertext attack, Eve is active, making this a stronger attack compared to a chosen plaintext attack. Now, Eve can have ciphertexts decrypted by the oracle. An asymmetric cryptosystem is considered CCA-secure if Eve's advantage in correctly guessing the plaintext in the following game is negligible:
Eve sends a ciphertext \( c_{i} \) to an oracle, which returns the decrypted message \( m_{i} \) of \( c_{i} \). This can be repeated as many times as Eve wants. Then, Eve sends a message \( m \) to the oracle, which returns a ciphertext \( c \) that is either an encryption of \( m \) or a random message of the same length as \( m \). Finally, Eve sends a ciphertext \( c_{i}' \), which must be different from \( c \), to the oracle, and it returns the decrypted message \( m_{i}' \) of \( c_{i}' \). This can also be repeated as many times as Eve wants. Eve then tries to guess what \( c \) is an encryption of.
Again, if an asymmetric cryptosystem is CCA-secure, the only information that a ciphertext reveals is the length of the encrypted message.
In cryptographic security models, an "oracle" refers to a theoretical entity that performs certain cryptographic operations (like encryption or decryption) when queried. It represents the capability an attacker might have to interact with the cryptosystem.
Cryptographers further distinguish between two levels of CCA security:
CCA2 security is considered the gold standard for modern asymmetric encryption schemes.
The distinction between CPA and CCA security matters significantly in practice:
The security of digital signatures, such as RSA and ElGamal, is typically measured using the concept of Existential Unforgeability under Chosen Message Attack (EUF-CMA), which is analogous to CPA security for encryption schemes.
In a chosen plaintext attack, Alice and Bob's adversary, Eve, is passive; she only observes the messages and signatures exchanged between Alice and Bob and tries to forge a signature. A digital signature scheme is considered CPA-secure if Eve's advantage in forging a signature in the following game is negligible:
Eve sends a message \( m_{i} \) to an oracle (which represents the legitimate signer's capabilities), which returns its signature \( \sigma_{i} \). This process can be repeated as many times as Eve wishes. Finally, Eve outputs a message \( m \), which must be different from all previously queried messages \( m_{i} \), along with a signature \( \sigma \). Eve wins the game if \( \sigma \) is a valid signature for the message \( m \).
Cryptographers classify forgeries into several types:
Secure digital signatures are critical in many applications:
If a signature scheme is vulnerable to forgery attacks, these systems could be compromised.
Most secure signature schemes operate on hashed messages rather than raw messages. This serves two purposes:
The security of the signature scheme depends heavily on the security of the underlying hash function.
The RSA cryptosystem is named after its inventors Ron Rivest, Adi Shamir, and Leonard Adleman, who first described the algorithm in 1977. RSA is a public key cryptosystem based on the prime factorization problem. In this system, each person has a key pair \( (sk, pk) \), where \( sk \) is the secret key and \( pk \) is the public key. Given only the public key, one must solve the prime factorization problem to find the secret key.
Feature | RSA | Symmetric Encryption (e.g., AES) |
---|---|---|
Key Type | Public and private key pairs | Single shared secret key |
Encryption Speed | Slower | Faster |
Key Distribution | Public keys can be distributed openly | Requires secure channel for key exchange |
Security Basis | Prime factorization problem | Confusion and diffusion principles |
RSA serves as both an encryption scheme (explained in this section), which allows Alice and Bob to exchange sensitive information over an insecure channel eavesdropped by their adversary Eve, and as a digital signature scheme (explained in the next section), which enables them to create digital signatures.
In the encryption scheme, Alice first chooses two large secret prime numbers \( p \) and \( q \), and computes the product \( n = p \cdot q \). She then chooses a public encryption exponent \( e \) such that \( 3 \leq e \leq n-1 \) and \( \gcd(e, \phi(n)) = 1 \), where \( \phi(n) = (p-1) \cdot (q-1) \) is called Euler's phi function. The reason the public encryption exponent \( e \) must be relatively prime to \( \phi(n) \) (i.e., \( \gcd(e, \phi(n)) = 1 \)) is because Alice needs the corresponding secret decryption exponent \( d \), which is the inverse of \( e \). In the RSA cryptosystem, \( e \) only has an inverse when \( \gcd(e, \phi(n)) = 1 \).
Alice then publishes the public key \( pk = (n, e) \) and stores the secret key \( sk = (n, d) \). Notice that both Alice's friend Bob and their adversary Eve know the public key, but to compute the secret decryption exponent \( d \), they must find the prime factors \( p \) and \( q \) of \( n \) (i.e., solve the prime factorization problem), because \( d \) is uniquely determined by \( e \), \( p \), and \( q \) (as shown later).
RSA is also called a one-way trapdoor function because it is easy to compute the ciphertext \( c \) from the plaintext \( m \), but hard to do the reverse. However, with special information (the "trapdoor" information), which in this case is knowledge of the two prime numbers \( p \) and \( q \), it is easy to compute the plaintext \( m \) from the ciphertext \( c \).
Suppose Bob has a secret message he wants to send to Alice and wishes to hide the content from Eve. First, Bob converts the message into an integer \( m \) such that \( 1 \leq m \leq n-1 \), where \( m \) is called the plaintext. He then encrypts \( m \) using the RSA encryption algorithm \( E \) with the public key \( pk = (n, e) \) by computing \( c = E_{pk}(m) = m^{e} \: mod \: n \), where \( c \) is called the ciphertext, which he sends to Alice.
We have not yet discussed how Alice computes her secret decryption exponent \( d \) for the secret key \( sk = (n, d) \): Recall that Alice chooses the two prime numbers \( p \) and \( q \), so she can compute Euler's phi function \( \phi(n) = (p-1) \cdot (q-1) \). She then uses the extended Euclidean algorithm to compute \( d \). The extended Euclidean algorithm gives her the equation \( e \cdot \lambda + \phi(n) \cdot \mu = \gcd(e, \phi(n)) \), where \( d = \lambda \; mod \; \phi(n) \) is called the inverse of \( e \) because \( e \cdot d \: mod \: \phi(n) = 1 \) (remember that \( d \) only exists because \( \gcd(e, \phi(n)) = 1 \)). Thus, \( d \) is uniquely determined by \( e \), \( p \), and \( q \).
Now it is easy for Alice to decrypt the received ciphertext \( c \) using the RSA decryption algorithm \( D \) and the secret key \( sk = (n, d) \) by computing \( m' = D_{sk}(c) = c^{d} \: mod \: n \), where \( m' \) is equal to the original plaintext \( m \). We will prove this in the following: As described above, \( e \cdot d \: mod \: \phi(n) = 1 \) because \( d \) is the inverse of \( e \). This is equivalent to \( e \cdot d = 1 + k \cdot \phi(n) \) for some integer \( k \geq 1 \), and hence:
\( \eqalign{ m' &= c^{d} \: mod \: n &&(c = m^{e}) \\ &= (m^{e})^{d} \: mod \: n &&(\mbox{exponent rule}) \\ &= m^{e \cdot d} \: mod \: n &&(e \cdot d \: mod \: \phi(n) = 1+k \cdot \phi(n))\\&= m^{1+k \cdot \phi(n)} \: mod \: n &&(\mbox{see below})\\ &= m } \)
The step \( m^{1 + k \cdot \phi(n)} \: mod \: n = m \: mod \: n \) may need some explanation. First, notice that instead of solving modulo \( n \), we can solve modulo \( p \) and modulo \( q \) according to the Chinese Remainder Theorem, because \( n = p \cdot q \) (\( n \) is a product of prime numbers) and \( \gcd(p, q) = 1 \) (the factors \( p \) and \( q \) are relatively prime to each other). That is, we have to solve the two equations \( m^{1+k \cdot \phi(n)} \: mod \: p = m \: mod \: p \) and \( m^{1+k \cdot \phi(n)} \: mod \: q = m \: mod \: q \), where the proof for both cases is the same. Therefore, we only need to prove that \( m^{1+k \cdot \phi(n)} \: mod \: p = m \: mod \: p \). We have two cases: either \( m \: mod \: p = 0 \) (\( p \) divides \( m \)), or \( m \: mod \: p \neq 0 \). If \( m \: mod \: p = 0 \), then:
\( \eqalign{ m^{1+k \cdot \phi(n)} \: mod \: p &= m \: mod \: p &&(m \: mod \: p = 0) \\ 0^{1+k \cdot \phi(n)} \: mod \: p &= 0 \: mod \: p \\ 0 \: mod \: p &= 0 \: mod \: p } \)
Otherwise, if \( m \: mod \: p \neq 0 \), then \( \gcd(m, p) = 1 \), and Euler's Theorem tells us that \( m^{\phi(p)} \: mod \: p = 1 \) for all \( 1 \leq m \leq p-1 \), where \( \phi(p) = p-1 \):
\( \eqalign{ m^{1+k \cdot \phi(n)} \: mod \: p &= m \: mod \: p &&(\phi(n) = (p-1) \cdot (q-1) \\ m^{1+k \cdot (p-1) \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\phi(p) = (p-1) \\ m^{1+k \cdot \phi(p) \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{exponent rule}) \\ m^{1} \cdot m^{k \cdot \phi(p) \cdot (q-1)} \: mod \: p &= m \: mod \: p \\ m^{1} \cdot m^{\phi(p) \cdot k \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{exponent rule}) \\ m^{1} \cdot (m^{\phi(p)})^{k \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{Euler's Theorem}) \\ m^{1} \cdot 1^{k \cdot (q-1)} \: mod \: p &= m \: mod \: p \\ m \: mod \: p &= m \: mod \: p \\ } \)
The encryption algorithm \( E \) is most efficient when the public encryption exponent \( e \) is small, and similarly, the decryption algorithm \( D \) is most efficient when the secret decryption exponent \( d \) is small. However, because \( d \) is determined by the value of \( e \), Alice cannot choose both of them to be small, and \( e \) also has to be relatively prime to \( \phi(n) \), so it must be strictly greater than \( 2 \), i.e., \( e \geq 3 \). Therefore, the smallest possible value is \( e = 3 \). However, if Alice wants to send the same message \( m \) encrypted to, say, three different people, each with their own public key \( pk_{1} = (e, n_{1}) \), \( pk_{2} = (e, n_{2}) \), and \( pk_{3} = (e, n_{3}) \) where \( e = 3 \), Eve can easily compute the message \( m \) from the three different ciphertexts \( c_{1} = m^e \: mod \: n_{1} \), \( c_{2} = m^e \: mod \: n_{2} \), and \( c_{3} = m^e \: mod \: n_{3} \) by using the Chinese Remainder Theorem. Eve first computes a number \( t \) such that \( t = m^{3} \: mod \: (n_{1} \cdot n_{2} \cdot n_{3}) \), and then recovers the original message \( m \) by \( m = \sqrt[3]{t} \). Therefore, to be secure against this type of attack and still have fast encryption, the value \( e = 2^{16} + 1 = 65537 \) is often used as the encryption exponent, which is a prime number.
Basic "textbook" RSA has a significant security limitation: it's deterministic. This means encrypting the same message twice always produces the same ciphertext, which makes it vulnerable to several attacks:
For this reason, "textbook" RSA is neither CPA-secure nor CCA-secure. This is because Eve could first encrypt every possible message herself using the public key and then compare the results to the observed ciphertext. However, it is possible to use the "textbook" form of RSA to build a CPA- and CCA-secure version.
CCA-secure RSA
The CCA-secure version is called OAEP (Optimal Asymmetric Encryption Padding):
CPA-secure RSA
In the following, we will describe the CPA-secure version:
Alice first generates a key pair \( (sk, pk) \) as in the "textbook" version, sends the public key \( pk = (n,e) \) to Bob, and keeps the secret key \( sk = (n,d) \) private. One of the main differences between this version and the "textbook" version is that Bob can only send one encrypted bit to Alice; that is, the message is either \( b = 0 \) or \( b = 1 \). Bob encrypts the bit by first choosing a random integer \( m_{b} \) between \( 1 \) and \( n-1 \) such that the least significant bit of \( m_{b} \) is equal to \( b \), i.e., \( lsb(m_{b}) = b \). He then sends the ciphertext \( c = E_{pk}(m_{b}) \) to Alice, who decrypts it by computing \( m_{b} = D_{sk}(c) \) and then recovers \( b = lsb(m_{b}) \).
This CPA-secure version is, of course, not very efficient because it allows us to encrypt only one bit at a time. However, in practice, RSA is often used to encrypt keys for symmetric cryptosystems such as DES and AES, where the key is usually much shorter than \( n \). We can then place the key in the least significant end of the message to be encrypted, but since the keys are much shorter, we pad with random bits until we reach the required length. In the above CPA-secure version, the key was only one bit; however, it has been proven that it remains secure with keys of size up to \( \log_2(n) \) bits.
RSA is also a partially homomorphic cryptosystem because:
\( \eqalign{ E_{pk}(m_{1}) \cdot E_{pk}(m_{2}) \: mod \: n &= m_{1}^{e} \cdot m_{2}^{e} \: mod \: n &&(\mbox{exponent rule}) \\ &= (m_{1} \cdot m_{2})^{e} \: mod \: n \\ &= E_{pk}(m_{1} \cdot m_{2} \: mod \: n) } \)
In other words, multiplying two ciphertexts is equivalent to multiplying the two corresponding plaintexts. RSA is partially homomorphic, not fully homomorphic, because only multiplication has this property (and not addition).
This property is both an advantage and a disadvantage of the cryptosystem. For example, it is an advantage when Alice has some private data \( m_{1} \) on which she wants a cloud service to perform computations. She first computes \( c_{1} = E_{pk}(m_{1}) \) and then sends the ciphertext \( c_{1} \) to the cloud service. The cloud service can then perform computations on Alice's data by calculating \( c_{1} \cdot c_{2} \: mod \: n \) for some \( c_{2} = E_{pk}(m_{2}) \). Alice then receives and decrypts the resulting ciphertext, which returns the data \( m_{1} \cdot m_{2} \: mod \: n \). On the other hand, this property can be a disadvantage: for example, if Eve intercepts a ciphertext \( c_{1} \) sent from Alice to Bob and computes \( c_{1} \cdot c_{2} \: mod \: n \) for some \( c_{2} = E_{pk}(m_{2}) \), which she then sends to Bob. When Bob decrypts the ciphertext, he gets the plaintext \( m_{1} \cdot m_{2} \: mod \: n \) instead of the original plaintext \( m_{1} \).
Implementations of cryptographic algorithms are vulnerable to side-channel attacks, where an attacker exploits information gained from the physical implementation rather than weaknesses in the algorithm itself. One such attack against RSA is the timing attack, where Eve measures the time it takes for Alice to perform decryption operations and uses these timing differences to extract information about Alice's private key.
Timing attacks against RSA work because the time required to perform modular exponentiation during decryption depends on the value of the private key \( d \) and the ciphertext \( c \). By carefully analyzing how decryption time varies across different inputs, Eve can gradually reconstruct the secret key.
RSA blinding is a technique that prevents timing attacks by randomizing the ciphertext before decryption. Here's how it works: When Alice receives a ciphertext \( c \) that she wants to decrypt, she first generates a random value \( r \) such that \( 1 \leq r \leq n-1 \) and \( \gcd(r, n) = 1 \). She computes the "blinded ciphertext" \( c' = c \cdot r^e \: mod \: n \), where \( e \) is her public encryption exponent. Alice then performs the normal decryption on the blinded ciphertext: \( m' = (c')^d \: mod \: n \). Finally, she "unblinds" the result by computing \( m = m' \cdot r^{-1} \: mod \: n \), where \( r^{-1} \) is the inverse of \( r \).
Mathematically, we can prove that this process correctly decrypts the original ciphertext:
\( \eqalign{ m' &= (c')^d \: mod \: n \\ &= (c \cdot r^e)^d \: mod \: n &&(\mbox{definition of blinding}) \\ &= c^d \cdot (r^e)^d \: mod \: n &&(\mbox{exponent rule}) \\ &= c^d \cdot r^{ed} \: mod \: n \\ &= m \cdot r \: mod \: n &&(c^d \: mod \: n = m \text{ and } r^{ed} \: mod \: n = r) } \)
The last step uses the fact that \( r^{ed} \: mod \: n = r \) because \( ed \equiv 1 \: (mod \: \phi(n)) \), which means \( ed = 1 + k\phi(n) \) for some integer \( k \). Therefore, by Euler's theorem, \( r^{ed} = r^{1+k\phi(n)} = r \cdot (r^{\phi(n)})^k = r \cdot 1^k = r \: mod \: n \) (since \( \gcd(r, n) = 1 \)).
When Alice computes \( m = m' \cdot r^{-1} \: mod \: n = (m \cdot r) \cdot r^{-1} \: mod \: n = m \cdot (r \cdot r^{-1}) \: mod \: n = m \cdot 1 \: mod \: n = m \), she recovers the original plaintext.
The critical security benefit of RSA blinding is that the time required to decrypt \( c' \) is no longer correlated with the original ciphertext \( c \), since \( c' \) is randomized by \( r \). This randomization makes timing information useless to Eve because she cannot predict which random value \( r \) Alice will choose for each decryption. Even if Eve sends the same ciphertext multiple times, Alice will use a different random value \( r \) each time, resulting in different blinded ciphertexts and thus different computation patterns.
RSA blinding is an essential countermeasure in practical implementations of RSA to protect against timing attacks and should be incorporated into any security-critical application that uses RSA.
RSA is used in the TLS handshake process for secure web browsing, where it helps establish secure sessions for websites.
RSA is used in digital certificates (X.509) to verify the identity of websites and software publishers.
Key creation | |
Alice:
|
|
Alice sends the public key \( pk = (n, e) \) to Bob. | |
Encryption | |
Bob:
|
|
Bob sends the ciphertext \( c \) to Alice. | |
Decryption | |
Alice:
|
Try a demo of the encryption scheme here.
Alice chooses two secret prime numbers \( p = 277 \) and \( q = 317 \), and computes \( n = p \cdot q = 277 \cdot 317 = 87809 \). She then selects the public encryption exponent \( e = 19907 \) such that \( \gcd(e, \phi(n)) = 1 \), where \( \phi(n) = (p-1) \cdot (q-1) = (277-1) \cdot (317-1) = 87216 \). Finally, she publishes the public key \( pk = (n, e) = (87809, 19907) \).
Bob, a friend of Alice, finds Alice's public key because he wants to send her a secret message. First, he converts the message into the integer \( m = 12345 \), satisfying \( 1 \leq m \leq n-1 \). He then uses Alice's public key to compute the ciphertext \( c = E_{pk}(m) = m^{e} \: mod \: n = 12345^{19907} \: mod \: 87809 = 66975 \), which he sends to Alice.
Before Alice can decrypt the received ciphertext, she needs the secret decryption exponent \( d \), which is part of the secret key \( sk = (n, d) \). She uses the extended Euclidean algorithm, which gives her the equation \( \gcd(e, \phi(n)) = e \cdot \lambda + \phi(n) \cdot \mu = 19907 \cdot 18011 + 87216 \cdot (-4111) = 1 \), where the secret decryption exponent is \( d = \lambda \; mod \; \phi(n) = 18011 \; mod \; 87216 = 18011 \). Alice then uses the secret decryption exponent \( d = 18011 \) to compute the plaintext \( m' = D_{sk}(c) = c^{d} \: mod \: n = 66975^{18011} \: mod \: 87809 = 12345 \).
Two people, Samantha and Victor, use a digital signature scheme when one of them, say Samantha, needs to send a digitally signed piece of data, such as a document or message, to Victor, and it is important that Victor knows it is from Samantha. For example, Victor may have installed a program developed by Samantha's software company, and the company then publishes a new update. Before Victor installs the new update, he wants to make sure that the update is from Samantha's software company and not from the malicious hacker Eve, who tries to install viruses on people's computers disguised as updates from Samantha's software company.
One such digital signature scheme is the RSA signature scheme, which is more or less the same as the RSA encryption scheme. Therefore, for the correctness of the digital signature scheme, we refer to the description of the encryption scheme in the previous section.
Comparing RSA Encryption vs. RSA Signature
Aspect | RSA Encryption | RSA Signature |
---|---|---|
Public Key Used By | Sender (Bob/Anyone) | Receiver (Victor/Anyone) |
Private Key Used By | Receiver (Alice) | Signer (Samantha) |
Public Exponent | Encryption exponent (e) | Verification exponent (v) |
Private Exponent | Decryption exponent (d) | Signing exponent (s) |
Security Goal | Confidentiality | Authentication/Integrity |
In the signature scheme, Samantha first chooses two large secret prime numbers \( p \) and \( q \), and computes the product \( n = p \cdot q \). She then chooses the public verification exponent \( v \) such that \( 3 \leq v \leq n-1 \) and \( \gcd(v, \phi(n)) = 1 \), where \( \phi(n) = (p-1) \cdot (q-1) \) is called Euler's phi function. Next, she uses the extended Euclidean algorithm to compute the secret signing exponent \( s \). The extended Euclidean algorithm gives her the equation \( v \cdot \lambda + \phi(n) \cdot \mu = \gcd(v, \phi(n)) \), where \( s = \lambda \; mod \; \phi(n) \) is the inverse of \( v \) because \( v \cdot s \: mod \: \phi(n) = 1 \) (remember that in the RSA case, \( s \) only exists because \( \gcd(v, \phi(n)) = 1 \)). Samantha then publishes the public key \( pk = (n, v) \) and stores the secret key \( sk = (n, s) \).
As you may have noticed, the public verification exponent \( v \) corresponds to the public encryption exponent \( e \) in the encryption scheme, and likewise, the secret signing exponent \( s \) corresponds to the secret decryption exponent \( d \).
To sign a message \( m \), Samantha first computes the fingerprint \( \mathcal{H}(m) \) of the message using a publicly known hash function \( \mathcal{H} \), such that \( 1 \leq \mathcal{H}(m) \leq n-1 \). One property of a hash function is that, given the same input, it always returns the same output. Another property is that the input can be of arbitrary length, but the output will always have a fixed length. Therefore, no matter how large the message Samantha wants to sign, the signing algorithm will be equally efficient because Samantha signs the fingerprint \( \mathcal{H}(m) \) instead of the entire message \( m \).
After Samantha has computed the fingerprint \( \mathcal{H}(m) \) of the message \( m \), she signs the fingerprint using the RSA signing algorithm \( S \) with the secret signing key \( sk = (n, s) \), by computing \( \sigma = S_{sk}(\mathcal{H}(m)) = \mathcal{H}(m)^{s} \: mod \: n \), where \( \sigma \) is called the signature of \( \mathcal{H}(m) \). She then sends both the message \( m \) and the signature \( \sigma \) to Victor.
It is important that Samantha signs the fingerprint instead of the message itself, because otherwise Eve could easily forge a signature by exploiting the partially homomorphic property of the RSA cryptosystem (see the previous section about the homomorphic property of RSA). Assume that Samantha has computed the signatures \( \sigma_{1} = S_{sk}(m_{1}) \) and \( \sigma_{2} = S_{sk}(m_{2}) \) (note that these signatures are not of the fingerprints). If Eve intercepts the two messages \( m_{1} \) and \( m_{2} \) and their signatures \( \sigma_{1} \) and \( \sigma_{2} \) (when Samantha tries to send them to Victor), she can easily compute a valid signature for the message \( m_{1} \cdot m_{2} \: mod \: n \) by calculating \( \sigma_{1} \cdot \sigma_{2} \: mod \: n \), because:
\( \eqalign{ S_{sk}(m_{1}) \cdot S_{sk}(m_{2}) &= m_{1}^{s} \cdot m_{2}^{s} \: mod \: n &&(\mbox{exponent rule}) \\ &= (m_{1} \cdot m_{2})^{s} \: mod \: n \\ &= S_{sk}(m_{1} \cdot m_{2} \: mod \: n) } \)
However, if Samantha computes the signatures of the fingerprints instead, i.e., \( \sigma_{1} = S_{sk}(\mathcal{H}(m_{1})) \) and \( \sigma_{2} = S_{sk}(\mathcal{H}(m_{2})) \), then the signature \( \sigma_{1} \cdot \sigma_{2} \: mod \: n \) is no longer a valid signature for \( m_{1} \cdot m_{2} \: mod \: n \), because \( \mathcal{H}(m_{1}) \cdot \mathcal{H}(m_{2}) \neq \mathcal{H}(m_{1} \cdot m_{2}) \). This is a slight abuse of notation, but the key point is that the hash function is collision-resistant.
\( \eqalign{ S_{sk}(\mathcal{H}(m_{1})) \cdot S_{sk}(\mathcal{H}(m_{2})) &= \mathcal{H}(m_{1})^{s} \cdot \mathcal{H}(m_{2})^{s} \: mod \: n &&(\mbox{exponent rule}) \\ &= (\mathcal{H}(m_{1}) \cdot \mathcal{H}(m_{2}))^{s} \: mod \: n \\ &\neq (\mathcal{H}(m_{1} \cdot m_{2}))^{s} \: mod \: n \\ &= S_{sk}(\mathcal{H}(m_{1} \cdot m_{2} \: mod \: n)) } \)
Another possible attack by Eve, if Samantha signs messages instead of fingerprints, is for Eve to choose a random signature \( \sigma \) and then compute the message as \( \sigma^{v} \: mod \: n = m \) (recall that the public key is \( pk = (n, v) \)). Now, \( \sigma \) is a valid signature for \( m \); in other words, Eve has forged Samantha's signature. However, if a secure hash function \( \mathcal{H} \) is used, then \( \sigma^{v} \: mod \: n = \mathcal{H}(m) \), and it is hard for Eve to compute \( m \) from \( \mathcal{H}(m) \), i.e., to find a preimage.
Before Victor can verify the signature \( \sigma \) on the message \( m \), he uses the same hash function \( \mathcal{H} \) as Samantha to compute the fingerprint \( \mathcal{H}(m) \) of the message. Victor then verifies the signature by computing \( \sigma^{v} \: mod \: n \) and checking that it is equal to \( \mathcal{H}(m) \). The signature is valid only if \( \sigma^{v} \: mod \: n = \mathcal{H}(m) \).
Software developers sign their code to prove authenticity and prevent malware distribution.
Legal contracts and PDFs are digitally signed to verify their origin and prevent tampering.
CAs use digital signatures to issue certificates that verify website identities.
Key creation | |
Samantha:
|
|
Samantha sends the public key \( pk = (n, v) \) to Victor. | |
Signing | |
Samantha:
|
|
Samantha sends both the signature \( \sigma \) and the message \( m \) to Victor. | |
Verification | |
Victor:
|
Try a demo of the signature scheme here.
Samantha chooses two secret prime numbers \( p = 191 \) and \( q = 373 \), and computes \( n = p \cdot q = 191 \cdot 373 = 71243 \). She then selects the public verification exponent \( v = 28261 \) such that \( \gcd(v, \phi(n)) = 1 \), where \( \phi(n) = (p-1) \cdot (q-1) = (191-1) \cdot (373-1) = 70680 \). Finally, she publishes the public key \( pk = (n, v) = (71243, 28261) \).
Before Samantha signs the message \( m = \mbox{"This update is authorized by Samantha"} \), she computes the fingerprint \( \mathcal{H}(m) = 14556 \) of the message using a publicly known hash function \( \mathcal{H} \), such that \( 1 \leq \mathcal{H}(m) \leq n-1 \). She also computes the secret signing exponent \( s \) using the extended Euclidean algorithm. The extended Euclidean algorithm gives her the equation \( \gcd(v, \phi(n)) = v \cdot \lambda + \phi(n) \cdot \mu = 28261 \cdot 15421 + 70680 \cdot (-6166) = 1 \), where the secret signing exponent is \( s = \lambda \; mod \; \phi(n) = 15421 \; mod \; 70680 = 15421 \).
Samantha then uses the secret key \( sk = (n, s) = (71243, 15421) \) to compute the signature \( \sigma = S_{sk}(\mathcal{H}(m)) = \mathcal{H}(m)^{s} \: mod \: n = 14556^{15421} \: mod \: 71243 = 41455 \) of the fingerprint, which she sends to Victor together with the message \( m \).
Victor uses the same hash function \( \mathcal{H} \) as Samantha to compute the fingerprint \( \mathcal{H}(m) = 14556 \) of the message. He then uses Samantha's public key to verify that \( \mathcal{H}(m) = 14556 \) is equal to \( \sigma^{v} \: mod \: n = 41455^{28261} \: mod \: 71243 = 14556 \).