In cryptography, we often encrypt and sign messages that contain characters. However, asymmetric key cryptosystems (where Alice and Bob use different keys), such as RSA and ElGamal, are based on arithmetic operations on integers. Symmetric key cryptosystems (where Alice and Bob use the same key), such as DES and AES, are based on bitwise operations on bits (a bit is either 0 or 1 and is an abbreviation for binary digit). Therefore, we must convert characters into integers or bits before applying the cryptosystem to the message.
Because a computer can only handle bits, there already exists a method for converting characters into integers or bits, which uses the ASCII (American Standard Code for Information Interchange) table. A part of the ASCII table from Wikipedia is shown in the following table (note that the binary representation on the Wikipedia site uses only seven bits, but we use eight bits by prepending an extra 0):
Character
Decimal
Hexadecimal
Binary
(space)
32
20
00100000
!
33
21
00100001
\( \vdots \)
\( \vdots \)
\( \vdots \)
\( \vdots \)
A
65
41
01000001
B
66
42
01000010
\( \vdots \)
\( \vdots \)
\( \vdots \)
\( \vdots \)
a
97
61
01100001
b
98
62
01100010
\( \vdots \)
\( \vdots \)
\( \vdots \)
\( \vdots \)
For example, if Alice wants to send the message "Hey Bob!" encrypted to Bob using a symmetric key cryptosystem, she first converts it into its integer representation and then into its binary representation:
\( \eqalign{ H &\rightarrow 72 &&\rightarrow 01001000 \\ e &\rightarrow 101 &&\rightarrow 01100101 \\ y &\rightarrow 121 &&\rightarrow 01111001 \\ &\rightarrow 32 &&\rightarrow 00100000 \\ B &\rightarrow 66 &&\rightarrow 01000010 \\ o &\rightarrow 111 &&\rightarrow 01101111 \\ b &\rightarrow 98 &&\rightarrow 01100010 \\ ! &\rightarrow 33 &&\rightarrow 00100001 } \)
That is, the message "Hey Bob!" is represented with integers as "72 101 121 32 66 111 98 33" and in binary as "01001000 01100101 01111001 00100000 01000010 01101111 01100010 00100001".
While ASCII is sufficient for basic English text, modern cryptographic applications often need to handle international characters. Unicode standards like UTF-8 extend ASCII and can encode virtually all characters from all writing systems. In UTF-8, ASCII characters still use a single byte, while other characters may use 2-4 bytes.
Modern cryptographic algorithms typically operate on bytes (8-bit units) or words (groups of bytes). The binary representation shown above organizes characters into bytes, which serve as the fundamental unit for cryptographic operations in algorithms like AES, where data is processed in 128-bit (16-byte) blocks.
Notice the relationship between the different representations:
The decimal form is what computers process mathematically
The hexadecimal form (base 16) provides a more compact representation where each hex digit represents exactly 4 bits
The binary form (base 2) shows the actual bit pattern stored in memory
It's important to note that encoding (converting characters to numbers or binary) is not the same as encryption. Encoding is a standardized, reversible process with no secret key. Anyone knowing the encoding scheme can decode the message. Encryption, on the other hand, requires a key to reverse the process, providing actual security.
Related Encoding Schemes in Cryptography
Several other encoding schemes are commonly used in cryptographic applications:
Base64: Often used to encode binary data (like encryption keys or encrypted messages) into ASCII text for safe transmission over text-based channels
Hex encoding: Frequently used to represent hashes, keys, and other binary values in a human-readable format
Block padding: Many cryptographic algorithms operate on fixed-size blocks (e.g., AES uses 128-bit blocks), requiring messages to be padded to the appropriate length
The cryptography explained
Digital signatures are fundamental cryptographic primitives that serve three critical purposes: authentication (verifying who signed a message),
non-repudiation (preventing signers from denying their signatures), and integrity protection (detecting message tampering).
Unlike physical signatures, digital signatures are unique to each message and utilize mathematical techniques to cryptographically
bind a message to the signer's identity.
Traditional signature schemes such as RSA, DSA, and ECDSA rely on computational problems (factoring large numbers or solving discrete logarithms)
believed to be intractable for classical computers. However, Shor's algorithm running on a sufficiently powerful quantum computer could solve
these problems efficiently, rendering these signature schemes insecure. This looming threat has accelerated research into post-quantum
cryptography, with hash-based signatures emerging as one of the most promising approaches.
One-Time Signatures
One-time signature (OTS) schemes are cryptographic systems designed with a critical security constraint: each key pair must be used
exactly once. Reusing a key pair to sign multiple messages compromises the scheme's security fundamentally, allowing attackers to forge
signatures. Despite this significant limitation, OTS schemes offer compelling advantages:
Quantum resistance - Built primarily on cryptographic hash functions, which remain secure against known quantum algorithms
Mathematical simplicity - Based on straightforward cryptographic primitives with minimal mathematical complexity
Strong security proofs - Security can be directly reduced to the collision and preimage resistance of the underlying hash functions
Minimal assumptions - Requires only the existence of secure one-way functions, one of cryptography's most conservative assumptions
The primary challenge of OTS schemes lies in key management—generating, storing, and tracking single-use key pairs efficiently.
This limitation initially restricted their practical applications but led to the development of more versatile constructions.
Many-Time Signatures
Many-time signature schemes elegantly extend one-time signatures by allowing multiple signatures under a single public key.
The breakthrough concept, introduced by Ralph Merkle in 1979, uses tree structures (now known as Merkle trees) to authenticate
multiple one-time signature keys under a single master public key, dramatically improving practicality.
These schemes balance several crucial factors:
Long-term security - Maintaining resistance against both classical and quantum adversaries
Signature efficiency - Minimizing signature size while supporting authentication paths
Key management - Reducing storage requirements for public and private keys
Computational performance - Optimizing signature generation and verification algorithms
State management - Tracking which one-time keys have been used to prevent catastrophic reuse
Modern variants like XMSS (eXtended Merkle Signature Scheme) and SPHINCS+ represent the state of the art in hash-based
signatures and are among the most promising candidates for standardization in the post-quantum era.
The Lamport-Diffie one-time signature explained
Proposed by Leslie Lamport and developed with Whitfield Diffie in 1979, the Lamport-Diffie one-time signature scheme stands
as a foundational hash-based signature algorithm that remains secure against quantum attacks.
Core Concept
The fundamental insight of the Lamport-Diffie scheme is remarkably elegant: for each bit position in a message,
the signer prepares two secret values but only reveals one based on whether that bit is 0 or 1.
This selective disclosure creates a cryptographic commitment that cannot be altered without knowing the unrevealed secret values.
Key Generation
For a message length of \(m\) bits and a security parameter \(n\) (typically 256 bits):
Generate \(2m\) random \(n\)-bit values organized as pairs: \(x_{i,0}\) and \(x_{i,1}\) for each position \(i \in \{0, 1, ..., m-1\}\)
\(x_{i,0}\) will be used if the \(i\)th message bit is 0
\(x_{i,1}\) will be used if the \(i\)th message bit is 1
Apply a cryptographic hash function \(\mathcal{H}\) to each value: \(y_{i,j} = \mathcal{H}(x_{i,j})\) for all \(i \in \{0,...,m-1\}\), \(j \in \{0,1\}\)
The private key consists of all \(2m\) random values: \(sk = \{x_{0,0}, x_{0,1}, x_{1,0}, x_{1,1}, ..., x_{m-1,0}, x_{m-1,1}\}\)
The public key consists of all \(2m\) hash values: \(pk = \{y_{0,0}, y_{0,1}, y_{1,0}, y_{1,1}, ..., y_{m-1,0}, y_{m-1,1}\}\)
Signature Generation
To sign a message \(M = (M_0, M_1, ..., M_{m-1})\) where each \(M_i\) is a bit:
For each bit position \(i\), select the private key value \(x_{i,M_i}\) corresponding to the bit value
The signature is the sequence of these selected values: \(\sigma = (x_{0,M_0}, x_{1,M_1}, ..., x_{m-1,M_{m-1}})\)
Critical security note: Each key pair must be used exactly once, as reuse would reveal both \(x_{i,0}\) and \(x_{i,1}\) for some positions, allowing forgeries
Signature Verification
Given a message \(M\), a signature \(\sigma = (s_0, s_1, ..., s_{m-1})\), and public key \(pk\):
For each bit position \(i\) in the message:
Apply the hash function to the signature component: \(z_i = \mathcal{H}(s_i)\)
Verify that \(z_i\) matches the corresponding public key component \(y_{i,M_i}\)
If all hash values match, the signature is valid; otherwise, it's invalid
Security Considerations
The security of the Lamport-Diffie scheme is directly tied to:
Preimage resistance of the hash function: An attacker who wants to forge a signature for a 0 bit would need
to find a preimage for the corresponding hash value \(y_{i,0}\), which is computationally infeasible with secure hash functions
Strict one-time use: Using the same key pair to sign different messages reveals enough private key information
to allow an attacker to forge signatures for other messages
Hash function collision resistance: Although not strictly required for the basic security proof, collision resistance
provides additional security margin
The protocol
Key Generation
Samantha:
Chooses a secure hash function \(\mathcal{H}\) (e.g., SHA-256) with \(n\)-bit output
For a message length of \(m\) bits:
Generates \(2m\) random values \(x_{i,j}\) (each \(n\) bits long)
For \(i = 0\) to \(m-1\) and \(j \in \{0,1\}\): Computes \(y_{i,j} = \mathcal{H}(x_{i,j})\)
Private key: \(sk = \{x_{i,j}\}\) for all \(i,j\)
Public key: \(pk = \{y_{i,j}\}\) for all \(i,j\)
Samantha sends the public key \(pk = \{y_{i,j}\}\) to Victor.
Signing
Samantha:
Converts plaintext message to binary: \(M = (M_0, M_1, ..., M_{m-1})\) where each \( m_i \) is 0 or 1
For each bit position \(i = 0\) to \(m-1\):
Selects \(x_{i,M_i}\) based on the value of bit \(M_i\)
Forms signature \(\sigma = (x_{0,M_0}, x_{1,M_1}, ..., x_{m-1,M_{m-1}})\)
Sends message \(M\) and signature \(\sigma\) to Victor
Permanently destroys this key pair to prevent reuse
If all checks pass, accepts the signature as valid
If any check fails, rejects the signature as invalid
Example
Let's illustrate the Lamport-Diffie signature with a simplified example using a 4-bit message and a toy hash function.
Setup:
For simplicity, we'll use a toy 8-bit hash function \(\mathcal{H}\) that just takes the first 8 bits of SHA-256.
Key Generation:
Samantha generates 8 random values (2 for each bit position):
\(x_{0,0}\) = 0x42 | \(y_{0,0}\) = \(\mathcal{H}(x_{0,0})\) = 0xA1 (for bit 0, value 0)
\(x_{0,1}\) = 0x17 | \(y_{0,1}\) = \(\mathcal{H}(x_{0,1})\) = 0xB8 (for bit 0, value 1)
\(x_{1,0}\) = 0xF3 | \(y_{1,0}\) = \(\mathcal{H}(x_{1,0})\) = 0x72 (for bit 1, value 0)
\(x_{1,1}\) = 0x8D | \(y_{1,1}\) = \(\mathcal{H}(x_{1,1})\) = 0xE9 (for bit 1, value 1)
\(x_{2,0}\) = 0x3A | \(y_{2,0}\) = \(\mathcal{H}(x_{2,0})\) = 0x5F (for bit 2, value 0)
\(x_{2,1}\) = 0xC4 | \(y_{2,1}\) = \(\mathcal{H}(x_{2,1})\) = 0x9D (for bit 2, value 1)
\(x_{3,0}\) = 0x56 | \(y_{3,0}\) = \(\mathcal{H}(x_{3,0})\) = 0x37 (for bit 3, value 0)
\(x_{3,1}\) = 0xE2 | \(y_{3,1}\) = \(\mathcal{H}(x_{3,1})\) = 0xD4 (for bit 3, value 1)
Private key: All x values
Public key: All y values (0xA1, 0xB8, 0x72, 0xE9, 0x5F, 0x9D, 0x37, 0xD4)
Signing:
To sign the message \(M = 1010\) (binary), Samantha selects:
For bit 0 (1): \(x_{0,1}\) = 0x17
For bit 1 (0): \(x_{1,0}\) = 0xF3
For bit 2 (1): \(x_{2,1}\) = 0xC4
For bit 3 (0): \(x_{3,0}\) = 0x56
Signature: (0x17, 0xF3, 0xC4, 0x56)
Verification:
Victor verifies the signature:
For bit 0 (1): \(\mathcal{H}(0x17)\) = 0xB8 \(\stackrel{?}{=}\) \(y_{0,1}\) = 0xB8
For bit 1 (0): \(\mathcal{H}(0xF3)\) = 0x72 \(\stackrel{?}{=}\) \(y_{1,0}\) = 0x72
For bit 2 (1): \(\mathcal{H}(0xC4)\) = 0x9D \(\stackrel{?}{=}\) \(y_{2,1}\) = 0x9D
For bit 3 (0): \(\mathcal{H}(0x56)\) = 0x37 \(\stackrel{?}{=}\) \(y_{3,0}\) = 0x37
Since all hash values match, the signature is valid.
Security Note:
If Samantha uses the same key pair to sign another message, she would reveal more private key values.
For example, signing 1011 would reveal \(x_{3,1}\), allowing an attacker to forge signatures for any message
that starts with 101, undermining the scheme's security.
The Merkle many-time signature explained
In 1979, Ralph Merkle introduced a many-time signature scheme that transformed the limited one-time signatures into practical authentication systems through a hierarchical structure now universally known as the Merkle tree—a fundamental data structure that has since become essential across blockchain, distributed systems, and cryptography.
Core Concept
The Merkle signature scheme elegantly solves the critical limitation of one-time signatures—their single-use restriction—by creating a system where multiple messages can be signed under a single master public key while maintaining quantum resistance:
Generating a large pool of one-time signature key pairs (such as Lamport-Diffie pairs)
Binding these keys cryptographically in a binary hash tree (Merkle tree) structure
Publishing only a single compact master public key (the tree root)
Including efficient authentication paths with each signature to validate individual one-time keys
Merkle Tree Structure
The Merkle tree provides a cryptographic commitment to many values using a single hash and is constructed as follows:
Leaf nodes: Each leaf contains the hash of a one-time public key, anchoring it to the tree
Internal nodes: Each internal node is computed as the hash of its two child nodes' concatenation: \(parent = \mathcal{H}(left\_child \| right\_child)\) where \(\mathcal{H}\) is a hash function
Root node: The single root hash serves as the compact master public key, cryptographically committing to all one-time keys
Authentication paths: For each leaf, the logarithmic-sized path provides the minimum information needed to verify its inclusion in the tree
Key Generation
For a tree of height \(h\) supporting \(2^h\) signatures:
Compute leaf nodes \(L_i = \mathcal{H}(pk_i)\) where \(\mathcal{H}\) is a cryptographic hash function
Build the tree bottom-up:
Level \(h\): Place leaf nodes \(L_0\) through \(L_{2^h-1}\)
Level \(j\) (for \(j\) from \(h-1\) down to 0): Compute each node as \(N_{j,k} = \mathcal{H}(N_{j+1,2k} \| N_{j+1,2k+1})\)
Continue until computing the single root node at level 0
The private key comprises all one-time secret keys \(sk_i\) and the tree structure
The public key is simply the root hash \(N_{0,0}\) of the Merkle tree
Signature Generation
To sign the \(i\)-th message:
Use the \(i\)-th one-time key pair \((sk_i, pk_i)\) to generate signature \(\sigma_i\) using the underlying one-time scheme
Extract the authentication path from leaf \(i\) to the root:
For each level, include the sibling hash of the node on the path from leaf \(i\) to the root
This provides the verifier with exactly the \(h\) additional hash values needed to recompute the root
The complete signature consists of:
The index \(i\) identifying which one-time key is being used
The one-time public key \(pk_i\) (enabling verification of the one-time signature)
The one-time signature \(\sigma_i\) of the message
The authentication path proving \(pk_i\)'s inclusion in the tree
Critically, mark the \(i\)-th key pair as used to prevent catastrophic reuse
Signature Verification
To verify a signature, the verifier:
Verifies the one-time signature \(\sigma_i\) against the provided one-time public key \(pk_i\)
Computes the leaf node value \(L_i = \mathcal{H}(pk_i)\)
Reconstructs the root hash using the authentication path:
Uses the index \(i\)'s binary representation to determine left/right positions at each level
For each level \(j\) from \(h-1\) to 0, computes the parent by hashing the current value with the authentication path node in the correct order (determined by the bit value of \(i\) at that level)
Compares the computed root hash with the signer's public key
Accepts the signature as valid only if both the one-time signature verification and the root hash comparison succeed
Advantages and Practical Considerations
The Merkle signature scheme offers several significant advantages:
Post-quantum security: Based solely on hash functions, resistant to Shor's and other known quantum algorithms
Strong security foundation: Security reduces directly to collision and preimage resistance of the hash function
Flexible signature capacity: Tree height can be adjusted based on application needs
Logarithmic authentication overhead: Authentication paths grow only logarithmically with the number of signatures
Important practical considerations include:
State management: Maintaining the signing state is critical—key reuse creates exploitable forgeries
Storage requirements: Efficient implementations can reduce the \(O(2^h)\) private key storage through tree traversal techniques
Modern variants: XMSS (eXtended Merkle Signature Scheme) and SPHINCS+ have refined this approach with hypertrees and few-time signatures
Standardization: XMSS is standardized in RFC 8391, and SPHINCS+ is a finalist in the NIST post-quantum cryptography standardization process
This elegant construction transforms the theoretical one-time signatures into practical, quantum-resistant signature schemes by using tree structures to efficiently authenticate many one-time keys—a testament to Merkle's influential contribution to cryptography.
The protocol
Key Generation
Samantha:
Selects a cryptographic hash function \(\mathcal{H}\) and tree height \(h\)
Generates \(2^h\) one-time signature key pairs \((sk_i, pk_i)\) for \(i = 0\) to \(2^h-1\)
Constructs a binary Merkle tree:
Computes leaf nodes: \(L_i = \mathcal{H}(pk_i)\) for each one-time public key
For each level \(j\) from \(h-1\) down to 0: Computes each node as \(N_{j,k} = \mathcal{H}(N_{j+1,2k} \| N_{j+1,2k+1})\)
The root of the tree \(N_{0,0}\) becomes the public key
The private key consists of all one-time key pairs and the tree structure
Samantha sends only the public key \(N_{0,0}\) to Victor.
Signing
Samantha:
To sign the \(i\)-th message \(M\):
Uses the \(i\)-th one-time key pair \((sk_i, pk_i)\) to create signature \(\sigma_i\) of \(M\)
Identifies index \(i\) in binary form to determine path through tree
Extracts authentication path \(Auth_i\) from leaf \(i\) to root:
For each level, includes the sibling node of the path node
Forms the complete signature \(S = (i, pk_i, \sigma_i, Auth_i)\)
Sends message \(M\) and signature \(S\) to Victor
Marks the \(i\)-th key pair as used to prevent reuse
The computed root \(0x8A\) matches Samantha's public key, confirming the signature's validity
Signing (Second Message):
For message \(M_2\), Samantha uses the second key pair \((sk_1, pk_1)\):
She creates a one-time signature \(\sigma_1\) using \(sk_1\)
Index \(i=1\) in binary is "01"
The authentication path for leaf 1 consists of:
Level 2: \(L_0 = \text{0xA3}\) (sibling of \(L_1\))
Level 1: \(N_{1,1} = \text{0xF2}\) (sibling of \(N_{1,0}\))
The complete signature is \(S_2 = (1, pk_1, \sigma_1, [0xA3, 0xF2])\)
Security Note:
Samantha can continue signing up to 4 messages with this tree. The authentication paths allow verification
without revealing the entire tree, while the tree structure ensures that each one-time key is authenticated
by the master public key. This elegant construction enables many-time signatures while maintaining the quantum
resistance of the underlying one-time signature scheme.