How Does Bitcoin Use Cryptography?

Introduction

In the previous articles, I described what Bitcoin is and why it was created. Now let's look at how it actually works, starting with its foundation: cryptography.

Cryptography has been around for thousands of years and is a subdomain of mathematics. Its goal is to protect information from manipulation and unauthorized access. It originated in military and diplomatic contexts, but with the rise of digital technology, it became essential to everyday life: from online banking to messaging apps to decentralized networks like Bitcoin.

There will be no math in this article. I will only explain what is relevant to understand the upcoming articles. To understand what cryptography does for Bitcoin, let's first look at what cryptography is designed to do in general.

The Four Goals of Cryptography

Cryptography has four main goals: confidentiality, integrity, authenticity, and non-repudiation.

Let me explain these with a classic example. Alice wants to send Bob a message over an insecure channel, like the internet. Eve can observe and copy the message as it travels through the channel. Mallory might try to alter it along the way.

💡
Alice, Bob, Eve, and Mallory are standard characters in cryptography. Alice and Bob want to communicate. Eve (from "eavesdropper") listens. Mallory (from "malicious") tries to interfere.

Alice encrypts her message before sending it. Eve can see the encrypted data passing through the channel, but she cannot understand its content. This is confidentiality.

When Bob receives the message, he can verify that Mallory has not altered it during transmission. This is integrity.

He can also verify that the message actually came from Alice, not from someone pretending to be her. This is authenticity.

And once Bob has verified all of this, Alice cannot deny having sent the message. This is non-repudiation.

📌
The four goals of cryptography are confidentiality, integrity, authenticity, and non-repudiation. Together, they form the foundation of secure digital communication.

What Bitcoin Uses

Not all four of these goals are relevant to Bitcoin. All data on the Bitcoin network is public. Nothing is encrypted, and anyone can verify everything. Therefore, confidentiality is not a goal of Bitcoin.

What Bitcoin does use is integrity and authenticity. Hash functions ensure that data has not been tampered with. Key pairs make it possible to prove, through digital signatures, that the holder of a specific key authorized a piece of data. I will explain both concepts in this article.

Non-repudiation follows naturally from authenticity: once data is signed and confirmed, the sender cannot deny having authorized it.

📌
Bitcoin is a fully public system. It does not need confidentiality. Instead, it relies on integrity and authenticity to ensure that data has not been changed and was authorized by the rightful owner.

Hash Functions

Let's start with hash functions.

💡
A function is a mathematical rule that takes an input and produces an output. A hash function is a specific kind of function designed for data processing.

Think of a hash function like a fingerprinting machine. You feed it any piece of data, whether a word, a sentence, or an entire book, and it produces a fixed-size fingerprint. This fingerprint is effectively unique to that input, just as a human fingerprint is effectively unique to a person.

The internals of a cryptographic hash function are not important for understanding Bitcoin. However, you need to understand its properties to see why it makes Bitcoin secure. I will explain these properties with examples and provide a short definition at the end.

Bitcoin uses a hash function called SHA-256. The number 256 refers to the size of the output: 256 bits. When written in hexadecimal notation (using the characters 0–9 and a–f), this translates to a string of exactly 64 characters. This is a much shorter version of a large binary number, and easier for humans to read. Internally it is handled as a binary number.
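If you prefer to see this in code, here is a minimal sketch using Python's standard hashlib module. The input "Bitcoin" and the variable names are only illustrative. It shows the same output in three forms: as raw bytes, as hexadecimal text, and as one large number.

```python
import hashlib

# Hash an arbitrary example input (the word itself is not important here).
digest = hashlib.sha256(b"Bitcoin").digest()  # raw binary output

hex_text = digest.hex()                      # human-readable form
as_integer = int.from_bytes(digest, "big")   # the same value as a number

print(len(digest) * 8)        # size in bits
print(len(hex_text))          # size in hexadecimal characters
print(as_integer < 2 ** 256)  # the number always fits into 256 bits
```

Each hexadecimal character encodes 4 bits, which is why 256 bits become exactly 64 characters.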

Let's go through these properties one by one.

💡
You can verify every example yourself using an online SHA-256 calculator like https://sha-generator.com.

Deterministic

Here is a simple example. I pass the word Hello to the SHA-256 hash function and get the following output:

185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969

What happens if I pass the same word again? I get exactly the same output. A hash function is deterministic: for a given input, it always produces the same output.
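You can also check this with a few lines of Python instead of an online calculator. This is a small sketch using the standard hashlib module; the helper name sha256_hex is my own choice, and the text is assumed to be encoded as UTF-8.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("Hello")
second = sha256_hex("Hello")

print(first)
print(first == second)  # True: same input, same output
```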

Avalanche Effect

So does this mean that if I change even a single character, I will get a different output? Exactly. And not just slightly different. Completely different.

Let's change the last character from lowercase o to uppercase O and pass HellO to the function:

4ff7975b53db6c029d88f6ac67bd78d12fed72cdb2e252a26556d594b87bc9d8

Compare this output with the previous one. You will not find any similarity between them. This is the so-called avalanche effect: if you change even a single bit of the input, the output changes beyond recognition. On average, about half of the output bits flip. To a human, the output appears to be a random string of characters. But remember, it is not random. The same input always produces the same output. The apparent randomness is deliberate: the output should reveal nothing about the input, so that no one can derive the input from the output.

This is why a hash function is also called a one-way function. You can calculate the output from the input, but not the other way around.
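The avalanche effect described above can be measured. The following sketch, again with Python's hashlib, counts how many of the 256 output bits differ between the hashes of Hello and HellO. The helper name is illustrative. For a well-designed hash function, the count typically lands close to half of the bits.

```python
import hashlib


def sha256_int(text: str) -> int:
    """Return the SHA-256 hash of a text as one large integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


a = sha256_int("Hello")
b = sha256_int("HellO")

# XOR leaves a 1 wherever the two outputs disagree; count those positions.
differing_bits = bin(a ^ b).count("1")
print(f"{differing_bits} of 256 output bits differ")
```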

Fixed Output Size

Until now, the input was always a single word with the same number of characters. What happens with a longer string? Let's pass Hello World to the function:

a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e

Compare this with the two outputs above. All three are exactly 64 characters long, the fixed output size of SHA-256. That is another property of hash functions: the output is always the same size, regardless of the input. It does not matter if you provide a single character or the complete works of William Shakespeare. The output will always be the same size.
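Here is a short sketch that checks this property in Python. The list of inputs is arbitrary: an empty text, a single character, the example from above, and a text of one million characters.

```python
import hashlib

inputs = ["", "a", "Hello World", "x" * 1_000_000]

for text in inputs:
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    print(f"input length {len(text):>9} -> output length {len(digest)}")

# Collect all output lengths; only one distinct value should remain.
lengths = {len(hashlib.sha256(t.encode("utf-8")).hexdigest()) for t in inputs}
print(lengths)
```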

Speed

Hash functions are extremely fast. The computation time grows with the size of the input, but it stays tiny even for large inputs: on a typical computer, hashing 10 characters or a million characters takes only a fraction of a second.

This speed is essential. It allows anyone to quickly verify data. It also plays a key role in the mining process, the mechanism that secures the entire network. I will explain how in a later article.
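If you want to get a feeling for the speed on your own machine, a rough measurement could look like the sketch below. The function name time_hash is illustrative, the exact numbers depend on your hardware, and a single run is not a careful benchmark.

```python
import hashlib
import time


def time_hash(data: bytes) -> float:
    """Return the seconds needed to compute one SHA-256 hash of the data."""
    start = time.perf_counter()
    hashlib.sha256(data).hexdigest()
    return time.perf_counter() - start


small = b"x" * 10
large = b"x" * 1_000_000

print(f"10 bytes:        {time_hash(small):.6f} s")
print(f"1,000,000 bytes: {time_hash(large):.6f} s")
```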

Collision Resistance

Earlier, I said that a hash function produces a fingerprint that is effectively unique to its input. Why "effectively unique" and not simply "unique"?

A hash function maps any input, no matter how large, to an output of fixed size. The number of possible inputs is practically unlimited, but the number of possible outputs is finite. For SHA-256, there are 2^256 possible outputs. Because there are more possible inputs than possible outputs, different inputs that produce the same output must exist. Such a case is called a collision.

So why do we treat the output as effectively unique? Because 2^256 is an astronomically large number. It is approximately 1.16 × 10^77, a number with 78 digits. That is within a few orders of magnitude of the estimated number of atoms in the observable universe (around 10^80). Finding two inputs that produce the same SHA-256 output by searching or by chance is so unlikely that it is considered computationally infeasible. No one has ever found a SHA-256 collision, and with current technology, no one is expected to.
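Python can handle numbers of this size directly, so the scale is easy to inspect. The sketch below also estimates the effort of a generic collision search (the so-called birthday attack), which is expected to need on the order of 2^128 hash calculations. The rate of 10^18 hashes per second is purely a hypothetical assumption for illustration, not a measured figure.

```python
outputs = 2 ** 256

print(outputs)            # the full number
print(len(str(outputs)))  # 78 digits
print(f"{outputs:.3e}")   # scientific notation

# Rough effort of a generic collision search (birthday bound).
birthday_work = 2 ** 128
hashes_per_second = 10 ** 18  # hypothetical rate, chosen only for illustration
seconds_per_year = 365.25 * 24 * 3600

years = birthday_work / hashes_per_second / seconds_per_year
print(f"about {years:.1e} years at the assumed rate")
```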

This property is called collision resistance. It does not mean collisions are impossible. It means they are so hard to find that, effectively, every input has its own unique fingerprint.
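To see that collisions really exist and that only the size of the output protects us, you can weaken the hash on purpose. The sketch below keeps just the first 4 hexadecimal characters (16 bits) of SHA-256. This tiny_hash is a toy of my own and not something Bitcoin uses. With only 65,536 possible outputs, a collision shows up after a few hundred attempts.

```python
import hashlib


def tiny_hash(text: str) -> str:
    """Toy hash: only the first 4 hex characters (16 bits) of SHA-256."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:4]


def find_collision():
    """Try message-0, message-1, ... until two inputs share a fingerprint."""
    seen = {}
    counter = 0
    while True:
        text = f"message-{counter}"
        fingerprint = tiny_hash(text)
        if fingerprint in seen:
            return seen[fingerprint], text, fingerprint
        seen[fingerprint] = text
        counter += 1


first, second, fingerprint = find_collision()
print(f"{first!r} and {second!r} both map to {fingerprint}")
```

The same search against the full 256-bit output would follow exactly the same logic. It would just never finish in practice.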

Definition

In summary: a cryptographic hash function is a fast, deterministic, one-way function that maps any input to an effectively unique, fixed-size output.

📌
Whenever you encounter a hash in the upcoming articles, remember: it is a fingerprint that makes any change to the original data detectable.

Hash functions give Bitcoin integrity: the assurance that data has not been changed. But integrity alone is not enough. The network also needs to verify who authorized a transaction. That is the job of key pairs.

Key Pairs

Think of a key pair like a personal seal. In the past, a king would press his seal ring into wax to prove that a document came from him. Anyone could inspect the seal to verify its authenticity, but only the king could create it. As we will see shortly, a digital signature works the same way.

Symmetric vs. Asymmetric Cryptography

To understand key pairs, it helps to know why they were invented. Cryptography can be divided into two major areas: symmetric and asymmetric cryptography. The main difference lies in the number of keys used.

In symmetric cryptography, there is only one key. Alice uses it to encrypt a message, and Bob uses the same key to decrypt it. The problem is obvious: Alice needs to get the key to Bob somehow, without anyone else intercepting it. This is called the key exchange problem, and for a long time, it was a fundamental limitation of cryptography.
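To make the idea of a single shared key concrete, here is a toy sketch in Python. It combines the message with a random key of the same length using XOR (a one-time pad). I chose this only because it fits in a few lines; it is not a cipher you would use in this form, it is unrelated to Bitcoin, and all names are illustrative.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine data and key byte by byte; applying it twice restores the data."""
    if len(data) != len(key):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Meet me at noon"
shared_key = secrets.token_bytes(len(message))  # Alice and Bob both need this

ciphertext = xor_bytes(message, shared_key)     # Alice encrypts
recovered = xor_bytes(ciphertext, shared_key)   # Bob decrypts with the same key

print(ciphertext.hex())
print(recovered)
```

Notice that nothing works unless Bob already has shared_key. Getting it to him safely is exactly the key exchange problem.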

Asymmetric cryptography solves this problem by using two keys instead of one.

💡
Asymmetric cryptography was introduced in 1976 by Whitfield Diffie and Martin Hellman. Their paper "New Directions in Cryptography" is considered one of the most influential publications in the history of computer science.

Private Key and Public Key

In asymmetric cryptography, Alice generates a key pair: a private key and a public key. The private key is a very large, randomly generated number. The public key is mathematically derived from it. As the names suggest, the private key must always remain secret, while the public key can be shared with anyone.

Here is an example of a real key pair, generated for demonstration purposes only:

Private key: 66bc9cb9c459f9851248a879c1b9d78af26bc16c32a42c57909b62a0a907c277

Public key: 023cb05cd13d1504d273afaa5d0e8168573debb01a196bf4b9a7f0b7f6852ec755

You might notice that the private key looks similar to the hash outputs from earlier: 64 hexadecimal characters, representing a very large number. The public key is slightly longer, 66 characters, because it starts with a one-byte prefix (here 02) that tells software how to reconstruct the complete key from this compact form. The derivation itself is a mathematical process that works in only one direction: you can calculate the public key from the private key, but not the other way around. Just like a hash function, it is a one-way operation.
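For readers who want to look under the hood, here is a sketch of how such a derivation can be done in Python (3.8 or newer). This article does not name the mathematics involved; the sketch assumes the elliptic curve secp256k1, which is the curve Bitcoin uses, and the compressed key format with a 02 or 03 prefix. The function names are my own, and the code is written for readability, not for use with real funds.

```python
# secp256k1 curve parameters (y^2 = x^3 + 7 over the integers modulo P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return x3, y3


def derive_public_key(private_key: int) -> str:
    """Multiply the generator G by the private key (double-and-add)."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    result, addend, k = None, G, private_key
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    x, y = result
    prefix = "02" if y % 2 == 0 else "03"  # records whether y is even or odd
    return prefix + format(x, "064x")


print(derive_public_key(1))  # private key 1 yields the generator point itself
```

Going forward takes a few hundred additions. Going backward, from the public key to the private key, has no known efficient method, which is what makes the operation one-way in practice.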

This is the crucial difference from symmetric cryptography: Alice can share her public key openly, even over an insecure channel like the internet. It does not matter if Eve sees it. Security does not depend on keeping the public key secret. It depends entirely on keeping the private key secret.

⚠️
Whoever has the private key has full control. If it is lost, access is gone. If it is stolen, the thief has the same control as the original owner. There is no recovery mechanism and no one to call for help.

💡
Asymmetric cryptography can also be used for encryption, but Bitcoin does not use this capability. All data on the Bitcoin network is public. What Bitcoin uses is the other ability of key pairs: digital signatures.

Digital Signatures

With her private key, Alice can create a digital signature for a piece of data. This signature is cryptographic evidence that the corresponding private key was used to authorize that specific data. Anyone who has Alice's public key can verify the signature. If the verification succeeds, two things are proven: the data was authorized by Alice (authenticity), and it has not been altered since she signed it (integrity).

This is exactly how the seal ring works. Only the king can press his ring into the wax, but anyone can inspect the seal to confirm it is genuine. If the seal is intact, the document has not been tampered with.
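The signing and verifying steps described above can be sketched in Python (3.8 or newer). This article does not name a signature algorithm; the sketch assumes ECDSA over the curve secp256k1, a scheme Bitcoin uses, and it simplifies the message handling to a single SHA-256 hash. All function names are my own, and the code is meant for understanding only, not for protecting real keys.

```python
import hashlib
import secrets

# secp256k1 curve parameters (y^2 = x^3 + 7 over the integers modulo P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return x3, y3


def point_multiply(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    """Create a signature (r, s); only the private key holder can do this."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh secret number per signature
        r = point_multiply(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message: bytes, signature) -> bool:
    """Check a signature using only the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_hash(message)
    w = pow(s, -1, N)
    point = point_add(point_multiply(z * w % N, G),
                      point_multiply(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1
public_key = point_multiply(private_key, G)
message = b"Alice pays Bob 1 BTC"
signature = sign(private_key, message)

print(verify(public_key, message, signature))                  # True
print(verify(public_key, b"Alice pays Bob 9 BTC", signature))  # False
```

The second check fails because the signature is bound to the exact data that was signed. That is the integrity part. The first check succeeds only for the matching public key. That is the authenticity part.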

💡
Private key: only known to the owner. Used to create digital signatures.
Public key: known to everyone. Used to verify digital signatures.

How does Bitcoin use digital signatures to authorize transactions? And what exactly is the blockchain? I will answer these questions in the next article.

Conclusion

Bitcoin's security rests on two cryptographic foundations. Hash functions provide integrity: even the smallest modification to any piece of data is immediately detectable. Key pairs provide authenticity: a digital signature is cryptographic evidence that a specific private key was used to authorize a piece of data, without revealing the key itself. Together, these foundations make it possible to verify everything and minimize the need for trust. That is exactly what a system without a central authority requires.

In the next article, I will explain how Bitcoin applies these concepts to record and verify transactions on the blockchain.

If you have questions or feedback, feel free to reach out. And if you found this article helpful, consider sharing it with others who want to understand Bitcoin.