← Back

How SHA-256 Works

Table of Contents

Preprocessing

SHA-256 requires that the input data can be split into 512-bit blocks. If the input data doesn't fulfill this requirement, SHA-256 will first add padding.

The input message is padded so its length becomes a multiple of 512 bits. The padding consists of a 1 bit, followed by enough 0 bits, followed by a 64-bit representation of the original message length.

For example, the string "hello" (40 bits) is padded to a full 512-bit block:

$$ \underbrace{\text{message}}_{\text{40 bits}} \| 1 \| \underbrace{0 \cdots 0}_{\text{407 zeros}} \| \underbrace{\text{length}}_{\text{64 bits}} = 512 \text{ bits} $$

Parsing

After padding, the message is parsed into \(N\) 512-bit blocks, denoted \(M^{(1)}, \ldots, M^{(N)}\).

Each block can be split into sixteen 32-bit words. We denote the \(t\)-th 32-bit word in block \(i\) as \(M_t^{(i)}\) with \(t \in \{0, \ldots, 15\}\).

Setting the Constants

Before the hashing process, we have to initialize the required constants.

Setting the Initial Hash Value \(H^{(0)}\)

\(H^{(0)}\) consists of eight 32-bit hexadecimal words. These words were obtained by taking the first thirty-two bits of the fractional parts of the square roots of the first eight prime numbers.

$$ \begin{aligned} H_0^{(0)} &= \text{6a09e667} \quad (\sqrt{2}) \\ H_1^{(0)} &= \text{bb67ae85} \quad (\sqrt{3}) \\ H_2^{(0)} &= \text{3c6ef372} \quad (\sqrt{5}) \\ H_3^{(0)} &= \text{a54ff53a} \quad (\sqrt{7}) \\ H_4^{(0)} &= \text{510e527f} \quad (\sqrt{11}) \\ H_5^{(0)} &= \text{9b05688c} \quad (\sqrt{13}) \\ H_6^{(0)} &= \text{1f83d9ab} \quad (\sqrt{17}) \\ H_7^{(0)} &= \text{5be0cd19} \quad (\sqrt{19}) \end{aligned} $$

Setting the Round Constants \(K^{256}\)

These are the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers.

$$ \begin{aligned} K^{256}_{0} &= \text{428a2f98} \quad (\sqrt[3]{2}) \\ K^{256}_{1} &= \text{71374491} \quad (\sqrt[3]{3}) \\ K^{256}_{2} &= \text{b5c0fbcf} \quad (\sqrt[3]{5}) \\ K^{256}_{3} &= \text{e9b5dba5} \quad (\sqrt[3]{7}) \\ &\;\vdots \\ K^{256}_{63} &= \text{c67178f2} \quad (\sqrt[3]{311}) \end{aligned} $$

Helper Functions

As a prerequisite, we will be using the following helper functions in our hashing.

Choose Function \(\text{Ch}\)

For each bit position, if the corresponding bit of \(e\) is 1, the bit from \(f\) is selected. If it's 0, the bit from \(g\) is selected. So \(e\) acts as a selector between \(f\) and \(g\).

$$ \text{Ch}(e, f, g) = (e \wedge f) \oplus (\neg e \wedge g) $$

Majority Function \(\text{Maj}\)

For each bit position, output whichever value (0 or 1) appears in at least two of the three inputs. It's a bitwise vote.

$$ \text{Maj}(a, b, c) = (a \wedge b) \oplus (a \wedge c) \oplus (b \wedge c) $$

Big Sigma Functions \(\Sigma\)

Used in the compression loop. They rotate a word by three different amounts and XOR the results together, spreading each bit's influence across the entire word.

$$ \begin{aligned} \Sigma_0(a) &= \text{ROTR}^{2}(a) \oplus \text{ROTR}^{13}(a) \oplus \text{ROTR}^{22}(a) \\[6pt] \Sigma_1(e) &= \text{ROTR}^{6}(e) \oplus \text{ROTR}^{11}(e) \oplus \text{ROTR}^{25}(e) \end{aligned} $$

Small Sigma Functions \(\sigma\)

Used in the message schedule to expand 16 input words into 64. They combine rotations and shifts of a word, then XOR the results together.

$$ \begin{aligned} \sigma_0(x) &= \text{ROTR}^{7}(x) \oplus \text{ROTR}^{18}(x) \oplus \text{SHR}^{3}(x) \\[6pt] \sigma_1(x) &= \text{ROTR}^{17}(x) \oplus \text{ROTR}^{19}(x) \oplus \text{SHR}^{10}(x) \end{aligned} $$

Hashing

All additions in SHA-256 are performed modulo \(2^{32}\) (i.e., wrap around at 32 bits).

Hashing consists of 4 steps.

Message Schedule \(W_t\)

The message schedule takes the 16 words from the current block and mixes them into 64 words. This ensures that every bit of the original input influences many rounds of the compression function, not just one.

For each block \(M^{(i)}\), where \(i \in \{1, \ldots, N\}\), expand the 16 input words \(M^{(i)}_t\) where \(t \in \{0, \ldots, 15\}\) into 64 words \(W_t\) where \(t \in \{0, \ldots, 63\}\):

$$ W_t = \begin{cases} M^{(i)}_t & 0 \le t \le 15 \\[12pt] \sigma_1(W_{t-2}) + W_{t-7} + \sigma_0(W_{t-15}) + W_{t-16} & 16 \le t \le 63 \end{cases} $$

Compression Function

The compression function is the core of SHA-256. It takes the current hash state and the 64 scheduled words and repeatedly mixes them over 64 rounds. Each round combines the working variables with a scheduled word and a round constant, creating a complex dependency chain that makes the output unpredictable from the input.

Initialize eight working variables from the previous block's hash output:

$$ \begin{aligned} a &\leftarrow H_0^{(i-1)} \\ b &\leftarrow H_1^{(i-1)} \\ c &\leftarrow H_2^{(i-1)} \\ d &\leftarrow H_3^{(i-1)} \\ e &\leftarrow H_4^{(i-1)} \\ f &\leftarrow H_5^{(i-1)} \\ g &\leftarrow H_6^{(i-1)} \\ h &\leftarrow H_7^{(i-1)} \end{aligned} $$

Then for each round \(t = 0\) to \(63\):

$$ \begin{gathered} T_1 = h + \Sigma_1(e) + \text{Ch}(e, f, g) + K^{256}_t + W_t \\[6pt] T_2 = \Sigma_0(a) + \text{Maj}(a, b, c) \\[14pt] h \leftarrow g \\[4pt] g \leftarrow f \\[4pt] f \leftarrow e \\[4pt] e \leftarrow d + T_1 \\[4pt] d \leftarrow c \\[4pt] c \leftarrow b \\[4pt] b \leftarrow a \\[4pt] a \leftarrow T_1 + T_2 \end{gathered} $$

Intermediate Hash Update

After the compression function finishes, its output is added back to the previous hash value. This is the Davies–Meyer construction. It ensures the function is non-invertible, meaning even if an attacker knows the output, they cannot recover the input. Without this step, the compression function could simply be run in reverse.

$$ \begin{aligned} H_0^{(i)} &= H_0^{(i-1)} + a \\[4pt] H_1^{(i)} &= H_1^{(i-1)} + b \\[4pt] H_2^{(i)} &= H_2^{(i-1)} + c \\[4pt] H_3^{(i)} &= H_3^{(i-1)} + d \\[4pt] H_4^{(i)} &= H_4^{(i-1)} + e \\[4pt] H_5^{(i)} &= H_5^{(i-1)} + f \\[4pt] H_6^{(i)} &= H_6^{(i-1)} + g \\[4pt] H_7^{(i)} &= H_7^{(i-1)} + h \end{aligned} $$

Final Hash

After processing all \(N\) blocks, concatenate the eight 32-bit words:

$$ H = H_0^{(N)} \;\|\; H_1^{(N)} \;\|\; H_2^{(N)} \;\|\; H_3^{(N)} \;\|\; H_4^{(N)} \;\|\; H_5^{(N)} \;\|\; H_6^{(N)} \;\|\; H_7^{(N)} $$

This produces the final 256-bit (32-byte) hash digest.

Comments