Hill Cipher and a Variant

Hill Cipher

Author: Joyce_BY, all rights reserved.
Contact by email: [email protected]

What’s Hill Cipher?

The Hill cipher is a polygraphic substitution cipher based on linear algebra.
Its encryption function is described as follows:

Ek(m) = [m] * [K] = [c]

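Where [m] and [c] are 1 * n row vectors (one block of plaintext and ciphertext letters), [K] is the n * n key matrix, and all arithmetic is done modulo the alphabet size (26 for English letters).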
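To make the formula concrete, here is a minimal sketch of the encryption step. The function name `hill_encrypt`, the 2 * 2 key, and the sample word are illustrative assumptions, not taken from the code later in this post.

```python
def hill_encrypt(plaintext, key, n):
    """Encrypt lowercase text with an n x n key (list of rows) over Z26."""
    nums = [ord(ch) - ord('a') for ch in plaintext]
    if len(nums) % n != 0:
        raise ValueError("plaintext length must be a multiple of n")
    out = []
    for start in range(0, len(nums), n):
        block = nums[start:start + n]  # the row vector [m]
        for col in range(n):
            # [c][col] = sum over i of [m][i] * K[i][col], reduced mod 26
            total = sum(block[i] * key[i][col] for i in range(n))
            out.append(total % 26)
    return ''.join(chr(v + ord('A')) for v in out)


# Hypothetical 2 x 2 key: det = 3*5 - 3*2 = 9 and gcd(9, 26) = 1.
KEY = [[3, 3],
       [2, 5]]
print(hill_encrypt("help", KEY, 2))  # prints DPLE
```

Each ciphertext letter depends on every letter of its plaintext block, which is what flattens the single-letter frequencies.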

What’s good about Hill Cipher?

The idea behind the Hill cipher is that classical ciphers, such as the Vigenère cipher, can be broken through frequency analysis, so we should make the letters of the ciphertext as close to uniformly distributed as possible. The Hill cipher achieves this by applying a matrix transformation to blocks of the plaintext.

What’s wrong with Hill Cipher?

Owing to its linearity, the Hill cipher is not secure.
It is very easy to get the decryption function as follows:

Dk[c] = [c] * [K]^-1
Notice that the key matrix must be invertible, that is, det(K) != 0.
In particular, over the ring Z26 a nonzero determinant is not enough: we need gcd(det(K), 26) = 1.
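The invertibility condition can be checked in a few lines. This is a sketch for a 3 * 3 key stored as a list of rows; the helper names `det3_mod26` and `is_valid_key` are illustrative assumptions, and the first sample key is the one printed in the answer at the end of this post.

```python
from math import gcd


def det3_mod26(k):
    """Determinant of a 3 x 3 matrix (list of rows), reduced mod 26."""
    (a, b, c), (d, e, f), (g, h, i) = k
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 26


def is_valid_key(k):
    """A key is invertible over Z26 only if gcd(det(K), 26) == 1."""
    return gcd(det3_mod26(k), 26) == 1


K = [[3, 6, 4], [5, 15, 18], [17, 8, 5]]
print(det3_mod26(K), is_valid_key(K))      # prints 21 True

# det = 2 is nonzero, but 2 shares a factor with 26, so no inverse exists.
BAD = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
print(det3_mod26(BAD), is_valid_key(BAD))  # prints 2 False
```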

Therefore, if we know the key matrix, we only need to compute its inverse and perform one matrix multiplication to recover the plaintext from the ciphertext.

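In practice, an attacker usually wants to recover the key from known plaintext and ciphertext. Because the Hill cipher is linear, this takes little effort: n plaintext blocks stacked as the rows of an invertible n * n matrix [m], together with the matching ciphertext rows [c], determine the key. So the Hill cipher falls to a known-plaintext attack, and therefore also to a chosen-plaintext attack (CPA).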

[K] = [m]^-1 * [c]
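As a small worked instance of this formula, the sketch below recovers a 2 * 2 key from two known blocks. Everything here is an illustrative assumption: the helper names, the plaintext rows (the word "help"), and the ciphertext rows ("DPLE"), which were produced with the hypothetical key [[3, 3], [2, 5]].

```python
def inv2_mod26(m):
    """Inverse of a 2 x 2 matrix over Z26, using the adjugate formula."""
    (a, b), (c, d) = m
    det = (a * d - b * c) % 26
    # Brute-force search for det^-1; an empty result means no inverse exists.
    candidates = [x for x in range(26) if (det * x) % 26 == 1]
    if not candidates:
        raise ValueError("plaintext matrix is not invertible mod 26")
    det_inv = candidates[0]
    return [[(d * det_inv) % 26, (-b * det_inv) % 26],
            [(-c * det_inv) % 26, (a * det_inv) % 26]]


def mat2_mul_mod26(x, y):
    """Product of two 2 x 2 matrices over Z26."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) % 26 for j in range(2)]
            for i in range(2)]


M = [[7, 4], [11, 15]]   # "he" and "lp" as plaintext rows
C = [[3, 15], [11, 4]]   # "DP" and "LE" as ciphertext rows
print(mat2_mul_mod26(inv2_mod26(M), C))  # prints [[3, 3], [2, 5]]
```

If the stacked plaintext matrix happens not to be invertible mod 26, the attacker simply needs a different set of known blocks.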

A Variant from Hill Cipher

Now we use the following encryption function

Ek[m] = [m] * [K] + [b]

Where [m] and [b] are 1 * n vectors, [K] is n * n

In fact, this is just the Hill cipher followed by a shift, sometimes called an affine Hill cipher.

CPA on this variant

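If we have enough plaintext and matching ciphertext, we can easily solve for the key (K, b). The derivation below uses two n * n blocks of each, ([M1], [C1]) and ([M2], [C2]), with one n-letter block per row. [B] is the n * n matrix whose every row is [b], and [M1] - [M2] must be invertible: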

[C] = [M][K] + [B]
[C1] * [K]^-1 = [M1] + [B] * [K]^-1
[C2] * [K]^-1 = [M2] + [B] * [K]^-1
([C1]-[C2]) * [K]^-1 = [M1] - [M2]
THEREFORE [K] = ([M1] - [M2])^-1 * ([C1]-[C2])
THEN [b] = [C1].row0 - [M1][K].row0

Code

platform: windows 10
environment: python 3.7.0

Warning

The code is written specifically for a 3*3 key matrix.
It operates on the ring Z26.
A specific sample is hard-coded. To decrypt other texts you can replace it, for example by reading the plaintext and ciphertext from files.
The matrix calculations are written out by brute force, entry by entry. If you want to use them in a more general setting, you should optimize them.
Remember, the Hill cipher is completely insecure, so we should avoid using it.

Here comes the code:

Skeleton:

def hack_hill_key(plaintext, ciphertext):
    # Split each 18-letter text into two 3*3 matrices (flat, row-major lists):
    pmat1, pmat2 = make_matrix(plaintext)
    cmat1, cmat2 = make_matrix(ciphertext)
    # Subtracting the two blocks cancels [b]: (C1 - C2) = (M1 - M2) * K
    p_diff = []
    c_diff = []
    for i in range(len(pmat1)):
        p_diff.append((pmat1[i] - pmat2[i]) % 26)
        c_diff.append((cmat1[i] - cmat2[i]) % 26)
    # K = (M1 - M2)^-1 * (C1 - C2)
    p_diff_inv = invert_matrix_26(p_diff)
    key_mat = multi_mat(p_diff_inv, c_diff)
    print('The key matrix L is:\n', key_mat)
    # b = first row of C1 minus first row of M1 * K
    temp = multi_mat(pmat1, key_mat)[:3]
    b_mat = [(cmat1[i] - temp[i]) % 26 for i in range(3)]
    print('The key vector b is: ', b_mat)

if __name__ == '__main__': 
    plaintext = 'adisplayedequation'
    ciphertext = 'DSRMSIOPLXLJBZULLM'
    hack_hill_key(plaintext,ciphertext)

Matrix operation functions:

def make_matrix(text): 
    num = []
    for ch in text: 
        # 0-25 for a-z:
        if ord(ch) > 96:
            num.append(ord(ch)-97) 
        else: 
            num.append(ord(ch)-65)
    mat1 = num[:9]
    mat2 = num[9:]
    return mat1, mat2

def invert_matrix_26(k): 
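    # Calculate det(k) mod 26 by cofactor expansion along the first row:
    det = (k[0]*(k[4]*k[8]-k[5]*k[7]) - k[1]*(k[3]*k[8]-k[5]*k[6]) + k[2]*(k[3]*k[7]-k[4]*k[6])) % 26
    # Look up det^-1 in Z26 (only residues coprime to 26 have an inverse):
    dp = {1:1,3:9,5:21,7:15,9:3,11:19,15:7,17:23,19:11,21:5,23:17,25:25}
    if det not in dp:
        raise ValueError('matrix is not invertible mod 26 (det = %d)' % det)
    det_inverse = dp[det]
    # Build the adjugate entry by entry and scale it by det^-1: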
    k_inverse = []
    k_inverse.append((k[4]*k[8] - k[5]*k[7]) * det_inverse % 26) 
    k_inverse.append((k[2]*k[7] - k[1]*k[8]) * det_inverse % 26) 
    k_inverse.append((k[1]*k[5] - k[2]*k[4]) * det_inverse % 26) 
    k_inverse.append((k[5]*k[6] - k[3]*k[8]) * det_inverse % 26) 
    k_inverse.append((k[0]*k[8] - k[2]*k[6]) * det_inverse % 26) 
    k_inverse.append((k[2]*k[3] - k[0]*k[5]) * det_inverse % 26) 
    k_inverse.append((k[3]*k[7] - k[4]*k[6]) * det_inverse % 26) 
    k_inverse.append((k[1]*k[6] - k[0]*k[7]) * det_inverse % 26) 
    k_inverse.append((k[0]*k[4] - k[1]*k[3]) * det_inverse % 26) 
    return k_inverse
    
def multi_mat(a,b): 
    # matrix a * matrix b
    # given 2 lists, return a list:
    multi = []
    multi.append((a[0]*b[0] + a[1] * b[3] + a[2] * b[6]) % 26)
    multi.append((a[0]*b[1] + a[1] * b[4] + a[2] * b[7]) % 26)
    multi.append((a[0]*b[2] + a[1] * b[5] + a[2] * b[8]) % 26)
    multi.append((a[3]*b[0] + a[4] * b[3] + a[5] * b[6]) % 26)
    multi.append((a[3]*b[1] + a[4] * b[4] + a[5] * b[7]) % 26)
    multi.append((a[3]*b[2] + a[4] * b[5] + a[5] * b[8]) % 26)
    multi.append((a[6]*b[0] + a[7] * b[3] + a[8] * b[6]) % 26)
    multi.append((a[6]*b[1] + a[7] * b[4] + a[8] * b[7]) % 26)
    multi.append((a[6]*b[2] + a[7] * b[5] + a[8] * b[8]) % 26)
    return multi

Output of the code:

The key matrix L is: [3, 6, 4, 5, 15, 18, 17, 8, 5]
The key vector b is: [8, 13, 1]
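As a sanity check, the recovered key can be plugged back into the variant's encryption function to see whether it reproduces the sample ciphertext. The function `affine_hill_encrypt` below is a hypothetical helper written for this check, not part of the code above; it takes the key as a flat, row-major list, the same layout the code prints.

```python
def affine_hill_encrypt(plaintext, key, b):
    """Compute [m] * [K] + [b] over Z26 for each 3-letter block."""
    nums = [ord(ch) - ord('a') for ch in plaintext]
    out = []
    for start in range(0, len(nums), 3):
        block = nums[start:start + 3]
        for col in range(3):
            # key[3 * i + col] is the entry in row i, column col of K
            total = sum(block[i] * key[3 * i + col] for i in range(3)) + b[col]
            out.append(total % 26)
    return ''.join(chr(v + ord('A')) for v in out)


K = [3, 6, 4, 5, 15, 18, 17, 8, 5]
B = [8, 13, 1]
print(affine_hill_encrypt('adisplayedequation', K, B))  # prints DSRMSIOPLXLJBZULLM
```

The result matches the ciphertext used in the sample, so the recovered pair (K, b) is consistent with the known data.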