# MD5 Message-Digest Algorithm and Implementation

MD5 Algorithm Implementation

The main ideas behind my implementation come from Wikipedia, RFC 1321, and my instructor's slides on MD5.

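1. MD5 Overview

  • MD5 stands for Message-Digest Algorithm 5.

  • MD4 (1990) and MD5 (1992, RFC 1321) were both designed by Ron Rivest. MD5 is a
    widely used hash algorithm, often used to check the integrity and consistency
    of transmitted data.

  • MD5 is little-endian. It accepts input of arbitrary length, processes it in
    512-bit blocks, and produces four 32-bit words that are concatenated into a
    fixed 128-bit message digest.

  • The basic flow of MD5 is: pad the message so that its length is 448 modulo
    512, append the original length, run the round operations against the
    chaining variables, and output the result.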

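  • MD5 is not sufficiently secure

    • In 1996 Hans Dobbertin found a collision in the MD5 compression function: two different 512-bit blocks that produce the same output.

    • Full collisions, i.e. two different messages with the same MD5 hash, were published by Xiaoyun Wang et al. in 2004, so MD5 should not be relied on where collision resistance matters.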

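2. MD5 Algorithm (the steps follow RFC 1321)

2.1 Input

  • Suppose the input message is b bits long, written as:

    m_0 m_1 ... m_{b-1}

  • b may be any nonnegative integer; the length of the input message is not restricted.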

2.2 Step 1 : Decode and Append Padding Bits

The message is “padded” (extended) so that its length (in bits) is
congruent to 448, modulo 512. That is, the message is extended so
that it is just 64 bits shy of being a multiple of 512 bits long.
Padding is always performed, even if the length of the message is
already congruent to 448, modulo 512.
Padding is performed as follows: a single “1” bit is appended to the
message, and then “0” bits are appended so that the length in bits of
the padded message becomes congruent to 448, modulo 512. In all, at
least one bit and at most 512 bits are appended.

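  • In this step the input char array is converted into a bit string (stored in an int array in the implementation) and padded with a single 1 followed by as many 0s as needed, so that length % 512 == 448 in bits, i.e. length % 64 == 56 in bytes.

  • Padding is always performed, so this step appends at least one bit.

  • The RFC 1321 reference implementation does not handle these steps all at once. It first compresses every complete 512-bit block (Steps 3 and 4); only when fewer than 512 bits remain does it pad and append the length (Steps 1 and 2) and then compress again (Step 4). The code therefore does not run linearly as Step 1 -> Step 2 -> Step 3 -> Step 4, which makes it harder to follow.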
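The padding rule can be sketched as a small helper. This is a minimal sketch rather than the RFC's code: `paddingBytes` is a hypothetical name, and the message is assumed to be a whole number of bytes, so the single "1" bit plus seven "0" bits becomes the byte 0x80.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of RFC 1321): number of padding bytes that
// must follow a message of `messageBytes` bytes so that the padded length
// is congruent to 56 modulo 64 (448 modulo 512 in bits).
// The result is always between 1 and 64, because padding is mandatory.
std::size_t paddingBytes(std::uint64_t messageBytes) {
    const std::size_t index = static_cast<std::size_t>(messageBytes % 64);
    return (index < 56) ? (56 - index) : (120 - index);
}
```

A 56-byte message already satisfies length % 64 == 56, yet it still receives a full 64 bytes of padding, which matches the "at least one bit and at most 512 bits" rule quoted above.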

2.3 Step 2 : Append Length

​ A 64-bit representation of b (the length of the message before the
padding bits were added) is appended to the result of the previous
step. In the unlikely event that b is greater than 2^64, then only
the low-order 64 bits of b are used. (These bits are appended as two
32-bit words and appended low-order word first in accordance with the
previous conventions.)

​ At this point the resulting message (after padding with bits and with
b) has a length that is an exact multiple of 512 bits. Equivalently,
this message has a length that is an exact multiple of 16 (32-bit)
words. Let M[0 … N-1] denote the words of the resulting message,
where N is a multiple of 16.

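  • This step completes the final block: after the bit string padded in Step 1, append the length of the original message in bits as a 64-bit value, so that the total length is a multiple of 512 bits, i.e. length % 512 == 0.

  • Once the length is appended, the tail has the same 512-bit block size as the blocks split off earlier, so it can go through Steps 3 and 4 to finish the compression.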
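Steps 1 and 2 together can be sketched on a whole in-memory message. This is an illustration under assumptions, not the streaming approach of the RFC reference code: `padMessage` is a hypothetical helper, and the input is assumed to be byte-aligned.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns the message followed by the 0x80 byte, the
// zero bytes, and the 64-bit bit length, as Steps 1 and 2 describe.
std::vector<std::uint8_t> padMessage(const std::string& message) {
    std::vector<std::uint8_t> out(message.begin(), message.end());
    const std::uint64_t bitLength =
        static_cast<std::uint64_t>(message.size()) * 8;

    out.push_back(0x80);               // the single "1" bit, then seven "0" bits
    while (out.size() % 64 != 56) {    // "0" bits until length % 512 == 448
        out.push_back(0x00);
    }
    for (int i = 0; i < 8; ++i) {      // 64-bit length, low-order byte first
        out.push_back(static_cast<std::uint8_t>((bitLength >> (8 * i)) & 0xff));
    }
    return out;
}
```

For the 3-byte message "abc" the result is exactly one 64-byte block whose byte 3 is 0x80 and whose byte 56 is 24, the length in bits.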

2.4 Step 3 : Initialize MD Buffer

A four-word buffer (A,B,C,D) is used to compute the message digest. Here each of A, B, C, D is a 32-bit register. These registers are initialized to the following values in hexadecimal, low-order bytes first):

​ word A: 01 23 45 67

​ word B: 89 ab cd ef

​ word C: fe dc ba 98

​ word D: 76 54 32 10

#define A 0x67452301
#define B 0xEFCDAB89
#define C 0x98BADCFE
#define D 0x10325476
  • These are the initial values of the four buffer words. Each update compresses a block and adds the result back into the buffer. See the code walkthrough in Section 3 for details.
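The phrase "low-order bytes first" is what links the byte listings in the RFC to the constants in the code. The sketch below checks this at compile time; `wordFromBytes` is a hypothetical helper introduced only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: assemble four bytes, given low-order byte first,
// into one 32-bit word (the same convention the decode() function uses).
constexpr std::uint32_t wordFromBytes(std::uint8_t b0, std::uint8_t b1,
                                      std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}

// The byte listings from RFC 1321 turn into the constants used in the code.
static_assert(wordFromBytes(0x01, 0x23, 0x45, 0x67) == 0x67452301u, "word A");
static_assert(wordFromBytes(0x89, 0xab, 0xcd, 0xef) == 0xEFCDAB89u, "word B");
static_assert(wordFromBytes(0xfe, 0xdc, 0xba, 0x98) == 0x98BADCFEu, "word C");
static_assert(wordFromBytes(0x76, 0x54, 0x32, 0x10) == 0x10325476u, "word D");
```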

2.5 Step 4 : Process Message in 16-Word Blocks

We first define four auxiliary functions that each take as input three 32-bit words and produce as output one 32-bit word.

          F(X,Y,Z) = XY v not(X) Z
          G(X,Y,Z) = XZ v Y not(Z)
          H(X,Y,Z) = X xor Y xor Z
          I(X,Y,Z) = Y xor (X v not(Z))

​ In each bit position F acts as a conditional: if X then Y else Z.
The function F could have been defined using + instead of v since XY
and not(X)Z will never have 1’s in the same bit position.) It is
interesting to note that if the bits of X, Y, and Z are independent
and unbiased, the each bit of F(X,Y,Z) will be independent and
unbiased.

​ The functions G, H, and I are similar to the function F, in that they
act in “bitwise parallel” to produce their output from the bits of X,
Y, and Z, in such a manner that if the corresponding bits of X, Y,
and Z are independent and unbiased, then each bit of G(X,Y,Z),
H(X,Y,Z), and I(X,Y,Z) will be independent and unbiased. Note that
the function H is the bit-wise “xor” or “parity” function of its
inputs.
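The remarks about F can be checked with a few lines of code. This is a minimal sketch: the names `auxF`, `auxG`, `auxH`, `auxI`, and `auxFPlus` are hypothetical and chosen so they do not clash with the macros used later.

```cpp
#include <cstdint>

// The four auxiliary functions from RFC 1321 written as inline functions.
inline std::uint32_t auxF(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return (x & y) | (~x & z);   // per bit: if x then y else z
}
inline std::uint32_t auxG(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return (x & z) | (y & ~z);   // per bit: if z then x else y
}
inline std::uint32_t auxH(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return x ^ y ^ z;            // per bit: parity of the three inputs
}
inline std::uint32_t auxI(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return y ^ (x | ~z);
}

// F with + in place of |. The terms (x & y) and (~x & z) never have a 1 in
// the same bit position, so the addition produces no carries.
inline std::uint32_t auxFPlus(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return (x & y) + (~x & z);
}
```

With x = 0xFFFF0000, F takes the upper half of y and the lower half of z, so auxF(0xFFFF0000, 0x12345678, 0x9ABCDEF0) is 0x1234DEF0, and auxFPlus gives the same value.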

​ This step uses a 64-element table T[1 … 64] constructed from the
sine function. Let T[i] denote the i-th element of the table, which
is equal to the integer part of 4294967296 times abs(sin(i)), where i
is in radians. The elements of the table are given in the appendix.

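  • This is the most important part of the algorithm: the main compression logic lives here. The T table and the left-rotation amounts are fixed constants (magic numbers) given by the RFC. The idea is simple: run four rounds, each using one of FF, GG, HH, or II, and in every operation combine the decoded 32-bit words with the given parameters (a T-table entry and a rotation amount), rotate, and assign.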
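The T table does not have to be taken on faith, since the RFC states how it is built from the sine function. The sketch below recomputes one entry; `sineConstant` is a hypothetical helper, and it assumes the platform's `double` and `std::sin` are accurate enough (IEEE-754 double precision is assumed here).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: T[i] for i in 1..64, computed as the integer part of
// 4294967296 * abs(sin(i)) with i in radians, as RFC 1321 describes.
std::uint32_t sineConstant(int i) {
    const double scaled = std::fabs(std::sin(static_cast<double>(i))) * 4294967296.0;
    return static_cast<std::uint32_t>(std::floor(scaled));
}
```

Under that assumption, sineConstant(1) should reproduce the first table entry 0xd76aa478 and sineConstant(64) the last one, 0xeb86d391. Hard-coding the table, as the RFC does, avoids any dependence on the floating-point library.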


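3. Code Walkthrough (Selected Parts)

3.1 Variables Set Up Before Starting

  • Note that MD5 works almost entirely at the bit level, so signedness is meaningless to it; the implementation therefore uses unsigned variables throughout. The unsigned int type is used only to hold the bits.

  • The variables follow RFC 1321, though some are hard to understand. Worth mentioning is count[2]: C90 has no 64-bit integer type, so the counter is an array of two 32-bit values, one holding the low 32 bits and the other the high 32 bits.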

    #define A 0x67452301
    #define B 0xEFCDAB89
    #define C 0x98BADCFE
    #define D 0x10325476

    // four register block a b c d
    unsigned int _reg[4];
    // the 64-bit length (in bits) of the original message, split into low 32 bits and high 32 bits
    unsigned int _count[2];
    // the 512-bit input buffer 
    unsigned char _buffer[64];
    // the 128-bit MD5 digest, available after final 
    unsigned char _digest[16];
    // the padding appended to the tail of the message (0x80 followed by zeros, up to 512 bits)
    const unsigned char padding[64] = {0x80};
  • encode and decode convert between unsigned char and unsigned int, that is, between a byte stream and 32-bit words in little-endian order
    /**
     * @init some private variables 
    */
    void init() {
        _count[0] = _count[1] = 0;
        _reg[0] = A;
        _reg[1] = B;
        _reg[2] = C;
        _reg[3] = D;
        memset(_buffer,0,sizeof(unsigned char)*64);
    } 
    /**
     * @Convert each unsigned int into 4 chars, low-order byte first
     * @param {output} the char stream after conversion
     * @param {input} the input int array
     * @param {length} the number of output bytes (a multiple of 4)
    */
    void encode(unsigned char* output, const unsigned int* input, const unsigned int length) {
        for (int i = 0, j = 0; j < length; i++, j += 4) {
            output[j] = (unsigned char)(input[i] & 0xff);
            output[j + 1] = (unsigned char)((input[i] >> 8) & 0xff);
            output[j + 2] = (unsigned char)((input[i] >> 16) & 0xff);
            output[j + 3] = (unsigned char)((input[i] >> 24) & 0xff);
        }
    }
    /**
     * @Decodes input (unsigned char) into output (unsigned int). Assumes len is
     *   a multiple of 4.
     * @param {output} is the output unsigned int stream
     * @param {message} is the input message 
     * @param {length} is the length of message
    */
    void decode(unsigned int* output, const unsigned char* message, const unsigned int length) {
        for (int i = 0, j = 0; j < length; i++ , j += 4) {
            output[i] = ( (unsigned int)message[j]) | ( (unsigned int)message[j + 1] << 8)
                        | ( (unsigned int)message[j + 2] << 16) | ( (unsigned int)message[j + 3] << 24);  
        }
    }

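3.2 The Core Compression Function and Per-Block Compression

  • As noted earlier, the RFC 1321 implementation fills a block and compresses it first, then pads, appends the length, and compresses again, so the order differs from the description. We therefore look at the core compression code first.
  • Some important macros follow. For fear of mistakes I wrote almost all of them exactly as in RFC 1321. The one part I did not copy I got wrong, because I had misunderstood function-like macros, and it cost a long debugging session to find.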
#define FUNC_F(x,y,z) (((x) & (y)) | ((~(x)) & (z)))
#define FUNC_G(x,y,z) (((x) & (z)) | ((y) & (~(z))))
#define FUNC_H(x,y,z) ((x) ^ (y) ^ (z) )
#define FUNC_I(x,y,z) ((y) ^ ((x) | (~(z)) ) )
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

#define FF(a, b, c, d, x, s, ac) { \
    (a) += FUNC_F ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define GG(a, b, c, d, x, s, ac) { \
    (a) += FUNC_G ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define HH(a, b, c, d, x, s, ac) { \
    (a) += FUNC_H ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define II(a, b, c, d, x, s, ac) { \
    (a) += FUNC_I ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}

#define S11 7
#define S12 12
#define S13 17
#define S14 22
#define S21 5
#define S22 9
#define S23 14
#define S24 20
#define S31 4
#define S32 11
#define S33 16
#define S34 23
#define S41 6
#define S42 10
#define S43 15
#define S44 21
    /**
     * @H_md5 four round digest algorithm.
     * @param {input} is the 512 bits after padding
    */
    void transform(const unsigned char input[64]) {
        unsigned int a = _reg[0], b = _reg[1], c = _reg[2], d = _reg[3], x[16];
        decode(x, input, 64);
        /* Round 1 */
        FF (a, b, c, d, x[ 0], S11, 0xd76aa478); /* 1 */
        FF (d, a, b, c, x[ 1], S12, 0xe8c7b756); /* 2 */
        FF (c, d, a, b, x[ 2], S13, 0x242070db); /* 3 */
        FF (b, c, d, a, x[ 3], S14, 0xc1bdceee); /* 4 */
        FF (a, b, c, d, x[ 4], S11, 0xf57c0faf); /* 5 */
        FF (d, a, b, c, x[ 5], S12, 0x4787c62a); /* 6 */
        FF (c, d, a, b, x[ 6], S13, 0xa8304613); /* 7 */
        FF (b, c, d, a, x[ 7], S14, 0xfd469501); /* 8 */
        FF (a, b, c, d, x[ 8], S11, 0x698098d8); /* 9 */
        FF (d, a, b, c, x[ 9], S12, 0x8b44f7af); /* 10 */
        FF (c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */
        FF (b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */
        FF (a, b, c, d, x[12], S11, 0x6b901122); /* 13 */
        FF (d, a, b, c, x[13], S12, 0xfd987193); /* 14 */
        FF (c, d, a, b, x[14], S13, 0xa679438e); /* 15 */
        FF (b, c, d, a, x[15], S14, 0x49b40821); /* 16 */

        /* Round 2 */
        GG (a, b, c, d, x[ 1], S21, 0xf61e2562); /* 17 */
        GG (d, a, b, c, x[ 6], S22, 0xc040b340); /* 18 */
        GG (c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */
        GG (b, c, d, a, x[ 0], S24, 0xe9b6c7aa); /* 20 */
        GG (a, b, c, d, x[ 5], S21, 0xd62f105d); /* 21 */
        GG (d, a, b, c, x[10], S22,  0x2441453); /* 22 */
        GG (c, d, a, b, x[15], S23, 0xd8a1e681); /* 23 */
        GG (b, c, d, a, x[ 4], S24, 0xe7d3fbc8); /* 24 */
        GG (a, b, c, d, x[ 9], S21, 0x21e1cde6); /* 25 */
        GG (d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */
        GG (c, d, a, b, x[ 3], S23, 0xf4d50d87); /* 27 */
        GG (b, c, d, a, x[ 8], S24, 0x455a14ed); /* 28 */
        GG (a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */
        GG (d, a, b, c, x[ 2], S22, 0xfcefa3f8); /* 30 */
        GG (c, d, a, b, x[ 7], S23, 0x676f02d9); /* 31 */
        GG (b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */

        /* Round 3 */
        HH (a, b, c, d, x[ 5], S31, 0xfffa3942); /* 33 */
        HH (d, a, b, c, x[ 8], S32, 0x8771f681); /* 34 */
        HH (c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */
        HH (b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */
        HH (a, b, c, d, x[ 1], S31, 0xa4beea44); /* 37 */
        HH (d, a, b, c, x[ 4], S32, 0x4bdecfa9); /* 38 */
        HH (c, d, a, b, x[ 7], S33, 0xf6bb4b60); /* 39 */
        HH (b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */
        HH (a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */
        HH (d, a, b, c, x[ 0], S32, 0xeaa127fa); /* 42 */
        HH (c, d, a, b, x[ 3], S33, 0xd4ef3085); /* 43 */
        HH (b, c, d, a, x[ 6], S34,  0x4881d05); /* 44 */
        HH (a, b, c, d, x[ 9], S31, 0xd9d4d039); /* 45 */
        HH (d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */
        HH (c, d, a, b, x[15], S33, 0x1fa27cf8); /* 47 */
        HH (b, c, d, a, x[ 2], S34, 0xc4ac5665); /* 48 */

        /* Round 4 */
        II (a, b, c, d, x[ 0], S41, 0xf4292244); /* 49 */
        II (d, a, b, c, x[ 7], S42, 0x432aff97); /* 50 */
        II (c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */
        II (b, c, d, a, x[ 5], S44, 0xfc93a039); /* 52 */
        II (a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */
        II (d, a, b, c, x[ 3], S42, 0x8f0ccc92); /* 54 */
        II (c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */
        II (b, c, d, a, x[ 1], S44, 0x85845dd1); /* 56 */
        II (a, b, c, d, x[ 8], S41, 0x6fa87e4f); /* 57 */
        II (d, a, b, c, x[15], S42, 0xfe2ce6e0); /* 58 */
        II (c, d, a, b, x[ 6], S43, 0xa3014314); /* 59 */
        II (b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */
        II (a, b, c, d, x[ 4], S41, 0xf7537e82); /* 61 */
        II (d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */
        II (c, d, a, b, x[ 2], S43, 0x2ad7d2bb); /* 63 */
        II (b, c, d, a, x[ 9], S44, 0xeb86d391); /* 64 */

        _reg[0] += a;
        _reg[1] += b;
        _reg[2] += c;
        _reg[3] += d;
    }
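Section 3.2 mentions a bug caused by a misunderstanding of function-like macros. The source does not say what the mistake was, so the sketch below only shows one common way such macros go wrong, missing parentheses around parameters, next to an inline function that avoids the problem. `BAD_ROTATE_LEFT` and `rotateLeft` are hypothetical names.

```cpp
#include <cstdint>

// A rotate macro WITHOUT parentheses around its parameters (deliberately
// wrong). The argument text is pasted in as-is, so operator precedence in
// the caller's expression changes the meaning.
#define BAD_ROTATE_LEFT(x, n) (x << n | x >> 32 - n)

// The same operation as an inline function: each argument is evaluated once,
// and the caller's expression cannot interfere with the body.
inline std::uint32_t rotateLeft(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));   // valid for 0 < n < 32
}
```

With the argument `0x1u | 0x2u`, the macro expands to `(0x1u | 0x2u << 4 | 0x1u | 0x2u >> 32 - 4)`, which is 0x21, while rotateLeft(0x1u | 0x2u, 4) gives the intended 0x30. The ROTATE_LEFT macro in the listing above parenthesizes every parameter and does not have this problem.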
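  • Note: Wikipedia gives another implementation. Because this code is full of magic numbers, they could be stored in arrays and driven by a for loop. I was worried that the T table and shift values on Wikipedia might differ from the RFC's, so to be safe I used the RFC version even though it is more verbose. The Wikipedia version is given below:

  • Constant tables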

    //r specifies the per-round shift amounts     
    const int r[64] = {
            // r[0:15]
            7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,
            // r[16:31]
            5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,
            // r[32:47]
            4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,
            // r[48:63]
            6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21
    };

    //  T table 
    const unsigned int T[64] = { 0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
                                 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
                                 0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
                                 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
                                 0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
                                 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
                                 0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
                                 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
                                 0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
                                 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
                                 0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
                                 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
                                 0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
                                 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
                                 0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
                                 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
    };
  • Pseudocode from Wikipedia (k[i] is the T table above, and h0..h3 are the four registers)
//Process the message in successive 512-bit chunks:
for each 512-bit chunk of message
    break chunk into sixteen 32-bit little-endian words w[i], 0 ≤ i ≤ 15

    //Initialize hash value for this chunk:
    var int a := h0
    var int b := h1
    var int c := h2
    var int d := h3

    //Main loop:
    for i from 0 to 63
        if 0 ≤ i ≤ 15 then
            f := (b and c) or ((not b) and d)
            g := i
        else if 16 ≤ i ≤ 31
            f := (d and b) or ((not d) and c)
            g := (5×i + 1) mod 16
        else if 32 ≤ i ≤ 47
            f := b xor c xor d
            g := (3×i + 5) mod 16
        else if 48 ≤ i ≤ 63
            f := c xor (b or (not d))
            g := (7×i) mod 16

        temp := d
        d := c
        c := b
        b := leftrotate((a + f + k[i] + w[g]),r[i]) + b
        a := temp
    Next i
    //Add this chunk's hash to result so far:
    h0 := h0 + a
    h1 := h1 + b 
    h2 := h2 + c
    h3 := h3 + d
End ForEach
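The pseudocode can be turned into a compact table-driven version. The following is a sketch under assumptions, not the author's implementation: the names `kShift`, `kT`, `rotl`, `compress`, and `md5Hex` are hypothetical, the whole message is padded in memory instead of being streamed, and `unsigned int` is assumed to be 32 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Per-operation left-rotation amounts (the same values as S11..S44).
const unsigned kShift[64] = {
    7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
    5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
    4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
    6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21};

// The T table from RFC 1321, indexed from 0 here.
const std::uint32_t kT[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391};

inline std::uint32_t rotl(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

// Compress one 64-byte block into the running state h[0..3].
void compress(std::uint32_t h[4], const std::uint8_t* block) {
    std::uint32_t w[16];
    for (int i = 0; i < 16; ++i) {   // sixteen little-endian 32-bit words
        w[i] = static_cast<std::uint32_t>(block[4 * i])
             | (static_cast<std::uint32_t>(block[4 * i + 1]) << 8)
             | (static_cast<std::uint32_t>(block[4 * i + 2]) << 16)
             | (static_cast<std::uint32_t>(block[4 * i + 3]) << 24);
    }
    std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3];
    for (int i = 0; i < 64; ++i) {
        std::uint32_t f;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        const std::uint32_t temp = d;
        d = c;
        c = b;
        b = b + rotl(a + f + kT[i] + w[g], kShift[i]);
        a = temp;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d;
}

// Pad the whole message in memory, compress block by block, return hex.
std::string md5Hex(const std::string& message) {
    std::vector<std::uint8_t> data(message.begin(), message.end());
    const std::uint64_t bits = static_cast<std::uint64_t>(message.size()) * 8;
    data.push_back(0x80);
    while (data.size() % 64 != 56) data.push_back(0x00);
    for (int i = 0; i < 8; ++i) data.push_back(static_cast<std::uint8_t>((bits >> (8 * i)) & 0xff));

    std::uint32_t h[4] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u};
    for (std::size_t off = 0; off < data.size(); off += 64) compress(h, &data[off]);

    char hex[33];
    for (int i = 0; i < 16; ++i) {   // each word is written low-order byte first
        std::snprintf(hex + 2 * i, 3, "%02x", static_cast<unsigned>((h[i / 4] >> (8 * (i % 4))) & 0xff));
    }
    return std::string(hex, 32);
}
```

The index formulas for g pick the same message words as the unrolled RFC rounds (for example, operation 17 uses x[1] and (5×16 + 1) mod 16 = 1), so the loop and the macro version should agree. A quick check against the RFC 1321 test suite: md5Hex("abc") is expected to be 900150983cd24fb0d6963f7d28e17f72.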

3.3 The update and final Functions (Steps 1 and 2)

  • The consequences of the ordering in RFC 1321 have been explained several times already, so I will not repeat them. This part performs the padding and length-append operations described earlier and calls the compression function as needed.
    /**
     * @Absorb input: buffer the message and compress every complete 64-byte block
     * @param {message} the input message
     * @param {length} the length (in bytes) of the input message
    */
    void update(const unsigned char* message, unsigned int length) {
        // Compute the number of bytes already buffered (byte count mod 64)
        unsigned int index = (_count[0] / 8)% 64;
        if ((_count[0] += (length << 3)) < (length << 3)) _count[1]++;
        // _count[1] takes the high 32 bits of the bit length: (length << 3) >> 32 == length >> 29
        _count[1] += (length >> 29);

        // Number of bytes still needed to fill the 64-byte buffer
        unsigned int startPos = 64 - index;
        unsigned int i;

        if (length >= startPos) {
            memcpy(&_buffer[index] , message , startPos);
            transform(_buffer);
            for (i = startPos ; i + 64 <= length ; i += 64) transform(message + i);
            index = 0;
        } else {
            i = 0;
        }
        memcpy(&_buffer[index] , message + i , length - i);
    }
    /**
     * @MD5 finalization. Ends an MD5 message-digest operation by padding,
     * appending the length, and writing the message digest to _digest.
    */
    void final( ) {
        unsigned char bits[8];
        unsigned int index, padLen;

        // Save number of bits
        encode(bits, _count, 8);

        // Pad out to 56 mod 64.
        index = (unsigned int)((_count[0] >> 3) & 0x3f);
        padLen = (index < 56) ? (56 - index) : (120 - index);

        update(padding, padLen);
        // Append length (before padding)
        update(bits, 8);

        // Store state in digest
        encode(_digest, _reg, 16);
    }
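The two lines in update() that maintain `_count` are easier to follow next to a plain 64-bit counter. The sketch below is an illustration only: `addLength`, `addLength64`, and `countersAgree` are hypothetical helpers, and `std::uint64_t` is used for comparison even though the original code avoids 64-bit types.

```cpp
#include <cstdint>

// Mirrors the bookkeeping in update(): add `lengthBytes` bytes to a bit
// counter kept as two 32-bit words (count[0] low, count[1] high).
void addLength(std::uint32_t count[2], std::uint32_t lengthBytes) {
    const std::uint32_t lowBits = lengthBytes << 3;  // bytes -> bits, low 32 bits only
    count[0] += lowBits;
    if (count[0] < lowBits) {                        // unsigned wrap-around means a carry
        count[1]++;
    }
    count[1] += lengthBytes >> 29;                   // the 3 bits that "<< 3" pushed out
}

// The same update with a single 64-bit counter.
void addLength64(std::uint64_t& count, std::uint32_t lengthBytes) {
    count += static_cast<std::uint64_t>(lengthBytes) << 3;
}

// Feed two lengths into both counters and report whether they still agree.
bool countersAgree(std::uint32_t firstBytes, std::uint32_t secondBytes) {
    std::uint32_t split[2] = {0, 0};
    std::uint64_t whole = 0;
    addLength(split, firstBytes);
    addLength64(whole, firstBytes);
    addLength(split, secondBytes);
    addLength64(whole, secondBytes);
    return ((static_cast<std::uint64_t>(split[1]) << 32) | split[0]) == whole;
}
```

The shift by 29 is simply "shift left by 3, then right by 32" applied to a 32-bit value: the top three bits of the byte length are exactly the part of the bit length that no longer fits in the low word.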

4. MD5 Results

  • Reference results (from RFC 1321)

[Image: RFC 1321 test suite output]

  • My test results

[Image: screenshot WX20171020-102107@2x of my test output]

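5. Problems Encountered and Summary

Problems encountered

MD5 itself is fairly easy to understand, as long as you do not ask why the tables are set up the way they are or why the message is padded to 448 bits. The RFC 1321 reference code and the Wikipedia version are another matter, because the implementation is hard to read. For example, a plain "mod 64" is written as (count >> 3) & 0x3f. I also puzzled over count[1] += length >> 29 for a long time and could not see why the shift was by 29. After asking my instructor after class, I learned that the byte length is first shifted left by 3 (bytes to bits) and then right by 32 (to take the high word), which combines into a right shift by 29. Much of the implementation is hard to follow in this way, and RFC 1321 explains little in its comments. In addition, the order of operations in the C code differs from the order in the algorithm description, so after two days of reading I was still confused. It only clicked when the instructor explained it again in class. He said they too had to study this implementation before they understood it, so this part really is difficult.

In my own implementation, apart from understanding the algorithm, the biggest problems were the type conversions and the function-like macros. The final bug came from a macro I had written incorrectly, and it took a long debugging session to find.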

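Summary

Implementing MD5 or DES from the slides, the RFC, or Wikipedia is not hard (of the two I found MD5 harder, because almost everything happens at the bit level), but much of the reasoning behind them I have only swallowed whole; fully understanding them would take more than a few days or weeks. The RFC 1321 implementation is also very hard to read, as the examples above show. With C11 a cleaner implementation is entirely possible, but to be safe I followed RFC 1321 and did not even adopt the Wikipedia version. If I get the chance, I will try reimplementing it along the lines of the Wikipedia pseudocode.

I also looked at some MD5 implementations on GitHub and found their comments very well written, so this time I wrote my comments in the format we used in our second-year practical training. Code written this way is much easier to understand and clearer at a glance. I also understand function-like macros better now, which counts as a language-level gain.

I had already used MD5 once in a web course, to hash user passwords at login. It seemed almost magical at the time, so getting to implement it myself was very rewarding.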

6. Appendix: Source Code

  • md5.hpp
#ifndef _MD5_HPP_
#define _MD5_HPP_

#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
using namespace std;

#define FUNC_F(x,y,z) (((x) & (y)) | ((~(x)) & (z)))
#define FUNC_G(x,y,z) (((x) & (z)) | ((y) & (~(z))))
#define FUNC_H(x,y,z) ((x) ^ (y) ^ (z) )
#define FUNC_I(x,y,z) ((y) ^ ((x) | (~(z)) ) )
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

#define FF(a, b, c, d, x, s, ac) { \
    (a) += FUNC_F ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define GG(a, b, c, d, x, s, ac) { \
    (a) += FUNC_G ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define HH(a, b, c, d, x, s, ac) { \
    (a) += FUNC_H ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define II(a, b, c, d, x, s, ac) { \
    (a) += FUNC_I ((b), (c), (d)) + (x) + (unsigned int)(ac); \
    (a) = ROTATE_LEFT ((a), (s)); \
    (a) += (b); \
}
#define A 0x67452301
#define B 0xEFCDAB89
#define C 0x98BADCFE
#define D 0x10325476

#define S11 7
#define S12 12
#define S13 17
#define S14 22
#define S21 5
#define S22 9
#define S23 14
#define S24 20
#define S31 4
#define S32 11
#define S33 16
#define S34 23
#define S41 6
#define S42 10
#define S43 15
#define S44 21


class MD5 {
private:
    // four register block a b c d
    unsigned int _reg[4];
    // the 64 bits of origin message. Separate to high-32 bits and low-32 bits
    unsigned int _count[2];
    // the 512 bits input buffer 
    unsigned char _buffer[64];
    // the 128 bits MD5 message after final 
    unsigned char _digest[16];
    //r specifies the per-round shift amounts     
    const int r[64] = {
            // r[0:15]
            7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,
            // r[16:31]
            5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,
            // r[32:47]
            4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,
            // r[48:63]
            6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21
    };

    //  T table 
    const unsigned int T[64] = { 0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
                                 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
                                 0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
                                 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
                                 0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
                                 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
                                 0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
                                 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
                                 0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
                                 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
                                 0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
                                 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
                                 0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
                                 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
                                 0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
                                 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
    };

    // the 512 bits to pad into the tail of message 
    const unsigned char padding[64] = {0x80};

    /**
     * @init some private variables 
    */
    void init() {
        _count[0] = _count[1] = 0;
        _reg[0] = A;
        _reg[1] = B;
        _reg[2] = C;
        _reg[3] = D;
        memset(_buffer,0,sizeof(unsigned char)*64);
    }
    /**
     * @Convert each unsigned int into 4 chars, low-order byte first
     * @param {output} the char stream after conversion
     * @param {input} the input int array
     * @param {length} the number of output bytes (a multiple of 4)
    */
    void encode(unsigned char* output, const unsigned int* input, const unsigned int length) {
        for (int i = 0, j = 0; j < length; i++, j += 4) {
            output[j] = (unsigned char)(input[i] & 0xff);
            output[j + 1] = (unsigned char)((input[i] >> 8) & 0xff);
            output[j + 2] = (unsigned char)((input[i] >> 16) & 0xff);
            output[j + 3] = (unsigned char)((input[i] >> 24) & 0xff);
        }
    }
    /**
     * @Decodes input (unsigned char) into output (unsigned int). Assumes len is
     *   a multiple of 4.
     * @param {output} is the output unsigned int stream
     * @param {message} is the input message 
     * @param {length} is the length of message
    */
    void decode(unsigned int* output, const unsigned char* message, const unsigned int length) {
        for (int i = 0, j = 0; j < length; i++ , j += 4) {
            output[i] = ( (unsigned int)message[j]) | ( (unsigned int)message[j + 1] << 8)
                        | ( (unsigned int)message[j + 2] << 16) | ( (unsigned int)message[j + 3] << 24);  
        }
    }
    /**
     * @Absorb input: buffer the message and compress every complete 64-byte block
     * @param {message} the input message
     * @param {length} the length (in bytes) of the input message
    */
    void update(const unsigned char* message, unsigned int length) {
        // Compute the number of bytes already buffered (byte count mod 64)
        unsigned int index = (_count[0] / 8)% 64;
        if ((_count[0] += (length << 3)) < (length << 3)) _count[1]++;
        // _count[1] takes the high 32 bits of the bit length: (length << 3) >> 32 == length >> 29
        _count[1] += (length >> 29);

        // Number of bytes still needed to fill the 64-byte buffer
        unsigned int startPos = 64 - index;
        unsigned int i;

        if (length >= startPos) {
            memcpy(&_buffer[index] , message , startPos);
            transform(_buffer);
            for (i = startPos ; i + 64 <= length ; i += 64) transform(message + i);
            index = 0;
        } else {
            i = 0;
        }
        memcpy(&_buffer[index] , message + i , length - i);
    }
    /**
     * @MD5 finalization. Ends an MD5 message-digest operation by padding,
     * appending the length, and writing the message digest to _digest.
    */
    void final( ) {
        unsigned char bits[8];
        unsigned int index, padLen;

        // Save number of bits
        encode(bits, _count, 8);

        // Pad out to 56 mod 64.
        index = (unsigned int)((_count[0] >> 3) & 0x3f);
        padLen = (index < 56) ? (56 - index) : (120 - index);

        update(padding, padLen);
        // Append length (before padding)
        update(bits, 8);

        // Store state in digest
        encode(_digest, _reg, 16);
    }
    /**
     * @H_md5 four round digest algorithm.
     * @param {input} is the 512 bits after padding
    */
    void transform(const unsigned char input[64]) {
        unsigned int a = _reg[0], b = _reg[1], c = _reg[2], d = _reg[3], x[16];
        decode(x, input, 64);
        /* Round 1 */
        FF (a, b, c, d, x[ 0], S11, 0xd76aa478); /* 1 */
        FF (d, a, b, c, x[ 1], S12, 0xe8c7b756); /* 2 */
        FF (c, d, a, b, x[ 2], S13, 0x242070db); /* 3 */
        FF (b, c, d, a, x[ 3], S14, 0xc1bdceee); /* 4 */
        FF (a, b, c, d, x[ 4], S11, 0xf57c0faf); /* 5 */
        FF (d, a, b, c, x[ 5], S12, 0x4787c62a); /* 6 */
        FF (c, d, a, b, x[ 6], S13, 0xa8304613); /* 7 */
        FF (b, c, d, a, x[ 7], S14, 0xfd469501); /* 8 */
        FF (a, b, c, d, x[ 8], S11, 0x698098d8); /* 9 */
        FF (d, a, b, c, x[ 9], S12, 0x8b44f7af); /* 10 */
        FF (c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */
        FF (b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */
        FF (a, b, c, d, x[12], S11, 0x6b901122); /* 13 */
        FF (d, a, b, c, x[13], S12, 0xfd987193); /* 14 */
        FF (c, d, a, b, x[14], S13, 0xa679438e); /* 15 */
        FF (b, c, d, a, x[15], S14, 0x49b40821); /* 16 */

        /* Round 2 */
        GG (a, b, c, d, x[ 1], S21, 0xf61e2562); /* 17 */
        GG (d, a, b, c, x[ 6], S22, 0xc040b340); /* 18 */
        GG (c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */
        GG (b, c, d, a, x[ 0], S24, 0xe9b6c7aa); /* 20 */
        GG (a, b, c, d, x[ 5], S21, 0xd62f105d); /* 21 */
        GG (d, a, b, c, x[10], S22,  0x2441453); /* 22 */
        GG (c, d, a, b, x[15], S23, 0xd8a1e681); /* 23 */
        GG (b, c, d, a, x[ 4], S24, 0xe7d3fbc8); /* 24 */
        GG (a, b, c, d, x[ 9], S21, 0x21e1cde6); /* 25 */
        GG (d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */
        GG (c, d, a, b, x[ 3], S23, 0xf4d50d87); /* 27 */
        GG (b, c, d, a, x[ 8], S24, 0x455a14ed); /* 28 */
        GG (a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */
        GG (d, a, b, c, x[ 2], S22, 0xfcefa3f8); /* 30 */
        GG (c, d, a, b, x[ 7], S23, 0x676f02d9); /* 31 */
        GG (b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */

        /* Round 3 */
        HH (a, b, c, d, x[ 5], S31, 0xfffa3942); /* 33 */
        HH (d, a, b, c, x[ 8], S32, 0x8771f681); /* 34 */
        HH (c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */
        HH (b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */
        HH (a, b, c, d, x[ 1], S31, 0xa4beea44); /* 37 */
        HH (d, a, b, c, x[ 4], S32, 0x4bdecfa9); /* 38 */
        HH (c, d, a, b, x[ 7], S33, 0xf6bb4b60); /* 39 */
        HH (b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */
        HH (a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */
        HH (d, a, b, c, x[ 0], S32, 0xeaa127fa); /* 42 */
        HH (c, d, a, b, x[ 3], S33, 0xd4ef3085); /* 43 */
        HH (b, c, d, a, x[ 6], S34,  0x4881d05); /* 44 */
        HH (a, b, c, d, x[ 9], S31, 0xd9d4d039); /* 45 */
        HH (d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */
        HH (c, d, a, b, x[15], S33, 0x1fa27cf8); /* 47 */
        HH (b, c, d, a, x[ 2], S34, 0xc4ac5665); /* 48 */

        /* Round 4 */
        II (a, b, c, d, x[ 0], S41, 0xf4292244); /* 49 */
        II (d, a, b, c, x[ 7], S42, 0x432aff97); /* 50 */
        II (c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */
        II (b, c, d, a, x[ 5], S44, 0xfc93a039); /* 52 */
        II (a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */
        II (d, a, b, c, x[ 3], S42, 0x8f0ccc92); /* 54 */
        II (c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */
        II (b, c, d, a, x[ 1], S44, 0x85845dd1); /* 56 */
        II (a, b, c, d, x[ 8], S41, 0x6fa87e4f); /* 57 */
        II (d, a, b, c, x[15], S42, 0xfe2ce6e0); /* 58 */
        II (c, d, a, b, x[ 6], S43, 0xa3014314); /* 59 */
        II (b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */
        II (a, b, c, d, x[ 4], S41, 0xf7537e82); /* 61 */
        II (d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */
        II (c, d, a, b, x[ 2], S43, 0x2ad7d2bb); /* 63 */
        II (b, c, d, a, x[ 9], S44, 0xeb86d391); /* 64 */

        _reg[0] += a;
        _reg[1] += b;
        _reg[2] += c;
        _reg[3] += d;
    }

public:
    MD5() {
        init();
    }
    MD5(const string &s) {
        init();
        update((const unsigned char*)s.c_str(), s.length());
        final();
    }
    void generate(const string & s) {
        init();
        update((const unsigned char*)s.c_str(), s.length());
        final();
    }
    string toString() {
        char hexStr[33];
        for (int i = 0; i < 16; i++) {
            sprintf(hexStr + i*2, "%02x", _digest[i]);
        }
        hexStr[32] = '\0';
        return hexStr;
    }
};

#endif
  • md5.cpp
#include "md5.hpp"

int main() {
    MD5 m;
    string s;
    cout << "Please enter a message to hash: ";
    getline(std::cin, s);
    m.generate(s);
    cout << "The MD5 digest of the message is: " << m.toString() << endl;
}


Reposted from blog.csdn.net/qq_34035179/article/details/78317307