MD4 (Message Digest 4) is a hash function developed by University of Massachusetts professor Ronald Rivest in 1990 and first described in RFC 1186.

For an arbitrary input message, the function generates a 128-bit hash value called the message digest. This algorithm is used in the MS-CHAP authentication protocol, developed by Microsoft, to authenticate remote Windows workstations. A predecessor to MD5.

MD4 Algorithm

It is assumed that the input is a message consisting of b bits, the hash of which we have to calculate. Here b is an arbitrary non-negative integer. It can be zero, does not have to be a multiple of eight, and can be arbitrarily large.

Following are the 5 steps used to calculate the hash of a message.

Step 1. Adding missing bits.

The message is expanded so that its length in bits modulo 512 is 448. Thus, as a result of the expansion, the message lacks 64 bits to a length that is a multiple of 512 bits. Expansion is always performed even if the message initially has the desired length.

The expansion is done as follows: one bit equal to 1 is added to the message, and then bits equal to 0 are added until the length of the message is 448 modulo 512. In the end, at least 1 bit is added to the message, and as a maximum of 512

Step 2. Adding the length of the message.

The 64-bit representation of b is added to the result of the previous step. In the unlikely event that b is greater than 2 to the 64th power, only the least significant 64 bits are used. These bits are added as two 32-bit words, and the word containing the least significant bits is appended first.

At this stage (after adding the bits and the message length), we get a message that is a multiple of 512 bits. It is equivalent to the message being a multiple of 16 32-bit words. Let M [0 … N-1] denote an array of words of the resulting message (where N is a multiple of 16).

Step 3. Initializing the MD buffer.

To calculate the message hash buffer consisting of 4 words (A, B, C, D) (32-bit registers) is used. These registers are initialized with the following hexadecimal numbers (LSB first):

  • word A: 01 23 45 67;
  • word B: 89 ab cd ef;
  • word C: fe dc ba 98;
  • word D: 76 54 32 10.

Step 4. Processing the message in blocks of 16 words.

First, let’s define three auxiliary functions, each of which receives three 32-bit words as input and calculates one 32-bit word from them.

For each bit position, F acts as a conditional expression: if X, then Y; otherwise Z. The function F could have been defined using instead of V since XY and XZ cannot equal 1, at the same time. G acts on each bit position as a function of the maximum value. If at least two words of X, Y, and Z have corresponding bits equal to 1 then G will output a 1 in that bit. Otherwise, G will output a bit equal to 0. It is interesting to note that if the X, Y, and Z bits are statistically independent, then the F (X, Y, Z) and G (X, Y, Z) bits will also be statistically independent. Function H implements bitwise xor and has the same property as F and G.

Step 5. Forming a hash.

The result (hash function) is obtained as ABCD. It means we write out 128 bits, starting with the least significant bit of A and ending with the most significant bit of D.

Example

MD4 (“The quick brown fox jumps over the lazy dog”)

 = 1bee69a46ba811185c194762abaeae90

128-bit MD4 hashes are 32-digit hexadecimal numbers. The following example shows the hash of a 43-byte ASCII string.

Safety

The level of security laid down in MD4 was designed to create fairly stable hybrid electronic digital signature systems based on MD4 and a public-key cryptosystem. Ronald Rivest believed that the MD4 hashing algorithm can be used for systems that need strong cryptographic strength. But at the same time, he noted that MD4 was created primarily as a very fast hashing algorithm. So it can be poor, in terms of cryptographic strength. As subsequent studies showed, he was right, and for applications where cryptographic strength is important. The MD5 algorithm began to be used.

Vulnerabilities

Vulnerabilities in MD4 were demonstrated in an article by Bert den Boer and Anton Bosselars in 1991. The first collision was founded by Hans Dobbertin, in 1996.