The Weaknesses of MD5, SHA1 and SHA-256 — The Length Extension Attack

Prof Bill Buchanan OBE FRSE
3 min readMar 15, 2021

The foundation of trust in cybersecurity is layed by the simple concept of data hashing, and where we take data and create a fixed-length hash for the data. If we cannot trust our hashing methods, we are in trouble. When we creating the perfect message hash, we thus need to make sure we have:

  • Collision resistance. This is where it is extremely difficult to find two messages which have the same hash. Thus we should not be able to find the has of two messages (M1, and M2) that are the same, within a reasonable time: H(M1)=H(M2).
  • Pre-image resistance. If we already have a hash value (h), it should be extremely difficult to find a message which will give the same hash. Thus for a given hash (h), it is difficult to find a message (M1) for H(M1)=h.

The original hash methods were often based on the Merkle-Damgård (MD) construction. With this, we create a hash function using blocks of data. Based on the MD constuct, Ron Rivest created the MD5 hashing method, and it was widely adopted in the industry. It works by taking a static initialisation vector (IV) and then feeding this into a one-way function (f), along with a block of the message. We feed this output into the next stage, and so on until we get to a message pad at the end:

--

--

Prof Bill Buchanan OBE FRSE

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. Based in Edinburgh. Old World Breaker. New World Creator. Building trust.