Dec. 16th, 2009

MD5

Dec. 16th, 2009 08:01 am
slowfox
MD5 is a hashing algorithm that's commonly used to check that a given file's integrity hasn't been compromised (during a download, say, or while being copied from A to B).

Typically, you'll have a file offered for download on a website and, somewhere else (this is important, more later), there'll be a string of 32 seemingly random hexadecimal characters called the MD5 sum, MD5 hash or MD5 checksum:

c59b048c992804d165aed10170f003dc

What happens is that the algorithm pads the source file's data (and appends its length) so that it fills a whole number of 64-byte blocks, and starts from a fixed 128-bit internal state. It then works through the blocks in order: the first block is scrambled together with the starting state (in 64 small mixing steps) to give a new state, that result is fed into the computation for the second block, the result of that into the third, and so on. Each block is processed exactly once; nothing gets fed back through again at the end. Once the last block has been processed, the final 128-bit state, written out as 32 hexadecimal characters, is the MD5 sum for the file.
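The chaining described above can be seen from the outside with a rough sketch. This assumes Python and its standard hashlib module, which is just a convenient tool for illustration and not something the post otherwise relies on; the sample data and the piece size of 777 bytes are arbitrary. Feeding the data in pieces gives the same sum as feeding it in one go, because the hasher carries its running state from one piece to the next.

```python
import hashlib

data = b"Hello, world." * 1000  # arbitrary sample bytes, just for illustration

# Hash everything in one call.
one_shot = hashlib.md5(data).hexdigest()

# Hash the same bytes in uneven pieces; the hasher keeps its running
# state between update() calls, so the piece boundaries don't matter.
hasher = hashlib.md5()
for start in range(0, len(data), 777):
    hasher.update(data[start:start + 777])
piecewise = hasher.hexdigest()

print(one_shot == piecewise)                  # True
print(hasher.block_size, hasher.digest_size)  # 64 16 (bytes per block, bytes of output)
```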

I use WinMD5Sum on the work PC:
[screenshot of WinMD5Sum]

The idea behind MD5 hashing is that even tiny changes in the original file result in significant differences in the end MD5 hash.

For example, I created a text file called hello.txt, which contained the phrase "Hello, world."

The MD5 sum of this worked out to be 45d2c2d506211d17f99a3eb8de863f36

When I changed the last character to a comma - Hello, world, - the MD5 sum changed to c59b048c992804d165aed10170f003dc, immediately telling me that the file had changed from the original.
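The same experiment can be sketched in a few lines of Python (an assumption on my part; any MD5 tool would do). The helper names md5_hex and differing_bits are made up for this example. Note that the sums it prints won't necessarily match the ones quoted above, since those depend on the exact bytes in the file (encoding, a trailing newline and so on).

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 sum of text (encoded as UTF-8) as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

before = md5_hex("Hello, world.")
after = md5_hex("Hello, world,")

print(before)
print(after)
print(differing_bits(before, after), "of 128 bits differ")
```

A one-character change in the input typically flips somewhere around half of the 128 output bits, which is why the two sums look completely unrelated.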

An MD5 sum is always 32 hexadecimal characters (128 bits) long, yet can be generated for files of any size. Pretty obviously, then, there will be different files that result in the same hash - these are known (with the industry's predictable fondness for dramatic vernacular) as 'collisions'. Nonetheless, the risk of two files colliding by accident is vanishingly small (if lots of files boiled down to the same handful of values, the algorithm would obviously be useless). That said, boffins have managed to deliberately construct pairs of files that generate the same MD5 hash. The catch is that the attacker has to craft both files; producing a file to match the hash of an existing file they had no hand in is a much harder problem, and not known to be practical.
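To put a rough number on "pretty small", here's a back-of-the-envelope sketch in Python using the standard birthday approximation. It assumes MD5 sums behave like uniformly random 128-bit values, so it only covers accidental collisions, not deliberately constructed ones; the function name is made up for this example.

```python
import math

def accidental_collision_probability(n_files: int, hash_bits: int = 128) -> float:
    """Birthday-bound approximation: the chance that at least two of
    n_files uniformly random hash_bits-bit values coincide."""
    pairs = n_files * (n_files - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2 ** hash_bits)

for n in (10**6, 10**12, 2**64):
    print(n, accidental_collision_probability(n))
```

Under that assumption, you'd need a collection of around 2^64 files before the chance of any two sharing a sum climbs to roughly 40%.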

Anyway, the idea is that you see a file - exciting_prog.exe - on a website and download it. By computing the MD5 sum of your copy and comparing it with the one the website published, you can verify (beyond reasonable doubt) that the file you've downloaded is the file that the website intended you to download.
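For anyone without a tool like WinMD5Sum to hand, the check can be sketched in Python along these lines. The function names md5_of_file and matches_published_sum are hypothetical, as is the 1 MB chunk size; treat it as a minimal illustration rather than a finished utility.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 sum of the file at path as 32 lowercase hex characters."""
    hasher = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large download doesn't have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def matches_published_sum(path: str, published_sum: str) -> bool:
    """Compare a file's MD5 sum with the one published for it, ignoring
    case and surrounding whitespace in the published value."""
    return md5_of_file(path) == published_sum.strip().lower()
```

You'd call matches_published_sum with the path to the downloaded file (exciting_prog.exe, say) and the sum copied from the website; True means the two agree.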

(which, by the way, is a loooooooooooooooooong way from saying it's safe).

The major caveat should be obvious: the situation where the file's MD5 sum is listed in the same directory as the file itself. Consider: if nefarious malfeasants have managed to hack the server and place a malicious file in the benign one's place, then since they've clearly got access to the server, it'd be trivial for them to also replace the posted MD5 sum with the sum to match their own malicious file.
