<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dw="https://www.dreamwidth.org">
  <id>tag:dreamwidth.org,2009-05-05:290754</id>
  <title>The Lair of the SlowFox</title>
  <subtitle>slowfox</subtitle>
  <author>
    <name>slowfox</name>
  </author>
  <link rel="alternate" type="text/html" href="https://slowfox.dreamwidth.org/"/>
  <link rel="self" type="text/xml" href="https://slowfox.dreamwidth.org/data/atom"/>
  <updated>2009-12-16T08:26:17Z</updated>
  <dw:journal username="slowfox" type="personal"/>
  <entry>
    <id>tag:dreamwidth.org,2009-05-05:290754:83181</id>
    <link rel="alternate" type="text/html" href="https://slowfox.dreamwidth.org/83181.html"/>
    <link rel="self" type="text/xml" href="https://slowfox.dreamwidth.org/data/atom/?itemid=83181"/>
    <title>MD5</title>
    <published>2009-12-16T08:26:17Z</published>
    <updated>2009-12-16T08:26:17Z</updated>
    <category term="security"/>
    <category term="md5"/>
    <dw:mood>calm</dw:mood>
    <dw:security>public</dw:security>
    <dw:reply-count>9</dw:reply-count>
    <content type="html">MD5 is a hashing algorithm, commonly used to check that a given file's integrity hasn't been compromised (during a download, or in the process of being copied from A to B, etc.).&lt;br /&gt;&lt;br /&gt;Typically, you'll have a link to a file on a website and, somewhere else (this is important, more later), there'll be a long string of seemingly random characters called the MD5 sum, MD5 hash or MD5 checksum:&lt;br /&gt;&lt;br /&gt;&lt;tt&gt;c59b048c992804d165aed10170f003dc&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;What happens is that the algorithm pads the source file out to a whole number of 512-bit (64-byte) blocks, and then works through those blocks in order. It keeps a 128-bit running state: the first block gets mixed into a fixed starting value, the result of that feeds into the computation for the second block, the result of &lt;i&gt;that&lt;/i&gt; into the third, and so on. Each block is put through 64 mixing steps (four rounds of 16). When it reaches the end of the file, whatever is left in the 128-bit state is the MD5 sum for the file.&lt;br /&gt;&lt;br /&gt;I use &lt;a href="http://www.nullriver.com/index/products/winmd5sum"&gt;WinMD5Sum&lt;/a&gt; on the work PC:&lt;br /&gt;&lt;a href="http://www.flickr.com/photos/27479072@N07/4189902246/" title="md5screenshot by slowfox999, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2753/4189902246_9a7d136536_o.jpg" width="426" height="196" alt="md5screenshot" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The idea behind MD5 hashing is that even a tiny change in the original file results in a completely different MD5 hash.&lt;br /&gt;&lt;br /&gt;For example, I created a text file called &lt;tt&gt;hello.txt&lt;/tt&gt;, which contained the phrase &lt;tt&gt;Hello, world.&lt;/tt&gt; (full stop included).&lt;br /&gt;&lt;br /&gt;The MD5 sum of this worked out to be &lt;tt&gt;45d2c2d506211d17f99a3eb8de863f36&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;When I changed the &lt;i&gt;last&lt;/i&gt; character to a comma - &lt;tt&gt;Hello, world,&lt;/tt&gt; - the MD5 sum changed to &lt;tt&gt;c59b048c992804d165aed10170f003dc&lt;/tt&gt;, immediately telling me that the file had changed from the original.&lt;br /&gt;&lt;br /&gt;An MD5 sum is &lt;i&gt;always&lt;/i&gt; 128 bits, written out as 32 hexadecimal characters, yet can be generated for files of any size. Pretty obviously, then, there will be different files that result in the same hash - these are known (with the industry's predictable fondness for dramatic vernacular) as 'collisions'. Nonetheless, the risk of an &lt;i&gt;accidental&lt;/i&gt; collision is vanishingly small (if all files boiled down to a handful of values, it'd obviously be useless). That said, boffins have now managed to deliberately construct pairs of files that generate the same MD5 hash; the catch is that they generally need to craft both files themselves, rather than match the hash of a file somebody else has already published.&lt;br /&gt;&lt;br /&gt;Anyway, the idea is that you see a file - &lt;tt&gt;exciting_prog.exe&lt;/tt&gt; - on a website and download it. By computing the MD5 sum of your copy and comparing it with the one the website published, you can verify (beyond reasonable doubt) that the file you've downloaded is the file that the website &lt;i&gt;intended&lt;/i&gt; you to download.&lt;br /&gt;&lt;br /&gt;(Which, by the way, is a loooooooooooooooooong way from saying it's safe.)&lt;br /&gt;&lt;br /&gt;The &lt;b&gt;major&lt;/b&gt; caveat should be obvious: the situation where the file's MD5 sum is listed in &lt;i&gt;the same directory as the file itself&lt;/i&gt;. Consider: if nefarious malfeasants have managed to hack the server and put a malicious file in the benign one's place, then since they've clearly got access to the server, it'd be trivial for them to also replace the posted MD5 sum with one that matches their own malicious file. That's why it matters that the sum lives somewhere else.</content>
  </entry>
</feed>
