
03jun14 - PHP: FiLE HASHiNG

Sooner or later you may need to hash a file: to detect changes, for admin reasons, to check that it was downloaded correctly, or whatever...
This is not meant to be a post about security. The question is: PHP5 offers over 40 algorithms to choose from, so which one is best for what I need?
Just for fun we ran a "benchmark" test, hashing two ISO images of 650MB and 5GB.
Tests were done on a Debian 6 guest running on VirtualBox with a Windows 7 x64 host: nothing special, no huge power.
Just for the record, across the algorithms the 650MB file took 4 to 112 seconds to hash and the 5GB file took 69 to 802 seconds. We leave times out of the results table because they depend on a ton of external factors (drive speed, available resources, local or remote file, and so on).
Results are given as percentages relative to the top entry, which scores 100%.

PHP code:
hash_file($algorithm,$input_file);
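
A minimal sketch of how a comparison like this could be run, not the exact script we used. The function name benchmark_hash_file() is hypothetical, and the percentage formula (fastest time divided by each time) is an assumption about how "relative to the top entry" can be computed.

```php
<?php
// Hypothetical helper (not the original benchmark script): time hash_file()
// for each requested algorithm and express speed relative to the fastest.
function benchmark_hash_file($input_file, array $algorithms)
{
    $available = hash_algos();
    $times = array();
    foreach ($algorithms as $algorithm) {
        // Skip names this PHP build does not provide.
        if (!in_array($algorithm, $available, true)) {
            continue;
        }
        $start = microtime(true);
        hash_file($algorithm, $input_file);
        // Guard against a zero reading on tiny files or coarse timers.
        $times[$algorithm] = max(microtime(true) - $start, 1e-9);
    }
    if (!$times) {
        return array();
    }
    $fastest = min($times);
    $relative = array();
    foreach ($times as $algorithm => $seconds) {
        // Assumed formula: 100 = the fastest algorithm, slower ones score lower.
        $relative[$algorithm] = round($fastest / $seconds * 100, 1);
    }
    arsort($relative); // best score first
    return $relative;
}
```

Calling benchmark_hash_file('/path/to/image.iso', hash_algos()) would test every algorithm the local build supports. A single run is noisy, so repeat it a few times before trusting the ranking.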

SPEED:
Md4 was the fastest one on our test platform, followed closely by crc32b, then md5.
Crc32b is quick but small. It can be enough for a lot of purposes (its 8 hex character hash is 32 bits, so the chance that two given files collide is 1 in 4,294,967,296), but md4 is still the fastest and it gives a 32 hex character (128 bit) hash.
On our test platform we noticed that md5 gets a bit quicker than md4 as file size increases, so it is probably the best average choice.
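
The hash lengths quoted in this post count hexadecimal characters in the string PHP returns; each pair of hex characters is one byte. A quick sketch to check this on your own build (digest_sizes() is a hypothetical helper name):

```php
<?php
// Hypothetical helper: report the digest size of an algorithm three ways.
function digest_sizes($algorithm)
{
    $hex = hash($algorithm, 'sample');       // default: hexadecimal string
    $raw = hash($algorithm, 'sample', true); // third argument true: raw binary
    return array(
        'hex_chars' => strlen($hex),
        'bytes'     => strlen($raw),
        'bits'      => strlen($raw) * 8,
    );
}

foreach (array('crc32b', 'md4', 'md5', 'sha1') as $algorithm) {
    $s = digest_sizes($algorithm);
    printf("%-7s %3d hex chars, %2d bytes, %3d bits\n",
        $algorithm, $s['hex_chars'], $s['bytes'], $s['bits']);
}
```

So crc32b has only 2^32 possible values, while md4 and md5 have 2^128.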

EFFiCiENCY:
We also want to show which is the most efficient, meaning the quickest to generate a hash while also taking hash length (strength) into account.
Salsa20 won (128 hex characters), with almost the same performance as its little brother salsa10, and it does better as file size increases.
Notice that both salsa algorithms are in the top 10 of the speed table, with almost the same performance as the widely used sha1 (but 128 hex characters vs 40).
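
One possible way to put a number on "efficiency" is digest bits per second of hashing time. This is only an illustrative assumption, not necessarily the formula behind our table, and rank_by_efficiency() is a hypothetical name. Keep in mind that a longer hash is not automatically a stronger one.

```php
<?php
// Hypothetical scoring (an assumption, not the post's confirmed formula):
// score = digest length in bits / seconds taken to hash the file.
function rank_by_efficiency(array $timings)
{
    $scores = array();
    foreach ($timings as $algorithm => $seconds) {
        if ($seconds <= 0) {
            throw new InvalidArgumentException("Bad timing for $algorithm");
        }
        // Digest length in bits, measured from the raw binary output.
        $bits = strlen(hash($algorithm, '', true)) * 8;
        $scores[$algorithm] = $bits / $seconds;
    }
    arsort($scores); // highest score first
    return $scores;
}
```

The input is an array of algorithm name => measured seconds, for example timings collected with microtime(true) around hash_file().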

Our winner then is salsa20.
Feel free to pick your best algorithm from this table.

Thanks for reading, LuMa23.