How to calculate the entropy of a file?

  • At the end: Calculate the “average” value for the array.
  • Initialize a counter with zero,
    and for each of the array’s entries:
    add the absolute difference between the entry and the “average” to the counter
    (signed differences would always sum to zero).
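
The steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the array is taken to hold one count per byte value (256 entries), "difference" is read as the absolute difference, and the names byte_counts and deviation_from_average are made up for this example.

```python
def byte_counts(data: bytes) -> list:
    """Count how often each byte value (0..255) occurs in data."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return counts


def deviation_from_average(counts: list) -> float:
    """Sum of absolute differences between each entry and the array's average."""
    average = sum(counts) / len(counts)
    counter = 0.0
    for entry in counts:
        # Assumption: absolute difference, since signed ones cancel out.
        counter += abs(entry - average)
    return counter
```

A small counter means the byte values are spread evenly (high randomness); a large one means a few values dominate.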

With some modifications you can get Shannon’s entropy:

rename “average” to “entropy”

(float) entropy = 0
for i in 0..255 do
  (float) p = (float) Counts[i] / filesize   // Counts[i] is the number of bytes with value i
  if (p > 0) entropy = entropy - p * lg(p)   // lg is the logarithm with base 2

Edit:
As Wesley mentioned, the result is measured in bits per byte (0 to 8), so we must divide the entropy by 8 to scale it to the range 0..1 (alternatively, we can use a logarithm with base 256).
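
For reference, here is a minimal Python sketch of the pseudocode, including the division by 8. The function names shannon_entropy and normalized_entropy are just illustrative, and returning 0 for an empty input is an assumption made to avoid dividing by zero.

```python
import math


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (between 0.0 and 8.0)."""
    if not data:
        return 0.0  # assumption: treat an empty file as having zero entropy

    # Count the occurrences of each byte value.
    counts = [0] * 256
    for b in data:
        counts[b] += 1

    filesize = len(data)
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / filesize
            entropy -= p * math.log2(p)
    return entropy


def normalized_entropy(data: bytes) -> float:
    """Entropy scaled to the range 0..1 by dividing by 8 bits per byte."""
    return shannon_entropy(data) / 8
```

To use it on a file, read the contents in binary mode, e.g. open(path, "rb").read(), and pass the bytes to normalized_entropy. For very large files you would probably want to update the counts chunk by chunk instead of reading everything at once.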
