Tool to calculate the Shannon index. The Shannon index is a measure of entropy for character strings (or any computer data).
Shannon Index - dCode
Tag(s) : Informatics, Cryptanalysis
Shannon's entropy index is a measure of entropy that applies to any digital data, developed by Claude Shannon in the 1940s. It is based on the frequencies of appearance of the items: the more varied and evenly distributed the items are, the more difficult it is to predict the content (greater uncertainty, more randomness, and thus greater entropy).
Entropy is calculated from a list of elements: in a text, the elements will be the characters and in an array of numeric values, the elements will be the numbers.
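As a minimal sketch of this counting step in Python (the helper name `frequencies` is an assumption made for illustration, not dCode's own code), the same function can handle both a text and an array of numbers:

```python
from collections import Counter

def frequencies(items):
    """Map each distinct element to its relative frequency (count / total)."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}

print(frequencies("DCODE"))       # the elements are characters
print(frequencies([1, 2, 2, 3]))  # the elements are numbers
```

Any sequence of hashable values works here, since only the counts of the distinct elements matter.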
For a string of $ N $ characters, $ k $ of which are distinct, where each distinct element $ i $ has a number of occurrences $ n_i $ and a frequency of appearance $ p_i = n_i/N $, the Shannon entropy $ H $ is calculated according to the formula $$ H = -\sum_{i=1}^k p_i \log_2 (p_i) $$
Example: DCODE has 5 characters (4 distinct): the letter D appears 2 times (frequency: 2/5) and the 3 letters C, O and E each appear once (frequency: 1/5). The calculation is: $ H = -\left( \frac{2}{5} \log_2(\frac{2}{5}) + 3 \times \frac{1}{5} \log_2(\frac{1}{5}) \right) \approx 1.921928 $
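The formula can be sketched in a few lines of Python. This is an illustrative implementation, not dCode's source code: the function name `shannon_entropy` and the choice to return 0 for an empty input are assumptions.

```python
from collections import Counter
from math import log2

def shannon_entropy(items):
    """Shannon entropy, in bits per element, of a string or any sequence of hashable items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: an empty input is given an entropy of 0
    # H = -sum(p_i * log2(p_i)) with p_i = n_i / N
    return -sum((n / total) * log2(n / total) for n in counts.values())

print(round(shannon_entropy("DCODE"), 6))  # 1.921928
```

The printed value matches the DCODE example worked out by hand.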
The value is never negative: each frequency $ p_i $ is at most 1, so its logarithm is negative or zero, and so is the sum. The leading minus sign therefore gives a positive result, or zero when the string contains a single distinct character.
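The range of values can be checked on the two extreme cases. If the string contains a single distinct character, then $ k = 1 $ and $ p_1 = 1 $, so $$ H = -1 \times \log_2(1) = 0 $$ If the $ k $ distinct characters all have the same frequency $ p_i = 1/k $, the entropy reaches its maximum for that alphabet size: $$ H = -\sum_{i=1}^k \frac{1}{k} \log_2 \left( \frac{1}{k} \right) = \log_2(k) $$ For DCODE ($ k = 4 $), the maximum would be $ \log_2(4) = 2 $, slightly above the computed value of about 1.92 because D is more frequent than the other letters.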
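From the Shannon index, the size of an optimal encoding of a string can be estimated: it gives a lower bound on the average number of bits per character needed to encode the string character by character. If the Shannon index of a string is 3.5, then it will take at least 3.5 bits per character on average, that is 4 bits per character once rounded up to a whole number. The Shannon index is therefore useful for evaluating a compression ratio: the lower the entropy, the better the achievable compression.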
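A rough sketch of this estimate in Python, following the rounding rule described above. The function name `estimate_bits` and the baseline of 8 bits per character for the uncompressed text are assumptions for illustration. Real compressors use other models, so their results will differ.

```python
from collections import Counter
from math import ceil, log2

def estimate_bits(text, original_bits_per_char=8):
    """Return (entropy, bits per character rounded up, estimated size ratio)."""
    counts = Counter(text)
    total = len(text)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    bits_per_char = ceil(entropy)  # whole number of bits per character
    # assumption: the original text uses original_bits_per_char bits per character
    ratio = bits_per_char / original_bits_per_char
    return entropy, bits_per_char, ratio

entropy, bits, ratio = estimate_bits("DCODE")
print(bits, ratio)  # 2 0.25
```

For DCODE, the entropy of about 1.92 rounds up to 2 bits per character, a quarter of the assumed 8-bit size.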
With the base-2 logarithm used in the formula, Shannon's entropy is measured in bits (per character).
dCode retains ownership of the "Shannon Index" source code. Unless an explicit open source licence applies (such as Creative Commons), the following are not public: any "Shannon Index" algorithm, applet, snippet or script (converter, solver, encryption / decryption, encoding / decoding, ciphering / deciphering, breaker, translator), any "Shannon Index" function (calculate, convert, solve, decrypt / encrypt, decipher / cipher, decode / encode, translate) written in any programming language (Python, Java, PHP, C#, Javascript, Matlab, etc.), and any database download or API access for "Shannon Index". The same applies to downloads for offline use on PC, mobile, tablet, iPhone or Android app.
The content of the page "Shannon Index" and its results may be freely copied and reused, including for commercial purposes, provided that dCode.fr is cited as the source.
To cite dCode.fr on another website, use the link:
In a scientific article or book, the recommended bibliographic citation is: Shannon Index on dCode.fr [online website], retrieved on 2025-04-15,