
Friday, February 13, 2015

Entropy maximization

One interesting and well-known result is that, among all distributions on a finite alphabet, the uniform distribution achieves maximum entropy.

Laplace's principle of insufficient reason calls for assuming uniformity unless additional information is available.

Entropy maximization is equivalent to minimizing the KL divergence between the distribution \(p\) and the uniform distribution.

More precisely, let \(X\) be a random variable taking values in a finite alphabet \(\mathcal{X} = \{\alpha_1, \cdots, \alpha_n\}\). Given a family of distributions \(\mathcal{P}\) on \(\mathcal{X}\) and the uniform distribution \(u\) on \(\mathcal{X}\),
\[ \underset{p\in\mathcal{P}}{\text{arg min}} \;D_{KL}(p\| u) = \underset{p\in\mathcal{P}}{\text{arg max}} \;H(p)\] where \(H(p)\) is the entropy.

Proof:
\[ D_{KL}(p\|u) = \sum_i p_i \log \frac{p_i}{1/n} = \sum_i p_i \log p_i + \Big(\sum_i p_i\Big) \log n = \log n - H(p) \]
Since \(\log n\) does not depend on \(p\), minimizing \(D_{KL}(p\|u)\) over \(\mathcal{P}\) is the same as maximizing \(H(p)\).
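
As a quick sanity check (my own sketch, not part of the original post), the identity can be verified numerically; the helper functions `entropy` and `kl_to_uniform` below are ad hoc names used only for illustration.

```python
# A minimal numerical sketch: for a few distributions p on an alphabet of
# size n, D_KL(p||u) equals log(n) - H(p), so the distribution that
# maximizes entropy (the uniform one) also minimizes the divergence.
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

def kl_to_uniform(p):
    p = np.asarray(p, dtype=float)
    n = len(p)
    nz = p[p > 0]
    return np.sum(nz * np.log(nz * n))   # log(p_i / (1/n)) = log(n * p_i)

for p in ([0.25, 0.25, 0.25, 0.25],      # uniform: H = log 4, D = 0
          [0.7, 0.1, 0.1, 0.1],
          [1.0, 0.0, 0.0, 0.0]):         # degenerate: H = 0, D = log 4
    print(p, entropy(p), kl_to_uniform(p), np.log(len(p)) - entropy(p))
```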
Note: this holds in general for random variables with finite support. For random variables with infinite support, additional constraints are required. See Gibbs measures.
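
To illustrate the note (a hedged sketch of my own, not from the post): even on a finite alphabet, adding a moment constraint to the maximum-entropy problem produces a Gibbs-type distribution \(p_i \propto e^{-\lambda \alpha_i}\). The alphabet values and target mean below are arbitrary choices, and the SLSQP solver call is just one way to set up the constrained problem.

```python
# Maximum entropy under a mean constraint (illustrative values only):
# the optimizer should recover a Gibbs-type distribution p_i ~ exp(-lam * alpha_i),
# i.e. log p_i is affine in alpha_i.
import numpy as np
from scipy.optimize import minimize

alpha = np.array([0.0, 1.0, 2.0, 3.0])   # assumed alphabet values
target_mean = 1.0                         # assumed moment constraint

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)            # avoid log(0) at the boundary
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},                  # normalization
    {"type": "eq", "fun": lambda p: np.dot(p, alpha) - target_mean},   # mean constraint
]
p0 = np.full(len(alpha), 1.0 / len(alpha))
res = minimize(neg_entropy, p0, method="SLSQP",
               bounds=[(0.0, 1.0)] * len(alpha), constraints=constraints)
p = res.x
# For a Gibbs distribution with evenly spaced alpha, the successive
# log-ratios log(p_{i+1}/p_i) are all equal to -lam:
print(p, np.diff(np.log(p)))
```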
