MathJax

Thursday, February 12, 2015

A Side Path to Statistical Mechanics

Consider a physical system with many degrees of freedom that can reside in any one of a large number of possible states.  Let \(p_i\) denote the probability of occurrence of state \(i\), with the following properties:
\[p_i\ge 0 \quad \text{for all}\; i\]and 
\[\sum_i p_i = 1\] Let \(E_i\) denote the energy of the system when it is in state \(i\).  A fundamental result from statistical mechanics tells us that when the system is in thermal equilibrium with its surrounding environment, state \(i\) occurs with a probability defined by
\[p_i = \frac{1}{Z}\exp(-\frac{E_i}{k_BT})\] where \(T\) is the absolute temperature in kelvins, \(k_B\) is Boltzmann's constant, and \(Z\) is a constant that is independent of all states.  The partition function \(Z\) is the normalizing constant, given by
\[Z = \sum_i \exp(-\frac{E_i}{k_BT}).\]  This probability distribution is called the Gibbs distribution.
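As a quick illustration (not part of the original derivation), here is a small Python sketch that evaluates the Gibbs distribution for a few hypothetical energy levels; the energies and temperature are made-up values chosen only so the exponents come out of order one:

    import numpy as np

    def gibbs_distribution(energies, T, k_B=1.380649e-23):
        """Return p_i = exp(-E_i / (k_B T)) / Z for the given energy levels."""
        E = np.asarray(energies, dtype=float)
        # Shift by the minimum energy before exponentiating; the shift cancels
        # in the normalization and keeps the exponentials from underflowing.
        w = np.exp(-(E - E.min()) / (k_B * T))
        return w / w.sum()

    # Hypothetical energy levels (joules), comparable to k_B*T at T = 300 K.
    energies = [0.0, 2.0e-21, 4.0e-21, 8.0e-21]
    print(gibbs_distribution(energies, T=300.0))   # probabilities sum to 1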

Two interesting properties of the Gibbs distribution are:
  1. States of low energy have a higher probability of occurrence than states of high energy.
  2. As the temperature \(T\) is reduced, the probability is concentrated on a smaller subset of low-energy states.
In the context of neural networks, the parameter \(T\) may be viewed as a pseudo-temperature that controls thermal fluctuations representing the effect of "synaptic noise" in a neuron.  Its precise scale is therefore irrelevant, so we may absorb \(k_B\) into \(T\) and redefine the probability \(p_i\) and partition function \(Z\) as
\[p_i = \frac{1}{Z}\exp(-\frac{E_i}{T}) \] and
\[Z = \sum_i \exp(-\frac{E_i}{T})\] where \(T\) is referred to simply as the temperature of the system. 
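To see the two properties listed above at work in this pseudo-temperature form, here is a short sketch (again with hypothetical energy levels), sweeping \(T\) downward:

    import numpy as np

    def gibbs(energies, T):
        """Gibbs distribution p_i = exp(-E_i / T) / Z, with k_B absorbed into T."""
        E = np.asarray(energies, dtype=float)
        w = np.exp(-(E - E.min()) / T)   # shift by min(E) for numerical stability
        return w / w.sum()

    energies = [0.0, 1.0, 2.0, 4.0]      # hypothetical energy levels
    for T in (4.0, 1.0, 0.25):
        print(f"T = {T:4.2f}  p = {np.round(gibbs(energies, T), 3)}")
    # At every T the low-energy states are more probable; as T drops, the
    # probability mass concentrates increasingly on the lowest-energy state.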

Note that \(-\log p_i\) may be viewed as a form of "energy" measured at unit temperature.
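To see this, set \(T = 1\) and take the logarithm of the Gibbs distribution:
\[-\log p_i = E_i + \log Z,\] which is the energy of state \(i\) shifted by a constant that is the same for every state.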

Free Energy and Entropy

The Helmholtz free energy of a physical system, denoted by \(F\), is defined in terms of the partition function \(Z\) as follows:
\[F = -T \log Z.\]  The average energy of the system is defined by
\[\langle E \rangle = \sum_i p_i E_i.\] The difference between the average energy and the free energy is
\[\langle E \rangle - F = -T\sum_i p_i \log p_i.\]
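This identity follows by taking the logarithm of the Gibbs distribution, \(\log p_i = -E_i/T - \log Z\), multiplying by \(p_i\), and summing over all states:
\[-T\sum_i p_i \log p_i = \sum_i p_i E_i + T\log Z = \langle E \rangle - F.\]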
Recognizing \(-\sum_i p_i \log p_i\) as the entropy \(H\), we can rewrite this as
\[\langle E \rangle - F = T H\] or, equivalently,
\[F = \langle E \rangle - TH.\]  The entropy of a system tends to increase until equilibrium is reached, and the free energy of the system therefore decreases to a minimum.

This is an important principle called the principle of minimal free energy.
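As a numerical sanity check of \(F = \langle E \rangle - TH\), the sketch below uses made-up energy levels, with \(k_B\) absorbed into \(T\) as above:

    import numpy as np

    energies = np.array([0.0, 1.0, 2.0, 4.0])   # hypothetical energy levels
    T = 1.5                                     # pseudo-temperature

    w = np.exp(-energies / T)
    Z = w.sum()                                 # partition function
    p = w / Z                                   # Gibbs probabilities

    F = -T * np.log(Z)                          # Helmholtz free energy
    E_avg = np.sum(p * energies)                # average energy <E>
    H = -np.sum(p * np.log(p))                  # entropy

    print(F, E_avg - T * H)                     # both print the same value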
