
Thursday, July 21, 2016

Invariance and carry-over properties of MLE

Review: Asymptotic properties of MLE
  • Asymptotically efficient (attains CRLB as \(N\rightarrow\infty\))
  • Asymptotically Gaussian (asymptotic normality)
  • Asymptotically Unbiased
  • Consistent (weakly and strongly)
First, the invariance property of MLE

The MLE of the parameter \(\alpha = g(\theta)\), where the PDF \(p(x;\theta)\) is parameterized by \(\theta\), is given by
\[ \hat{\alpha} = g(\hat{\theta})\] where \(\hat{\theta}\) is the MLE of \(\theta\).
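
For a quick numerical illustration, here is a minimal sketch in Python, assuming i.i.d. exponential data with rate \(\lambda\) (the variable names are mine): the MLE of the rate is \(\hat{\lambda} = 1/\bar{y}\), and by invariance the MLE of the mean \(\alpha = g(\lambda) = 1/\lambda\) is \(g(\hat{\lambda}) = \bar{y}\), which coincides with the direct MLE of the mean.
```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=10_000)   # Exp(rate 0.5), true mean 1/lambda = 2

# MLE of the rate parameter theta = lambda:  hat_lambda = 1 / ybar
lam_hat = 1.0 / y.mean()

# MLE of alpha = g(lambda) = 1/lambda (the mean), by invariance:
alpha_hat = 1.0 / lam_hat

# Direct MLE of the mean (reparameterize the likelihood by the mean mu and
# maximize; this yields the sample mean) agrees with g(hat_lambda):
print(alpha_hat, y.mean())   # identical up to floating point
```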

Consistency (as defined in class) is the weak convergence (convergence in probability) of the sequence of estimates to the true parameter as \(N\) gets large.

If \(g(\theta)\) is continuous in \(\theta\), the convergence properties (esp. convergence in probability) carry over; that is, by the continuous mapping theorem, consistency of \(\hat{\theta}\) implies consistency of the estimator \(g(\hat{\theta})\).

However, unbiasedness does not carry over from \(\hat{\theta}\): the bias of the estimator \(g(\hat{\theta})\) depends on the convexity of \(g\). By Jensen's inequality, if \(g\) is convex then \(E\{g(\hat{\theta})\} \geq g(E\{\hat{\theta}\})\), so \(g(\hat{\theta})\) is generally biased even when \(\hat{\theta}\) is not.
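
For a concrete instance, take \(x_n \sim \mathcal{N}(\theta, \sigma^2)\) i.i.d. with the unbiased MLE \(\hat{\theta} = \bar{x}\) and the convex map \(g(\theta) = \theta^2\). Then
\[ E\{\bar{x}^2\} = \left(E\{\bar{x}\}\right)^2 + \text{var}(\bar{x}) = \theta^2 + \frac{\sigma^2}{N} > \theta^2, \]
so \(g(\hat{\theta})\) is biased for every finite \(N\), although the bias vanishes as \(N \rightarrow \infty\) and consistency still carries over.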

Other properties of MLE
  • If an efficient estimator exists, the ML method will produce it.
  • Unlike the MVU estimator, MLE can be biased
  • Note: the CRLB applies to unbiased estimators, so a biased estimator may have variance smaller than \(I^{-1}(\theta)\) (see the sketch after this list)
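
A standard illustration of the last point, as a Monte Carlo sketch assuming i.i.d. Gaussian data with unknown mean and variance: the MLE \(\hat{\sigma}^2 = \frac{1}{N}\sum_n (x_n - \bar{x})^2\) is biased, and its variance \(2(N-1)\sigma^4/N^2\) lies below the CRLB \(2\sigma^4/N\) for unbiased estimators of \(\sigma^2\).
```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma2, trials = 10, 4.0, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))

# MLE of sigma^2 divides by N (ddof=0), so it is biased for finite N
s2_mle = x.var(axis=1, ddof=0)

crlb = 2 * sigma2**2 / N                      # CRLB for unbiased estimators
print("E{MLE} ~", s2_mle.mean(), "(true sigma^2 =", sigma2, ", so biased)")
print("var{MLE} ~", s2_mle.var(), "< CRLB =", crlb)
# theory: var = 2 (N-1) sigma^4 / N^2 = 2.88, below the CRLB of 3.2
```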

Thursday, July 14, 2016

Properties of a regular family of parameterized distributions

A family of parameterized distributions defined by
\[ \mathcal{P} = \{ p_\theta(y)  | \theta \in \Theta \subset \mathbb{R}^P \}\]
is regular if it satisfies the following conditions
  1. Support of \(p_\theta(y)\) does not depend on \(\theta\) for all \(\theta \in \Theta\)
  2. \(\frac{\partial}{\partial \theta} p_\theta(y) \) exists
  3. (Optional) \( \frac{\partial^2}{\partial \theta^2} p_\theta(y) \) exists
Note \[ \frac{\partial }{ \partial \theta } \ln p_\theta(y)  = \frac{1}{p_\theta(y) } \frac{\partial }{ \partial \theta } p_\theta(y) \quad \quad (4) \]
Define the score function (log := natural log)
\[  S_\theta (y) := \nabla_\theta \log p_\theta(y) \]
Note also
\[ E_\theta \{ 1 \} = 1 = \int_\mathcal{Y} p_\theta(y) dy \]
As a result of the above, we have (Kay's definition of regularity)
\begin{align*}  0 &= \frac{\partial }{ \partial \theta }E_\theta \{1\} \\
&= \frac{\partial }{ \partial \theta }\int p_\theta(y) dy \\
&\overset{1}{=} \int \frac{\partial }{ \partial \theta }  p_\theta(y) dy \\
&\overset{2,4}{=} \int p_\theta(y)  \frac{\partial }{ \partial \theta } \log p_\theta(y) dy \\
&= E_\theta \{ S_\theta(y) \}
\end{align*}
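
A quick Monte Carlo check of \(E_\theta \{ S_\theta(y) \} = 0\), as a minimal sketch assuming the Gaussian mean family \(y \sim \mathcal{N}(\theta, \sigma^2)\), for which the score is \(S_\theta(y) = (y-\theta)/\sigma^2\):
```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma = 1.5, 2.0
y = rng.normal(theta, sigma, size=1_000_000)

# Score of the N(theta, sigma^2) family with respect to theta:
#   S_theta(y) = d/dtheta log p_theta(y) = (y - theta) / sigma^2
score = (y - theta) / sigma**2

print(score.mean())   # ~ 0, matching E_theta{ S_theta(y) } = 0
```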

Friday, July 01, 2016

Spectral Theorem for Diagonalizable Matrices

It occurs to me that most presentations of the spectral theorem concern only orthonormal bases. The following is a more general result from Meyer.

Theorem
A matrix \( \mathbf{A} \in \mathbb{R}^{n\times n}\) with spectrum \(\sigma(\mathbf{A}) = \{ \lambda_1, \dotsc, \lambda_k \} \) is diagonalizable if and only if there exist matrices \(\{ \mathbf{G}_1, \dotsc, \mathbf{G}_k\} \) such that  \[ \mathbf{A} = \lambda_1 \mathbf{G}_1 + \dotsb + \lambda_k  \mathbf{G}_k \] where the \(\mathbf{G}_i\)'s have the following properties:
  • \(\mathbf{G}_i\) is the projector onto \(\mathcal{N} (\mathbf{A} - \lambda_i \mathbf{I})  \) along \(\mathcal{R} ( \mathbf{A} - \lambda_i \mathbf{I} ) \). 
  • \(\mathbf{G}_i\mathbf{G}_j = \mathbf{0} \) whenever \( i \neq j \)
  • \( \mathbf{G}_1 + \dotsb + \mathbf{G}_k = \mathbf{I}\)
The expansion is known as the spectral decomposition of \(\mathbf{A}\), and the \(\mathbf{G}_i\)'s are called the spectral projectors associated with \(\mathbf{A}\).

Note that, being a projector, each \(\mathbf{G}_i\) is idempotent:
  • \(\mathbf{G}_i = \mathbf{G}_i^2\)
And since \(\mathcal{N}(\mathbf{G}_i) = \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \) and \(\mathcal{R}(\mathbf{G}_i) = \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \), we have the following equivalent decompositions into complementary subspaces, each direct sum equal to \(\mathbb{R}^n\) (verified numerically in the sketch after this list):
  • \(  \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \oplus \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \)
  • \(  \mathcal{R}(\mathbf{G}_i) \oplus \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \)
  • \(  \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \oplus  \mathcal{N}(\mathbf{G}_i) \)
  • \(  \mathcal{R}(\mathbf{G}_i)  \oplus  \mathcal{N}(\mathbf{G}_i) \)
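
As a numerical check, here is a minimal sketch assuming distinct eigenvalues, so that each projector reduces to \(\mathbf{G}_i = \mathbf{v}_i \mathbf{w}_i^T\), with \(\mathbf{v}_i\) the right eigenvectors (columns of \(\mathbf{V}\)) and \(\mathbf{w}_i^T\) the rows of \(\mathbf{V}^{-1}\); the example matrix is my own choice.
```python
import numpy as np

# Diagonalizable but non-symmetric, with distinct eigenvalues 3 and 1
A = np.array([[3.0, 1.0],
              [0.0, 1.0]])

lam, V = np.linalg.eig(A)          # columns of V: right eigenvectors
W = np.linalg.inv(V)               # rows of W: the dual (left) basis

# Spectral projector for eigenvalue lam[i]: G_i = v_i w_i^T
G = [np.outer(V[:, i], W[i, :]) for i in range(len(lam))]

assert np.allclose(sum(l * Gi for l, Gi in zip(lam, G)), A)  # A = sum lam_i G_i
assert np.allclose(G[0] @ G[1], np.zeros((2, 2)))            # G_i G_j = 0, i != j
assert np.allclose(G[0] + G[1], np.eye(2))                   # sum G_i = I
assert all(np.allclose(Gi @ Gi, Gi) for Gi in G)             # idempotent
print("spectral decomposition verified")
```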