
Thursday, July 21, 2016

Invariance and carry-over properties of MLE

Review: Asymptotic properties of MLE
  • Asymptotically efficient (attains CRLB as \(N\rightarrow\infty\))
  • Asymptotically Gaussian (asymptotic normality)
  • Asymptotically Unbiased
  • Consistent (weakly and strongly)
First, the invariance property of MLE

The MLE of the parameter \(\alpha = g(\theta)\), where the PDF \(p(x;\theta)\) is parameterized by \(\theta\), is given by
\[ \hat{\alpha} = g(\hat{\theta})\] where \(\hat{\theta}\) is the MLE of \(\theta\).
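
For a quick numerical illustration, here is a minimal sketch in Python, assuming i.i.d. exponential data with rate \(\lambda\) (the variable names are mine): the MLE of the rate is \(\hat{\lambda} = 1/\bar{y}\), and by invariance the MLE of the mean \(\alpha = g(\lambda) = 1/\lambda\) is \(g(\hat{\lambda}) = \bar{y}\), which coincides with the direct MLE of the mean.
```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=10_000)   # Exp(rate 0.5), true mean 1/lambda = 2

# MLE of the rate parameter theta = lambda:  hat_lambda = 1 / ybar
lam_hat = 1.0 / y.mean()

# MLE of alpha = g(lambda) = 1/lambda (the mean), by invariance:
alpha_hat = 1.0 / lam_hat

# Direct MLE of the mean (reparameterize the likelihood by the mean mu and
# maximize; this yields the sample mean) agrees with g(hat_lambda):
print(alpha_hat, y.mean())   # identical up to floating point
```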

Consistency (as defined in class) is the weak convergence (convergence in probability) of the sequence of estimates to the true parameter as \(N\) gets large.

If \(g(\theta)\) is continuous in \(\theta\), the convergence properties (esp. convergence in probability) carry over; that is, by the continuous mapping theorem, consistency of \(\hat{\theta}\) implies consistency of the estimator \(g(\hat{\theta})\).

However, unbiasedness does not carry over from \(\hat{\theta}\): the bias of the estimator \(g(\hat{\theta})\) depends on the convexity of \(g\). By Jensen's inequality, if \(g\) is convex then \(E\{g(\hat{\theta})\} \geq g(E\{\hat{\theta}\})\), so \(g(\hat{\theta})\) is generally biased even when \(\hat{\theta}\) is not.
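
For a concrete instance, take \(x_n \sim \mathcal{N}(\theta, \sigma^2)\) i.i.d. with the unbiased MLE \(\hat{\theta} = \bar{x}\) and the convex map \(g(\theta) = \theta^2\). Then
\[ E\{\bar{x}^2\} = \left(E\{\bar{x}\}\right)^2 + \text{var}(\bar{x}) = \theta^2 + \frac{\sigma^2}{N} > \theta^2, \]
so \(g(\hat{\theta})\) is biased for every finite \(N\), although the bias vanishes as \(N \rightarrow \infty\) and consistency still carries over.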

Other properties of MLE
  • If an efficient estimator exists, the ML method will produce it.
  • Unlike the MVU estimator, MLE can be biased
  • Note: the CRLB applies to unbiased estimators, so a biased estimator may have variance smaller than \(I^{-1}(\theta)\) (see the sketch after this list)
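
A standard illustration of the last point, as a Monte Carlo sketch assuming i.i.d. Gaussian data with unknown mean and variance: the MLE \(\hat{\sigma}^2 = \frac{1}{N}\sum_n (x_n - \bar{x})^2\) is biased, and its variance \(2(N-1)\sigma^4/N^2\) lies below the CRLB \(2\sigma^4/N\) for unbiased estimators of \(\sigma^2\).
```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma2, trials = 10, 4.0, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))

# MLE of sigma^2 divides by N (ddof=0), so it is biased for finite N
s2_mle = x.var(axis=1, ddof=0)

crlb = 2 * sigma2**2 / N                      # CRLB for unbiased estimators
print("E{MLE} ~", s2_mle.mean(), "(true sigma^2 =", sigma2, ", so biased)")
print("var{MLE} ~", s2_mle.var(), "< CRLB =", crlb)
# theory: var = 2 (N-1) sigma^4 / N^2 = 2.88, below the CRLB of 3.2
```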

Thursday, July 14, 2016

Properties of a regular family of parameterized distributions

A family of parameterized distributions defined by
\[ \mathcal{P} = \{ p_\theta(y)  | \theta \in \Theta \subset \mathbb{R}^P \}\]
is regular if it satisfies the following conditions
  1. Support of \(p_\theta(y)\) does not depend on \(\theta\) for all \(\theta \in \Theta\)
  2. \(\frac{\partial}{\partial \theta} p_\theta(y) \) exists
  3. (Optional) \( \frac{\partial^2}{\partial \theta^2} p_\theta(y) \) exists
Note \[ \frac{\partial }{ \partial \theta } \ln p_\theta(y)  = \frac{1}{p_\theta(y) } \frac{\partial }{ \partial \theta } p_\theta(y) \quad \quad (4) \]
Define the score function (log := natural log)
\[  S_\theta (y) := \nabla_\theta \log p_\theta(y) \]
Note also
\[ E_\theta \{ 1 \} = 1 = \int_\mathcal{Y} p_\theta(y) dy \]
As a result of the above, we have (Kay's definition of regularity)
\begin{align*}  0 &= \frac{\partial }{ \partial \theta }E_\theta \{1\} \\
&= \frac{\partial }{ \partial \theta }\int p_\theta(y) dy \\
&\overset{1}{=} \int \frac{\partial }{ \partial \theta }  p_\theta(y) dy \\
&\overset{2,4}{=} \int p_\theta(y)  \frac{\partial }{ \partial \theta } \log p_\theta(y) dy \\
&= E_\theta \{ S_\theta(y) \}
\end{align*}
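
A quick Monte Carlo check of \(E_\theta \{ S_\theta(y) \} = 0\), as a minimal sketch assuming the Gaussian mean family \(y \sim \mathcal{N}(\theta, \sigma^2)\), for which the score is \(S_\theta(y) = (y-\theta)/\sigma^2\):
```python
import numpy as np

rng = np.random.default_rng(2)
theta, sigma = 1.5, 2.0
y = rng.normal(theta, sigma, size=1_000_000)

# Score of the N(theta, sigma^2) family with respect to theta:
#   S_theta(y) = d/dtheta log p_theta(y) = (y - theta) / sigma^2
score = (y - theta) / sigma**2

print(score.mean())   # ~ 0, matching E_theta{ S_theta(y) } = 0
```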

Friday, July 01, 2016

Spectral Theorem for Diagonalizable Matrices

It occurs to me that most presentations of the spectral theorem concern only orthonormal bases. The following is a more general result from Meyer.

Theorem
A matrix \( \mathbf{A} \in \mathbb{R}^{n\times n}\) with spectrum \(\sigma(\mathbf{A}) = \{ \lambda_1, \dotsc, \lambda_k \} \) is diagonalizable if and only if there exist matrices \(\{ \mathbf{G}_1, \dotsc, \mathbf{G}_k\} \) such that  \[ \mathbf{A} = \lambda_1 \mathbf{G}_1 + \dotsb + \lambda_k  \mathbf{G}_k \] where the \(\mathbf{G}_i\)'s have the following properties:
  • \(\mathbf{G}_i\) is the projector onto \(\mathcal{N} (\mathbf{A} - \lambda_i \mathbf{I})  \) along \(\mathcal{R} ( \mathbf{A} - \lambda_i \mathbf{I} ) \). 
  • \(\mathbf{G}_i\mathbf{G}_j = \mathbf{0} \) whenever \( i \neq j \)
  • \( \mathbf{G}_1 + \dotsb + \mathbf{G}_k = \mathbf{I}\)
The expansion is known as the spectral decomposition of \(\mathbf{A}\), and the \(\mathbf{G}_i\)'s are called the spectral projectors associated with \(\mathbf{A}\).

Note that, being a projector, each \(\mathbf{G}_i\) is idempotent:
  • \(\mathbf{G}_i = \mathbf{G}_i^2\)
And since \(\mathcal{N}(\mathbf{G}_i) = \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \) and \(\mathcal{R}(\mathbf{G}_i) = \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \), we have the following equivalent decompositions into complementary subspaces, each direct sum equal to \(\mathbb{R}^n\) (verified numerically in the sketch after this list):
  • \(  \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \oplus \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \)
  • \(  \mathcal{R}(\mathbf{G}_i) \oplus \mathcal{N}(\mathbf{A} - \lambda_i \mathbf{I} ) \)
  • \(  \mathcal{R}(\mathbf{A} - \lambda_i \mathbf{I} ) \oplus  \mathcal{N}(\mathbf{G}_i) \)
  • \(  \mathcal{R}(\mathbf{G}_i)  \oplus  \mathcal{N}(\mathbf{G}_i) \)
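
As a numerical check, here is a minimal sketch assuming distinct eigenvalues, so that each projector reduces to \(\mathbf{G}_i = \mathbf{v}_i \mathbf{w}_i^T\), with \(\mathbf{v}_i\) the right eigenvectors (columns of \(\mathbf{V}\)) and \(\mathbf{w}_i^T\) the rows of \(\mathbf{V}^{-1}\); the example matrix is my own choice.
```python
import numpy as np

# Diagonalizable but non-symmetric, with distinct eigenvalues 3 and 1
A = np.array([[3.0, 1.0],
              [0.0, 1.0]])

lam, V = np.linalg.eig(A)          # columns of V: right eigenvectors
W = np.linalg.inv(V)               # rows of W: the dual (left) basis

# Spectral projector for eigenvalue lam[i]: G_i = v_i w_i^T
G = [np.outer(V[:, i], W[i, :]) for i in range(len(lam))]

assert np.allclose(sum(l * Gi for l, Gi in zip(lam, G)), A)  # A = sum lam_i G_i
assert np.allclose(G[0] @ G[1], np.zeros((2, 2)))            # G_i G_j = 0, i != j
assert np.allclose(G[0] + G[1], np.eye(2))                   # sum G_i = I
assert all(np.allclose(Gi @ Gi, Gi) for Gi in G)             # idempotent
print("spectral decomposition verified")
```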