Clarification on LDA and the multivariate Gaussian

by James   Last Updated October 12, 2018 15:19 PM

From my understanding, to calculate the posterior probability of a sample $x$ belonging to a class $k$ using Linear Discriminant Analysis you would first calculate the eigenvector matrix $W$ required to transform any given sample into the $p$-dimensional discriminant space. Then, using $W$, you would project $x$ and the mean for the class $k$ training data ($\mu_k$) into the discriminant space to get $x'$ and $\mu_k'$. You would then use the results of these projections to calculate $P(x|k)$ via the multivariate Gaussian: $$P(x|k)=\frac{1}{\sqrt{(2\pi)^p|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}({x'}-{\mu_k'})^T{\boldsymbol\Sigma}^{-1}({x'}-{\mu_k'}) \right)$$
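To make the density step concrete, here is a minimal NumPy sketch of how I picture the evaluation in the discriminant space. The projection matrix `W`, the sample `x`, the class mean `mu_k`, and the identity `sigma` are made-up placeholder values, not the output of a fitted LDA, and the function name `gaussian_density` is only illustrative.

```python
import numpy as np

def gaussian_density(x_proj, mu_proj, sigma):
    """Multivariate Gaussian density at x_proj with mean mu_proj and covariance sigma."""
    p = x_proj.shape[0]
    diff = x_proj - mu_proj
    # Squared Mahalanobis distance; solve a linear system rather than inverting sigma.
    maha_sq = diff @ np.linalg.solve(sigma, diff)
    norm = np.sqrt((2.0 * np.pi) ** p * np.linalg.det(sigma))
    return np.exp(-0.5 * maha_sq) / norm

# Placeholder projection from 3 features to p = 2 discriminant axes (assumed values).
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, -0.5]])
x = np.array([1.0, 2.0, 0.0])      # sample in the original feature space
mu_k = np.array([0.0, 1.0, 2.0])   # class-k mean in the original feature space

x_proj = W.T @ x                   # x'
mu_proj = W.T @ mu_k               # mu_k'
sigma = np.eye(2)                  # covariance in the discriminant space (assumed)

density = gaussian_density(x_proj, mu_proj, sigma)
print(density)
```

With the identity covariance used here, the exponent reduces to the squared Euclidean distance between `x_proj` and `mu_proj`.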

Next, you would use Bayes' theorem to compute the unnormalized posterior $\widehat{P(k|x)} = P(k)P(x|k)$, with the prior $P(k) = N_k / N$, where $N_k$ is the number of training samples in class $k$ and $N$ is the total number of training samples. Finally, you would normalize over all classes $i$ to get an actual estimate of $P(k|x)$: $$P(k|x) = \frac{\widehat{P(k|x)}}{\sum_{i} \widehat{P(i|x)}}$$
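The Bayes step can be sketched in plain Python as follows. The likelihood values and class counts are invented for illustration, and the helper name `posteriors` is hypothetical. Working with log-likelihoods is my own choice for numerical stability, not something the formulas above require.

```python
import math

def posteriors(log_likelihoods, class_counts):
    """Return P(k|x) for each class from log P(x|k) and the class sizes N_k."""
    n_total = sum(class_counts)
    # log of the unnormalized posterior: log P(k) + log P(x|k), with P(k) = N_k / N
    log_unnorm = [math.log(n_k / n_total) + ll
                  for ll, n_k in zip(log_likelihoods, class_counts)]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_unnorm)
    weights = [math.exp(v - m) for v in log_unnorm]
    total = sum(weights)
    return [w / total for w in weights]

# Two classes with assumed likelihoods P(x|k) = 0.2 and 0.1 and sizes N_k = 30 and 70.
post = posteriors([math.log(0.2), math.log(0.1)], [30, 70])
print(post)
```

The shift by the maximum cancels in the ratio, so the result is the same as dividing each $P(k)P(x|k)$ by their sum over classes.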

In the answer to "Linear discriminant analysis and Bayes rule" it is noted that if the pooled within-class covariance matrix is used for $\boldsymbol\Sigma$, then $|\boldsymbol\Sigma| = 1$ and the squared Mahalanobis distance $d = ({x'}-{\mu_k'})^T{\boldsymbol\Sigma}^{-1}({x'}-{\mu_k'})$ simplifies to the squared Euclidean distance $d = ({x'}-{\mu_k'})^T({x'}-{\mu_k'})$. I'm a little confused as to how this simplification works and have been unable to find any good references that demonstrate it. Also, would that mean the $\boldsymbol\Sigma$ here is the same $\boldsymbol{S_w}$ used to calculate the discriminant space, or is it something else altogether? Any help would be greatly appreciated.
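For what it's worth, the claim can at least be probed numerically. The sketch below is only a sanity check under stated assumptions: the data are synthetic, $\boldsymbol{S_w}$ is taken to be the pooled within-class covariance (scatter divided by $N$ minus the number of classes), and the columns of $W$ are assumed to be scaled so that $W^T \boldsymbol{S_w} W = I$. That scaling is a convention I am assuming here, and with a different normalisation of the eigenvectors the projected covariance would not be the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 3 classes, 4 features, one shared covariance (illustrative only).
n_per_class, d = 200, 4
A = rng.normal(size=(d, d))
means = rng.normal(scale=3.0, size=(3, d))
X = np.vstack([rng.normal(size=(n_per_class, d)) @ A.T + m for m in means])
y = np.repeat(np.arange(3), n_per_class)

classes = np.unique(y)
grand_mean = X.mean(axis=0)
class_means = np.array([X[y == k].mean(axis=0) for k in classes])

# Pooled within-class covariance S_w and between-class scatter S_b.
S_w = np.zeros((d, d))
S_b = np.zeros((d, d))
for k, mu_k in zip(classes, class_means):
    Xk = X[y == k] - mu_k
    S_w += Xk.T @ Xk
    diff = (mu_k - grand_mean)[:, None]
    S_b += Xk.shape[0] * (diff @ diff.T)
S_w /= X.shape[0] - len(classes)

# Whiten with the Cholesky factor of S_w, then diagonalise S_b in that basis.
# This solves S_b w = lambda S_w w with the scaling W^T S_w W = I.
L = np.linalg.cholesky(S_w)
L_inv = np.linalg.inv(L)
eigvals, V = np.linalg.eigh(L_inv @ S_b @ L_inv.T)
p = len(classes) - 1
top = np.argsort(eigvals)[::-1][:p]
W = L_inv.T @ V[:, top]

# Pooled within-class covariance in the discriminant space.
Sigma_proj = W.T @ S_w @ W

# Compare the two distances for one sample and its class mean.
delta = W.T @ X[0] - W.T @ class_means[y[0]]
maha_sq = delta @ np.linalg.solve(Sigma_proj, delta)
eucl_sq = delta @ delta

print(np.round(Sigma_proj, 6))
print(np.linalg.det(Sigma_proj), maha_sq, eucl_sq)
```

Under that scaling, `Sigma_proj` comes out as the identity up to rounding, so its determinant is 1 and the two squared distances coincide.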
