Why are Gaussian "discriminant" analysis models called so?

by highBandWidth   Last Updated October 12, 2018 15:19 PM

Gaussian discriminant analysis models learn $P(x|y)$ and then apply Bayes' rule to evaluate $$P(y|x) = \frac{P(x|y)P_{prior}(y)}{\sum_{g \in Y} P(x|g) P_{prior}(g)}.$$ Hence, they are generative models. Why, then, are they called discriminant analysis? If it is because we ultimately derive a discriminant curve between the classes, then that is true of all generative models.



Answers 2


If you mean LDA, I would say the name, linear discriminant analysis, is historical: it dates back at least to Fisher's 1936 paper, which, to the best of my knowledge, precedes the current machine learning terminology and its distinction between discriminative and generative models. Fisher did not call it linear discriminant analysis directly, but he did explicitly seek a linear function for discrimination. As a curious aside, Fisher's paper applies discrimination to the famous Iris data set.

Incidentally, Fisher did not present the linear method for discrimination in terms of a generative model. He sought a linear combination (for two classes) that maximizes the ratio of the between-group variance to the within-group variance, which does not require a normality assumption. Details, and how this relates to LDA as a Bayes rule for a generative model, can be found in Chapter 3 of Brian Ripley's book "Pattern Recognition and Neural Networks".
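To make the criterion concrete, here is a minimal NumPy sketch of the two-class case. It relies on the standard result that the maximizing direction is proportional to $S_W^{-1}(m_1 - m_0)$, where $S_W$ is the pooled within-group scatter matrix. The function names and the synthetic data are my own assumptions for illustration; they do not come from Fisher's paper or Ripley's book.

```python
import numpy as np

def within_scatter(X0, X1):
    """Pooled within-group scatter: sum of the two per-class scatter matrices."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    return (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

def fisher_ratio(w, X0, X1):
    """Between-group over within-group variance of the projection x -> w @ x."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    return (w @ (m1 - m0)) ** 2 / (w @ within_scatter(X0, X1) @ w)

def fisher_direction(X0, X1):
    """Unit vector maximizing fisher_ratio; no normality assumption is used."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # The maximizer is proportional to S_W^{-1} (m1 - m0).
    w = np.linalg.solve(within_scatter(X0, X1), m1 - m0)
    return w / np.linalg.norm(w)

# Hypothetical data: two correlated 2-D groups whose means differ along the x-axis.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
X0 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X1 = rng.multivariate_normal([2.0, 0.0], cov, size=200)

w = fisher_direction(X0, X1)
print("Fisher direction:", w)
print("ratio along w:", fisher_ratio(w, X0, X1))
print("ratio along mean difference:", fisher_ratio(np.array([1.0, 0.0]), X0, X1))
```

Because the groups are correlated, the direction found this way generally differs from the plain difference of the means, and its ratio is at least as large.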

NRH
September 01, 2011 19:28 PM

It is simple. With two classes $(Y=0, Y=1)$, GDA makes the following assumptions:

  1. $X \mid Y=0 \sim \mathcal{N}(\mu_0,\Sigma_0)$
  2. $X \mid Y=1 \sim \mathcal{N}(\mu_1,\Sigma_1)$
  3. $P(Y=1)=1-P(Y=0)=\Phi$

It then estimates the parameters $(\mu_0,\Sigma_0,\mu_1,\Sigma_1,\Phi)$ by maximum likelihood.
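As a rough sketch of what that estimation looks like, the maximum likelihood estimates have closed forms: the class-1 fraction for $\Phi$, the per-class sample means, and the per-class covariances divided by $n_k$ rather than $n_k - 1$. The function names and the synthetic data below are assumptions made for illustration only.

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form maximum likelihood estimates for two-class GDA."""
    phi = float(np.mean(y == 1))          # estimate of P(Y=1)
    mus, Sigmas = [], []
    for k in (0, 1):
        Xk = X[y == k]
        mu = Xk.mean(axis=0)
        # The MLE of the covariance divides by n_k, not n_k - 1.
        Sigmas.append((Xk - mu).T @ (Xk - mu) / len(Xk))
        mus.append(mu)
    return phi, mus, Sigmas

def log_gaussian(x, mu, Sigma):
    """Log density of N(mu, Sigma) evaluated at a single point x."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    quad = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (len(mu) * np.log(2 * np.pi) + logdet + quad)

def posterior_y1(x, phi, mus, Sigmas):
    """P(Y=1 | x) by Bayes' rule, computed from log joint densities."""
    log_joint0 = log_gaussian(x, mus[0], Sigmas[0]) + np.log(1 - phi)
    log_joint1 = log_gaussian(x, mus[1], Sigmas[1]) + np.log(phi)
    return 1.0 / (1.0 + np.exp(log_joint0 - log_joint1))

# Hypothetical data: 300 points from class 0 and 100 points from class 1.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], np.eye(2), size=300),
    rng.multivariate_normal([3.0, 3.0], np.eye(2), size=100),
])
y = np.concatenate([np.zeros(300, dtype=int), np.ones(100, dtype=int)])

phi, mus, Sigmas = fit_gda(X, y)
print("phi:", phi)
print("P(Y=1 | x=[3, 3]):", posterior_y1(np.array([3.0, 3.0]), phi, mus, Sigmas))
```

The model is fitted generatively, and classification only happens afterwards, when the fitted densities are plugged into Bayes' rule.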

So it is Gaussian because it assumes a Gaussian distribution within each group (you could use a uniform distribution instead, for example), and discriminant because it aims to separate the data into groups.
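To see where the separating rule comes from, it may help to write out the log posterior odds implied by the three assumptions above. This is a standard derivation, sketched here in the same notation: $$\log\frac{P(Y=1|x)}{P(Y=0|x)} = \log\frac{\Phi}{1-\Phi} - \frac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|} - \frac{1}{2}(x-\mu_1)^T\Sigma_1^{-1}(x-\mu_1) + \frac{1}{2}(x-\mu_0)^T\Sigma_0^{-1}(x-\mu_0).$$ Predicting $Y=1$ whenever this quantity is positive gives a decision boundary that is quadratic in $x$. Under the additional assumption $\Sigma_0=\Sigma_1=\Sigma$, the quadratic terms cancel and the expression reduces to a linear function of $x$: $$\log\frac{P(Y=1|x)}{P(Y=0|x)} = (\mu_1-\mu_0)^T\Sigma^{-1}x - \frac{1}{2}\left(\mu_1^T\Sigma^{-1}\mu_1 - \mu_0^T\Sigma^{-1}\mu_0\right) + \log\frac{\Phi}{1-\Phi}.$$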

You can find more info here.

dfhgfh
January 07, 2013 17:17 PM
