# Sufficient Statistic for non-exponential family distribution

by dimebucker91   Last Updated April 18, 2018 10:19 AM

Question: Let $X_1, X_2, \ldots, X_n$ be an i.i.d. sample from $N(\theta, 4\theta^2)$. I want to show that this model is not a member of the exponential family and to find a sufficient statistic for $\theta$.

Attempt: \begin{align*} f(\underline{x};\theta) &= \prod_{i=1}^n \frac{1}{\sqrt{8 \pi \theta^2}}\exp\left(-\frac{(x_i - \theta)^2}{8 \theta^2}\right)\\ &=\exp \left(-\frac{n}{2}\ln(8\pi \theta^2)- \frac{1}{8 \theta^2}\sum_{i=1}^n x_i^2 + \frac{1}{4 \theta} \sum_{i=1}^n x_i - \frac{n}{8}\right) \end{align*}
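As a sanity check on the algebra, the expanded exponent can be compared numerically with the sum of the individual log-densities. This is only a minimal sketch: the function names, the sample values, and the grid of $\theta$ values are made up for illustration.

```python
import math

def log_density_direct(xs, theta):
    """Sum of log N(theta, 4*theta^2) densities, one term per observation."""
    var = 4.0 * theta ** 2
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x - theta) ** 2 / (2.0 * var)
        for x in xs
    )

def log_density_expanded(xs, theta):
    """Exponent of the expanded form: depends on xs only via two sums."""
    n = len(xs)
    sum_sq = sum(x * x for x in xs)   # sum of x_i^2
    sum_x = sum(xs)                   # sum of x_i
    return (
        -0.5 * n * math.log(8.0 * math.pi * theta ** 2)
        - sum_sq / (8.0 * theta ** 2)
        + sum_x / (4.0 * theta)
        - n / 8.0
    )

# Arbitrary illustrative data and parameter values (theta must be nonzero).
sample = [0.3, -1.2, 2.5, 4.1]
for theta in (-2.0, 0.5, 3.0):
    direct = log_density_direct(sample, theta)
    expanded = log_density_expanded(sample, theta)
    print(theta, direct, expanded)
```

The two columns should agree up to floating-point rounding for every nonzero $\theta$, including negative values, since $\theta$ enters the variance only through $\theta^2$.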

So clearly this is not a member of the exponential family, as it has the form of a two-dimensional exponential family but we only have one parameter.

I am struggling to find a sufficient statistic, however. Can I have a two-dimensional statistic if I am estimating one parameter?

Update

So after doing a similar question I am fairly certain that a sufficient statistic is given by $S=(S_1,S_2) =\left(\sum_{i=1}^n x_i^2,\sum_{i=1}^n x_i\right)$. So I guess my question just boils down to this: how can we have a two-dimensional statistic to estimate one parameter? It seems counterintuitive.
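One way to see what sufficiency of $S$ means in practice: two different samples with the same value of $S$ should give the same likelihood for every $\theta$. The sketch below is illustrative only; the helper names and the two hand-picked samples are assumptions, not part of the original question.

```python
import math

def log_likelihood(xs, theta):
    """Log-likelihood of an i.i.d. sample under N(theta, 4*theta^2)."""
    var = 4.0 * theta ** 2
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x - theta) ** 2 / (2.0 * var)
        for x in xs
    )

def sufficient_statistic(xs):
    """Return (sum of x_i^2, sum of x_i), the candidate statistic S."""
    return (sum(x * x for x in xs), sum(xs))

# Two different samples chosen by hand so that both sums coincide.
sample_a = [0.0, 3.0, 3.0]
sample_b = [1.0, 1.0, 4.0]

print(sufficient_statistic(sample_a), sufficient_statistic(sample_b))
for theta in (-1.5, 0.7, 2.0, 5.0):
    # The two log-likelihoods should match at every nonzero theta.
    print(theta, log_likelihood(sample_a, theta), log_likelihood(sample_b, theta))
```

Neither sum alone is enough here: the likelihood involves $\sum x_i^2$ and $\sum x_i$ with different functions of $\theta$ as coefficients, so both are needed even though $\theta$ is a scalar.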

Also, I've learned that this is a member of the curved exponential family, a further generalization of the exponential family.


First, this is an exponential family, since the density can be written as $$\exp\{\Phi_1(\theta) S_1({\mathbf x})+\Phi_2(\theta) S_2({\mathbf x})-\Psi(\theta)\}$$ against a particular dominating measure. That the two coefficients $\Phi_1(\theta)$ and $\Phi_2(\theta)$ are connected by a functional relation is not an issue: they also both depend deterministically on $\theta$. The fact that $\theta$ is one-dimensional while the family is two-dimensional makes this a curved exponential family (see Brown, 1986). The family can be extended to a truly two-dimensional parameter space, of which the ${\cal N}(\theta, 4\theta^2)$ model is a special case, corresponding to the curve $\Phi_1=-2\Phi_2^2$ in the extended parameter space (taking $S_1=\sum_i x_i^2$ and $S_2=\sum_i x_i$ as in the question).
Another reason for this distribution to be from an exponential family is that there exists a sufficient statistic of dimension two, whatever the sample size $n$. By the Darmois-Pitman-Koopman lemma, for families whose support does not depend on the parameter, this can only occur in an exponential family.
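For concreteness, the general form in this answer can be matched term by term against the expansion in the question. The labelling below assumes $S_1$ is the sum of squares and $S_2$ the sum, as in the question's update; the answer itself does not fix that order.

\begin{align*}
S_1(\mathbf x) &= \sum_{i=1}^n x_i^2, & \Phi_1(\theta) &= -\frac{1}{8\theta^2},\\
S_2(\mathbf x) &= \sum_{i=1}^n x_i, & \Phi_2(\theta) &= \frac{1}{4\theta},\\
& & \Psi(\theta) &= \frac{n}{2}\ln(8\pi\theta^2) + \frac{n}{8}.
\end{align*}

Since $\Phi_2(\theta)^2 = 1/(16\theta^2)$, the pair $(\Phi_1, \Phi_2)$ satisfies $\Phi_1 = -2\Phi_2^2$ for every $\theta \neq 0$, so as $\theta$ varies the natural parameter traces a one-dimensional curve inside the two-dimensional natural parameter space of the full normal family.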