Sufficient Statistic for non-exponential family distribution

by dimebucker91   Last Updated April 18, 2018 10:19 AM

Question: Let $X_1, X_2, \ldots, X_n$ be an iid sample from $N(\theta , 4 \theta^2 )$. I want to show that this model is not a member of the exponential family and to find a sufficient statistic for $\theta$.

Attempt: \begin{align*} f(\underline{x};\theta) &= \prod_{i=1}^n \frac{1}{\sqrt{8 \pi \theta^2}}\exp\left(-\frac{(x_i - \theta)^2}{8 \theta^2}\right)\\ &=\exp \left(-\frac{n}{2}\ln(8\pi \theta^2)- \frac{1}{8 \theta^2}\sum_{i=1}^n x_i^2 + \frac{1}{4 \theta} \sum_{i=1}^n x_i - \frac{n}{8}\right) \end{align*}

So clearly this is not a member of the exponential family, since it has the form of a two-dimensional exponential family but we only have one parameter.

I am struggling to find a sufficient statistic, however. Can I have a two-dimensional statistic if I am estimating one parameter?

Update

So after doing a similar question I am fairly certain that a sufficient statistic is given by $S=(S_1,S_2) =(\sum_{i=1}^n x_i^2,\sum_{i=1}^n x_i)$. So I guess my question boils down to this: how can we have a two-dimensional statistic to estimate one parameter? It seems counterintuitive.

Also, I've learned that this is a member of the curved exponential family, a further generalization of the exponential family.



Answers 1


First, this is an exponential family, since the density can be written as $$\exp\{\Phi_1(\theta) S_1({\mathbf x})+\Phi_2(\theta) S_2({\mathbf x})-\Psi(\theta)\}$$ against a particular dominating measure. That the two coefficients $\Phi_1(\theta)$ and $\Phi_2(\theta)$ are connected by a functional relation is not an issue: they also both depend deterministically on $\theta$. The fact that $\theta$ is one-dimensional while the family is two-dimensional makes this a curved exponential family (see Brown, 1986). The family can be extended to a truly two-dimensional parameter space, of which the ${\cal N}(\theta, 4\theta^2)$ model is a special case, namely a curve in the extended parameter space (with $S_1=\sum_i x_i^2$ and $S_2=\sum_i x_i$ as in the question, the curve $\Phi_1=-2\Phi_2^2$).
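To make the identification concrete, here is a sketch that reads the coefficients off the question's expansion, assuming the labelling $S_1({\mathbf x})=\sum_{i=1}^n x_i^2$ and $S_2({\mathbf x})=\sum_{i=1}^n x_i$ (the two roles could equally be swapped):

\begin{align*} \Phi_1(\theta) &= -\frac{1}{8\theta^2}, & \Phi_2(\theta) &= \frac{1}{4\theta}, & \Psi(\theta) &= \frac{n}{2}\ln(8\pi\theta^2)+\frac{n}{8}. \end{align*}

For the full ${\cal N}(\mu,\sigma^2)$ family the corresponding coefficients are $-1/(2\sigma^2)$ and $\mu/\sigma^2$, which range over a two-dimensional set. Setting $\mu=\theta$ and $\sigma^2=4\theta^2$ restricts them to the parabola $\Phi_1=-2\Phi_2^2$ (with $\Phi_2\neq 0$), which is the "curve" in the curved exponential family.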

Another reason for this distribution to be from an exponential family is that there exists a sufficient statistic of dimension two, whatever the sample size $n$. By the Darmois-Pitman-Koopman lemma, this can only occur in an exponential family (among families whose support does not depend on the parameter).
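As a quick numerical sanity check of the sufficiency claim, the following sketch compares the log-likelihood computed term by term with the one computed from $(\sum_i x_i^2, \sum_i x_i, n)$ alone. The function names, the seed, and the simulated sample are illustrative assumptions, not part of the original question.

```python
import math
import random


def loglik_direct(xs, theta):
    """Log-likelihood of an iid N(theta, 4*theta^2) sample, summed term by term."""
    var = 4.0 * theta ** 2
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x - theta) ** 2 / (2.0 * var)
        for x in xs
    )


def loglik_from_stats(s1, s2, n, theta):
    """Same log-likelihood, using only s1 = sum(x_i^2), s2 = sum(x_i) and n."""
    return (
        -0.5 * n * math.log(8.0 * math.pi * theta ** 2)
        - s1 / (8.0 * theta ** 2)
        + s2 / (4.0 * theta)
        - n / 8.0
    )


if __name__ == "__main__":
    random.seed(0)  # arbitrary seed, for reproducibility only
    theta_true = 1.5  # hypothetical true parameter
    # The standard deviation of N(theta, 4*theta^2) is 2*|theta|.
    xs = [random.gauss(theta_true, 2.0 * abs(theta_true)) for _ in range(50)]
    s1 = sum(x * x for x in xs)
    s2 = sum(xs)

    # The two columns should agree up to floating-point error for every theta != 0.
    for theta in (-2.0, 0.5, 1.5, 3.0):
        print(theta, loglik_direct(xs, theta), loglik_from_stats(s1, s2, len(xs), theta))
```

Since the second function never sees the individual observations, any two samples of the same size sharing these two sums have identical likelihood functions, which is what sufficiency of the pair means here.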

For the same reason as before, there can be a sufficient statistic of dimension two and a parameter of dimension one and this is not a contradiction, as the same sufficient statistic of dimension two serves for the extended exponential family with two parameters. Examples (or paradoxes) where this happens abound in the literature. See for instance Romano and Siegel (1987). As pointed out by Kjetil B Halvorsen, these "paradoxes" are generally connected with a lack of completeness.
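The lack of completeness can be seen directly in this model. A sketch, again assuming $S_1=\sum_{i=1}^n x_i^2$ and $S_2=\sum_{i=1}^n x_i$ and using only the first two moments of the normal distribution:

\begin{align*} E_\theta[S_1] &= n\,(4\theta^2+\theta^2) = 5n\theta^2, \\ E_\theta[S_2^2] &= \operatorname{Var}_\theta(S_2) + \left(E_\theta[S_2]\right)^2 = 4n\theta^2 + n^2\theta^2 = n(n+4)\theta^2. \end{align*}

Hence $g(S)=(n+4)S_1-5S_2^2$ satisfies $E_\theta[g(S)]=0$ for every $\theta$, although for $n\ge 2$ it is not almost surely zero. So $S=(S_1,S_2)$ is sufficient but not complete.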

Xi'an
April 18, 2018 09:39 AM
