Computing the error function defined over a transformed dataset

by Gerardo Durán Martín   Last Updated August 14, 2019, 17:19

I am currently reading chapter 5 from Christopher Bishop's Pattern Recognition and Machine Learning book.

Given an infinite dataset $\{({\bf x}, t)\}$ and an estimated value $y({\bf x})$, I am attempting to compute the squared error function over an expanded dataset driven by a transformation of the input vector ${\bf x}$ given by ${\bf s}({\bf x}, \xi)$, defined in such a way that ${\bf s}({\bf x}, 0) = {\bf x}$. The form this error function takes is

$$ \tilde E = \frac{1}{2}\int\int\int \left[y({\bf s}({\bf x}, \xi)) - t\right]^2 p({\bf x})p(t|{\bf x})p(\xi) d{\bf x} \ dt \ d\xi $$

where we assume that $\mathbb{E}[\xi] = 0$ and $\mathbb{V}[\xi]$ is small.
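Throughout, $\tau = \left.\partial {\bf s}/\partial \xi\right|_{\xi=0}$ and $\tau' = \left.\partial^2 {\bf s}/\partial \xi^2\right|_{\xi=0}$, following Bishop's notation. As a concrete illustration (my own example, not one taken from the book), a translation along a fixed unit vector ${\bf n}$,

$$ {\bf s}({\bf x}, \xi) = {\bf x} + \xi {\bf n}, \qquad \tau = {\bf n}, \qquad \tau' = {\bf 0}, $$

satisfies ${\bf s}({\bf x}, 0) = {\bf x}$, and for it all the $\tau'$ terms appearing below vanish.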

According to Bishop, the transformed error function can be expanded as

$$ \begin{align} \tilde E &= \frac{1}{2}\int\int[y({\bf x}) - t]^2 p({\bf x})p(t|{\bf x})\, d{\bf x} \ dt \\ &+ \frac{1}{2}\mathbb{E}[\xi^2]\int\int \left[[y({\bf x}) - t] \left((\tau')^T\nabla_{\bf x} y + \tau^T \nabla^2_{\bf x}y\, \tau\right) + (\tau^T\nabla_{\bf x} y)^2\right] p({\bf x})p(t|{\bf x})\, d{\bf x} \ dt \end{align} $$

where we use the second-order Taylor expansion of $y({\bf s}({\bf x}, \xi))$ in $\xi$, given by $$ y({\bf s}({\bf x}, \xi)) = y({\bf x}) + \xi \tau^T\nabla_x y + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2 y\,\tau\right) + O(\xi^3). $$
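As a quick sanity check of this expansion (a minimal one-dimensional sketch; the particular $s$ and $y$ below are arbitrary choices of mine, and any smooth functions with $s(x, 0) = x$ should work), the following sympy snippet verifies that the series of $y(s(x, \xi))$ in $\xi$ agrees with the right-hand side up to $O(\xi^3)$:

```python
import sympy as sp

x, xi = sp.symbols('x xi', real=True)

# Hypothetical 1-D example: a smooth transformation with s(x, 0) = x
s = x + sp.sin(xi) * x**2
# An arbitrary smooth regression function y(x)
y = sp.exp(-x**2)

# Left-hand side: Taylor expansion of y(s(x, xi)) in xi, up to and including xi^2
lhs = sp.series(y.subs(x, s), xi, 0, 3).removeO()

# Right-hand side: Bishop's expression, with tau = ds/dxi|_0 and tau' = d^2 s/dxi^2|_0
tau = sp.diff(s, xi).subs(xi, 0)
taup = sp.diff(s, xi, 2).subs(xi, 0)
rhs = y + xi * tau * sp.diff(y, x) \
      + xi**2 / 2 * (taup * sp.diff(y, x) + tau**2 * sp.diff(y, x, 2))

print(sp.simplify(sp.expand(lhs - rhs)))  # prints 0
```

For this particular choice, $\tau = x^2$ and $\tau' = 0$, and the difference should come out to zero for other smooth choices of $s$ as well.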

My attempt

With the second-order Taylor expansion of $y({\bf s}({\bf x}, \xi))$, we can rewrite the integrand as

$$ \begin{align} & \left([y({\bf x}) - t] + \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3) \right)^2 \\ & = [y({\bf x}) - t]^2 \\ &+ 2[y({\bf x}) - t]\left( \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3)\right) \\ &+ \left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3)\right)^2 \end{align} $$

Here we see that the first term in the sum, integrated with respect to ${\bf x}$, $t$ and $\xi$, is the first term of $\tilde E$; the second term decomposes into $0$ (because of the constraint $\mathbb{E}[\xi] = 0$) plus the $[y({\bf x}) - t]$ part of the second term of $\tilde E$, plus $O(\xi^3)$; finally, we are left with the third term, which is where I am having some trouble.
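For completeness, spelling out that middle step: integrating the cross term against $p(\xi)$ and using $\mathbb{E}[\xi] = 0$ gives

$$ \int 2[y({\bf x}) - t]\left( \xi\tau^T\nabla_x y + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2 y\,\tau\right)\right) p(\xi)\, d\xi = \mathbb{E}[\xi^2]\,[y({\bf x}) - t]\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2 y\,\tau\right) $$

up to $O(\xi^3)$, which, after the overall factor of $\tfrac{1}{2}$ and the integration over ${\bf x}$ and $t$, is exactly the $[y({\bf x}) - t]$ part of the second term of $\tilde E$.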

Expanding the third term, we get $$ \begin{align} &\left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3)\right)^2\\ &= \xi^2\left(\tau^T\nabla_xy\right)^2 \\ &+ 2 \left(\xi\tau^T\nabla_xy\right) \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3)\right) \\ &+ \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\,\tau\right) + O(\xi^3)\right)^2 \end{align} $$

Here it is clear that the first term of this last expansion corresponds to the final $(\tau^T\nabla_x y)^2$ term of $\tilde E$, but we are left with the second and third terms, which contain $\mathbb{E}[\xi^3]$ and $\mathbb{E}[\xi^4]$ respectively. If the expansion of $\tilde E$ is correct, I would expect these two terms to vanish, but it is not clear to me how to show this.


