by Gerardo Durán Martín
Last Updated August 14, 2019 17:19 PM

I am currently reading chapter 5 from Christopher Bishop's Pattern Recognition and Machine Learning book.

Given an infinite dataset $\{({\bf x}, t)\}$ and an estimated value $y({\bf x})$, I am attempting to compute the squared error function over an expanded dataset obtained by transforming the input vector ${\bf x}$ with ${\bf s}({\bf x}, \xi)$, defined in such a way that ${\bf s}({\bf x}, 0) = {\bf x}$. This error function takes the form

$$ \tilde E = \frac{1}{2}\int\int\int \left[y({\bf s}({\bf x}, \xi)) - t\right]^2 p({\bf x})p(t|{\bf x})p(\xi) d{\bf x} \ dt \ d\xi $$

where we assume that $\mathbb{E}[\xi] = 0$ and $\mathbb{V}[\xi]$ is small.

According to Bishop, the transformed error function can be expanded as

$$ \begin{align} \tilde E &= \frac{1}{2}\int\int[y({\bf x}) - t]^2 p({\bf x})p(t|{\bf x}) d{\bf x} \ dt \\ &+ \frac{1}{2}\mathbb{E}[\xi^2]\int\int \left[[y({\bf x}) - t] \left[({\tau'})^T\nabla_{\bf x} y + \tau^T \nabla^2_{\bf x}y \tau\right] + (\tau^T\nabla_{\bf x} y)^2\right] p({\bf x})p(t|{\bf x}) d{\bf x} \ dt \end{align} $$

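Here we use the second-order Taylor expansion of $y({\bf s}({\bf x}, \xi))$ in $\xi$ around $\xi = 0$, given by $$ y({\bf s}({\bf x}, \xi)) = y({\bf x}) + \xi \tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_xy + \tau^T\nabla_x^2y\tau\right) + O(\xi^3) $$ where $\tau$ and $\tau'$ denote the first and second derivatives of ${\bf s}({\bf x}, \xi)$ with respect to $\xi$, evaluated at $\xi = 0$.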
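As a sanity check on this Taylor expansion, here is a minimal SymPy sketch in one dimension. The specific transformation `s` and model `y` are illustrative assumptions (they are not from Bishop), and the scalars `tau` and `tau_p` stand in for $\tau$ and $\tau'$, taken as the first and second derivatives of $s$ with respect to $\xi$ at $\xi = 0$.

```python
import sympy as sp

x, xi = sp.symbols("x xi", real=True)

# Hypothetical 1-D transformation with s(x, 0) = x (an assumption for illustration).
s = x + xi * sp.sin(x) + xi**2 * sp.cos(x) / 2

# Hypothetical smooth model output y(u) (also an assumption).
def y(u):
    return u**3 + sp.exp(u)

# tau and tau' are the first and second derivatives of s w.r.t. xi at xi = 0.
tau = sp.diff(s, xi).subs(xi, 0)
tau_p = sp.diff(s, xi, 2).subs(xi, 0)

dy = sp.diff(y(x), x)      # gradient of y (scalar case)
d2y = sp.diff(y(x), x, 2)  # Hessian of y (scalar case)

# Coefficients predicted by the expansion: xi * first_pred + xi^2 / 2 * second_pred.
first_pred = tau * dy
second_pred = tau_p * dy + tau * d2y * tau

# Coefficients obtained by differentiating y(s(x, xi)) directly.
first_true = sp.diff(y(s), xi).subs(xi, 0)
second_true = sp.diff(y(s), xi, 2).subs(xi, 0)

first_gap = sp.simplify(first_true - first_pred)
second_gap = sp.simplify(second_true - second_pred)
print(first_gap, second_gap)
```

If the expansion holds for this toy case, both gaps should simplify to zero. This only checks one scalar example, so it is a consistency check rather than a proof.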

**My attempt**

With the second-order Taylor expansion of $y({\bf s}({\bf x}, \xi))$, we can rewrite the squared factor of the integrand as

$$ \begin{align} & \left([y({\bf x}) - t] + \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3) \right)^2 \\ & = [y({\bf x}) - t]^2 \\ &+ 2(y({\bf x}) - t)\left( \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right) \\ &+ \left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2 \end{align} $$

Here the first term of the sum, integrated w.r.t. ${\bf x}$, $t$ and $\xi$ (with the factor $\frac{1}{2}$), is the first term in $\tilde E$. The second term of the sum splits into a part linear in $\xi$, which integrates to 0 because of the constraint $\mathbb{E}[\xi] = 0$, a part in $\xi^2$, which gives the $[y({\bf x}) - t]$ contribution to the second term of $\tilde E$, and a remainder of $O(\xi^3)$. Finally, we are left with the third term, which is where I am having some trouble.

Expanding the third term of the sum, we get $$ \begin{align} &\left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2\\ &= \xi^2\left(\tau^T\nabla_xy\right)^2 \\ &+ 2 \left(\xi\tau^T\nabla_xy\right) \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right) \\ &+ \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2 \end{align} $$

It is clear that the first term in this last expansion corresponds to the final term of $\tilde E$, but we are left with the second and third terms, which contain $\mathbb{E}[\xi^3]$ and $\mathbb{E}[\xi^4]$ respectively. If the expansion of $\tilde E$ is correct, I would expect these two terms to be zero, but it is not clear to me how to show this.
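To keep track of which power of $\xi$ each piece carries, the squared integrand can be expanded with scalar placeholders in SymPy. This is only a bookkeeping sketch: the symbols `a`, `b` and `c` are shorthand introduced here (not Bishop's notation) for $y({\bf x}) - t$, $\tau^T\nabla_x y$ and $(\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau$, and the $O(\xi^3)$ remainder of the Taylor series is dropped, so the $\xi^3$ and $\xi^4$ coefficients shown are only part of the full higher-order terms.

```python
import sympy as sp

xi = sp.symbols("xi", real=True)

# Scalar placeholders (shorthand assumed for this sketch):
#   a = y(x) - t
#   b = tau^T grad y
#   c = (tau')^T grad y + tau^T Hess(y) tau
a, b, c = sp.symbols("a b c", real=True)

# Second-order Taylor model of y(s(x, xi)) - t, with the O(xi^3) remainder dropped.
residual = a + xi * b + xi**2 / 2 * c

# Integrand of the error function before taking expectations, including the 1/2 factor.
integrand = sp.expand(sp.Rational(1, 2) * residual**2)

# Collect the coefficient of each power of xi from 0 to 4.
coeffs = {k: integrand.coeff(xi, k) for k in range(5)}
for k, v in coeffs.items():
    print(f"xi^{k}: {v}")
```

In this sketch the $\xi^2$ coefficient comes out as $\frac{1}{2}(ac + b^2)$, which is the combination that multiplies $\mathbb{E}[\xi^2]$ once the expectation over $\xi$ is taken, while the leftover pieces from the third term appear only at orders $\xi^3$ and $\xi^4$.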
