My team doesn't write unit tests for generated code (e.g. some POJOs). An engineer that I greatly respect recently wrote this on the subject:
Research over the past 10 years has revealed that generated code is as prone to bugs as other code, and in many cases the bugs in generated code are far more costly to triage. Hence, the common recommendation regarding generated code is that it should be unit tested either by:
1. Generated unit tests (provided alongside the generated code)
2. Unit tests provided by the integration code
Most of the time #1 isn't available, but #2 above is usually intrinsic to the unit tests written for classes utilizing the generated code, and well-written unit tests for the consuming/extended classes result in high coverage of the generated code.
This recommendation has been universally (more-or-less) picked up by the code coverage community, as evidenced by the fact that Cobertura's replacement, JaCoCo, does not even provide exclusions as an option.
I haven't been able to find research on this subject, but then again I'm not deep into testing. Are there data on this question? If so, what are the implications?
No, you should not test generated code. And you really shouldn't test POJOs - there's no behavior there to test.
But you should probably test your code generation to make sure it does the right things, handles corner cases, handles bad data, etc. If the codegen tool is not yours and your inputs to it are simple, then maybe some spot checks or manual tests are sufficient.
And since you're probably generating POJOs to map to some other interface, you should probably do (possibly manual) integration tests to make sure that your serialization or other motivation for having the POJOs is satisfied by the things you created.
It's maybe a bit of pedantry, but if you focus on just the output of the tool you're likely to make bad tests that just assert that the output is what you expect, not what you need.
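To make the integration check above concrete, here is a minimal sketch of a round-trip test. It assumes the POJO exists to be serialized, and it uses plain Java serialization only because that needs no extra libraries. CustomerDto and RoundTrip are hypothetical names, not anything from the question. The test asserts what the consumer needs (the data survives the trip), not what the generated source looks like.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for a generated POJO (hypothetical class and fields).
class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;
    private int age;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

class RoundTrip {
    // Serializes the value to bytes and reads it back, the way the
    // consuming code would use the POJO across a boundary.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

In a real project the same idea would go through whatever serializer the POJOs were generated for, inside your usual test framework.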
There is no straight answer to this. There is generated code, and then there is generated code. We could be talking about a bit of framework code, initializations for something you dragged together in a designer, or an automatic conversion of a full-blown application from one language/platform to another. In the latter case you would certainly want to test for any change in behavior because yes, there would be behavior involved.
Code generation has many applications. Testing may be appropriate and necessary or it may be pointless.
Testing what you have done is always good, but you should test that your input to the generator makes sense, not that the generation result is correct. Generated unit tests can only do the first, so that makes no sense. Integration tests or, better, "sociable" unit tests are a much better solution.
The reason to use code generation or a framework is abstraction: You want to avoid writing boilerplate code like database schemas, or serialization code, and express your application design on a higher level. (Think of domain specific languages, where you do not want to touch the actual programming language at all, but only stay on your domain level.)
Tests should not break this abstraction by trying to "look inside" the components. If your abstraction is sound and complete, there should be a way of expressing your expectations on the same level. If not, your abstraction may be broken.
Take a database mapping, for example: you want to abstract from persistence. So attaching an in-memory database, or a fake database, and running all your unit tests against it is perfectly fine. This is not the most common interpretation of unit tests (Martin Fowler calls them "sociable" unit tests), but the advantage is that your generated code or framework code is run every time you test.
Of course this can also be done at the integration test level, but for things like persistence this would require far too many tests at an even higher level.
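A rough sketch of such a "sociable" test setup, under the assumption that the mapping layer talks to storage through a small interface. All names here (RowStore, InMemoryRowStore, UserMapper, RegistrationService) are hypothetical. UserMapper stands in for generated mapping code, and the expectations are expressed only at the level of the hand-written service.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical low-level storage interface used by the mapping code.
interface RowStore {
    void put(String table, String key, Map<String, String> row);
    Map<String, String> get(String table, String key); // null when absent
}

// Fake database that keeps rows in memory.
class InMemoryRowStore implements RowStore {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    public void put(String table, String key, Map<String, String> row) {
        rows.put(table + "/" + key, new HashMap<>(row));
    }

    public Map<String, String> get(String table, String key) {
        return rows.get(table + "/" + key);
    }
}

// Stand-in for generated mapping code; it is never tested directly.
class UserMapper {
    private final RowStore store;
    UserMapper(RowStore store) { this.store = store; }

    void save(String id, String email) {
        Map<String, String> row = new HashMap<>();
        row.put("id", id);
        row.put("email", email);
        store.put("users", id, row);
    }

    Optional<String> findEmail(String id) {
        Map<String, String> row = store.get("users", id);
        return row == null ? Optional.empty() : Optional.ofNullable(row.get("email"));
    }
}

// Hand-written class under test; exercising it also runs the mapper.
class RegistrationService {
    private final UserMapper users;
    RegistrationService(UserMapper users) { this.users = users; }

    boolean register(String id, String email) {
        if (users.findEmail(id).isPresent()) return false; // already registered
        users.save(id, email.trim().toLowerCase(Locale.ROOT));
        return true;
    }

    Optional<String> emailOf(String id) { return users.findEmail(id); }
}
```

A test would build `new RegistrationService(new UserMapper(new InMemoryRowStore()))`, register a user, and assert on `emailOf`. The mapper gets covered as a side effect, without any assertion that looks inside it.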
Do you need to test the POJOs?
Do you need to test the output from the Generator Program?
That the Generator Program produces other code for which you may or may not choose to write tests is irrelevant. If someone changes the Generator Program, you have to be sure that what comes out doesn't break anything.
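As an illustration of testing the generator itself, here is a toy sketch. GetterGenerator is a hypothetical miniature generator, not any real tool, and its naming rule is an assumption made for the example. The point is that the tests target the generator's rules, including corner cases and bad input, instead of individual generated classes.

```java
// Hypothetical miniature generator that emits a getter for a field.
class GetterGenerator {
    // Derives the getter name, e.g. "name" becomes "getName".
    static String getterName(String field) {
        requireIdentifier(field);
        return "get" + Character.toUpperCase(field.charAt(0)) + field.substring(1);
    }

    // Emits the source text of the getter method.
    static String generate(String type, String field) {
        return "public " + type + " " + getterName(field)
            + "() { return " + field + "; }";
    }

    // Convenience for tests: true when the generator refuses the input.
    static boolean rejects(String field) {
        try {
            getterName(field);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Rejects null, empty, and syntactically invalid field names.
    // Note: reserved words are not checked in this sketch.
    private static void requireIdentifier(String s) {
        if (s == null || s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            throw new IllegalArgumentException("not a valid field name: " + s);
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                throw new IllegalArgumentException("not a valid field name: " + s);
            }
        }
    }
}
```

Checks such as `getterName("x")` for a one-letter field or `rejects("2fast")` for bad data are the kind of thing that should fail loudly when someone changes the generator.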