My question: Is there a closed-form formula for the expected number of "exceptions" that the simulator generates, as described in the passage below?
For most of my adult life I've been explaining to people that "exceptions," meaning examples running counter to the trend, do not disprove claims that there's a correlation between two variables.
To get a good idea of the percentage of examples which will be "exceptions" to a correlation, check out the correlation simulator at rlanders.net/correlation.html.
For a positive correlation, the "exceptions" are the dots in the upper left and lower right quadrants. As the correlation coefficient is set to higher values, the number of exceptions will tend to decrease. But, as is easily seen, even at high values there are more "exceptions" than you may think.
Partially answered in comments:
The geometric analysis in my post can be used to show the expected proportion of exceptions when the data are bivariate Normal (which is what this calculator assumes) is arccos(ρ)/π. – whuber
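To make the formula from the comment concrete, here is a minimal Python sketch that compares arccos(ρ)/π with a Monte Carlo estimate. It rests on assumptions that are mine, not the simulator's: the data are standard bivariate Normal, the quadrants are split at the true means (zero here), and the function names, sample size, and seed are purely illustrative. The expected number of exceptions among n points is then n times this proportion.

```python
import math
import random


def expected_exception_fraction(rho: float) -> float:
    """Theoretical share of points in the off-trend quadrants: arccos(rho) / pi."""
    return math.acos(rho) / math.pi


def simulated_exception_fraction(rho: float, n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same share (hypothetical helper).

    Assumes standard bivariate Normal data with correlation rho and
    quadrants split at the true means (0, 0).
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    exceptions = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Build y so that corr(x, y) = rho.
        y = rho * x + scale * rng.gauss(0.0, 1.0)
        # Opposite signs means upper-left or lower-right quadrant.
        if x * y < 0:
            exceptions += 1
    return exceptions / n


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        theory = expected_exception_fraction(rho)
        estimate = simulated_exception_fraction(rho)
        print(f"rho={rho:.1f}  theory={theory:.4f}  simulated={estimate:.4f}")
```

For ρ = 0.5 the formula gives exactly 1/3, and for ρ = 0.9 it gives roughly 14%, so about one point in seven is still an "exception" even at that correlation. If the simulator splits the quadrants at the sample means rather than the true means, its counts may differ slightly from this in small samples.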