Linear regression summary in R: standard summary() vs. car::Anova()

by patrick   Last Updated August 14, 2019 20:19

I am running some linear regressions in R. I have a continuous dependent variable and both continuous and categorical independent variables, and I fit the models with lm(). So far, I have looked at the output that summary(model) gives me.

Other studies instead run Anova() from the car package on their linear model, which returns a similar table. The docs for Anova() state that it

Calculates type-II or type-III analysis-of-variance tables for model objects.

I am under the impression that this Anova() returns an F-statistic instead of a t-statistic, but tells me roughly the same thing (sample output below). So I was wondering:

  • Are R's standard summary(lm) and car's Anova(lm) indeed doing pretty much the same calculations here? If not, what is the difference?

  • They both report the same p-values for the individual terms; however, the overall F-statistic at the bottom of the standard output (16.71) is different from the F values in the Anova() table. Why is that?

  • What are applications where one would choose one over the other?

Any help is much appreciated!
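
For reference, here is roughly how I am calling both (hypothetical data; the variable names just mirror the output below):

library(car)

## hypothetical data with the same structure as mine
set.seed(1)
dat <- data.frame(
  outcome = rnorm(362, sd = 85),
  Age     = rnorm(362, mean = 40, sd = 10),
  Gender  = factor(sample(c("female", "male"), 362, replace = TRUE))
)

linreg <- lm(outcome ~ Age + Gender, data = dat)

summary(linreg)   # per-coefficient t-tests, overall F-statistic at the bottom
Anova(linreg)     # car's type-II F-test for each term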

Sample output:

Standard R

summary(linreg)
...
         Estimate    t value    Pr(>|t|)
Age      -18.016     -3.917     0.000107
Gender   -45.4912    -4.916     1.35e-06
---
Residual standard error: 85.81 on 359 degrees of freedom
F-statistic: 16.71 on 2 and 359 DF, p-value: 1.147e-07

Anova() output

Anova(linreg)

Anova Table (Type II tests)

           Sum Sq    F value    Pr(>F)
Age        112997    15.345    0.0001072
Gender     1777936   24.164    1.348e-06
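
Squaring the t values from the standard summary gives numbers very close to the F values in the Anova() table, which seems consistent with my impression that the two are nearly equivalent here:

$$(-3.917)^2 \approx 15.34, \qquad (-4.916)^2 \approx 24.17$$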


1 Answer


The tests are not in general the same, but in this case they are, and they will be for any two-parameter model. This is because the t-test in the summary table compares the full model $$ y = \beta_0 + \beta_1 x \quad (1)$$ with the model $y = \beta_1 x$ when testing $\beta_0 = 0$, and compares $y = \beta_0$ with (1) when testing $\beta_1 = 0$. The F-test in the anova table, on the other hand, compares $y = 0$ with $y = \beta_0$ when testing $\beta_0 = 0$, and then compares $y = \beta_0$ with $y = \beta_0 + \beta_1 x$ when testing $\beta_1 = 0$.

The two tests therefore give almost the same results here, but with multiple explanatory variables the results will differ, and the difference becomes more apparent when the variables are correlated. Another difference is that the anova-table F-test may change with the ordering of the explanatory variables. The anova table is therefore preferable when you suspect correlation between the explanatory variables (i.e. that they may explain the same variation in the response), while the t-test is best as a first step of assessment. I hope that answers all your questions.
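
Here is a minimal sketch in R that illustrates this (simulated data; the names x1, x2 and y are mine): for a one-degree-of-freedom term in a model without interactions, the type-II F from Anova() is just the square of the t from summary(), while a sequential anova() table changes when the terms are reordered, the more so the more correlated the predictors are.

library(car)

set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.7 * x1 + rnorm(n)      # deliberately correlated with x1
y  <- 1 + 2 * x1 - 1.5 * x2 + rnorm(n)

fit  <- lm(y ~ x1 + x2)
fit2 <- lm(y ~ x2 + x1)        # same model, terms in the other order

summary(fit)$coefficients      # marginal t-tests; t^2 equals the type-II F
Anova(fit)                     # type-II F per term, identical for fit and fit2
Anova(fit2)
anova(fit)                     # sequential (type-I) F values...
anova(fit2)                    # ...which change with the term order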

Johannes
June 02, 2016 01:53 AM
