In this problem we will investigate the t-statistic for the null hypothesis \(H_0 : \beta = 0\) in simple linear regression without an intercept.
Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.
To begin, we generate a predictor \(x\) and a response \(y\) as follows.
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
Perform a simple linear regression with dependent variable \(y\) and independent variable \(x\), without an intercept. Store the model in lm.fit1.
Report the coefficient estimate \(\hat{\beta}\), the standard error of this coefficient estimate, and the t-statistic and p-value associated with the null hypothesis \(H_0\) in the variables beta.hat1, se1, t.stat1 and p.value1.
(You can perform regression without an intercept using the command lm(y ~ 0 + x), and you can find the exact values of the beta coefficients, standard errors, and so on with summary(lm.fit)$coefficients.)
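As a minimal sketch of the hint above, the coefficient table returned by summary() is a matrix that can be indexed by row and column name. The intermediate name coefs1 is an illustrative assumption and is not required by the exercise.

```r
# Recreate the data from the exercise
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Regression through the origin: `0 +` removes the intercept
lm.fit1 <- lm(y ~ 0 + x)

# Coefficient table: one row per predictor, four named columns
# (coefs1 is an illustrative helper name)
coefs1 <- summary(lm.fit1)$coefficients

beta.hat1 <- coefs1["x", "Estimate"]
se1       <- coefs1["x", "Std. Error"]
t.stat1   <- coefs1["x", "t value"]
p.value1  <- coefs1["x", "Pr(>|t|)"]
```

Indexing by name rather than by position keeps the code readable and still works if more predictors are added later.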
Now perform a simple linear regression with dependent variable \(x\) and independent variable \(y\), without an intercept. Store the model in lm.fit2.
Report the coefficient estimate \(\hat{\beta}\), the standard error of this coefficient estimate, and the t-statistic and p-value associated with the null hypothesis \(H_0\) in the variables beta.hat2, se2, t.stat2 and p.value2.
For the regression of \(Y\) onto \(X\) without an intercept, the t-statistic for \(H_0 : \beta = 0\) takes the form \(\hat{\beta}/SE(\hat{\beta})\), where \(\hat{\beta}\) is given by \(\hat\beta=\left ( \sum_{i=1}^{n}x_i y_i \right )/\left ( \sum_{i=1}^{n}x_i^2 \right )\) and where
\[SE(\hat{\beta}) = \sqrt{\frac{\sum_{i=1}^n(y_i - x_i\hat{\beta})^2}{(n - 1)\sum_{i=1}^nx_i^2}}\]
Show algebraically, and confirm numerically in R, that the t-statistic can be written as
\[\frac{\sqrt{n - 1}\sum_{i=1}^nx_iy_i}{\sqrt{(\sum_{i=1}^nx_i^2)(\sum_{i=1}^ny_i^2) - (\sum_{i=1}^nx_iy_i)^2}}\]
Using this expression, argue that the t-statistic for the regression of \(y\) onto \(x\) is the same as the t-statistic for the regression of \(x\) onto \(y\).
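The key algebraic step is that substituting \(\hat{\beta}\) into the residual sum of squares gives \(\sum_{i=1}^n(y_i - x_i\hat{\beta})^2 = \sum_{i=1}^ny_i^2 - (\sum_{i=1}^nx_iy_i)^2/\sum_{i=1}^nx_i^2\). The resulting expression can be checked numerically with a sketch along the following lines. The helper t.closed and the variable names t.formula.yx, t.formula.xy, t.lm.yx and t.lm.xy are illustrative assumptions, not names required by the exercise.

```r
# Recreate the data from the exercise
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Hypothetical helper: closed-form t-statistic for regression
# through the origin of v onto u
t.closed <- function(u, v) {
  n <- length(u)
  sqrt(n - 1) * sum(u * v) /
    sqrt(sum(u^2) * sum(v^2) - sum(u * v)^2)
}

# Closed form, with the roles of x and y swapped in the second call
t.formula.yx <- t.closed(x, y)
t.formula.xy <- t.closed(y, x)

# t-statistics as reported by lm() for both no-intercept regressions
t.lm.yx <- summary(lm(y ~ 0 + x))$coefficients["x", "t value"]
t.lm.xy <- summary(lm(x ~ 0 + y))$coefficients["y", "t value"]

# Compare up to floating-point tolerance
all.equal(t.formula.yx, t.lm.yx)
all.equal(t.formula.xy, t.lm.xy)
all.equal(t.lm.yx, t.lm.xy)
```

Because every sum in t.closed is unchanged when its two arguments are exchanged, the two calls necessarily return the same value, which is the symmetry the argument relies on.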
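Now perform both regressions with an intercept, \(y\) onto \(x\) and \(x\) onto \(y\), and store the models in lm.fit3 and lm.fit4, respectively.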
Store the t-statistics for \(H_0 : \beta_1 = 0\) in t.stat3 and t.stat4, respectively.
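A minimal sketch of this last step follows. It assumes that lm.fit3 is the regression of \(y\) onto \(x\) and lm.fit4 the regression of \(x\) onto \(y\), both with an intercept. That reading is suggested by the slope notation \(\beta_1\) and by the order of the earlier fits, but it is an assumption.

```r
# Recreate the data from the exercise
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Assumed model definitions: both fits now include an intercept
lm.fit3 <- lm(y ~ x)
lm.fit4 <- lm(x ~ y)

# Slope t-statistics; the row name is the predictor's name
t.stat3 <- summary(lm.fit3)$coefficients["x", "t value"]
t.stat4 <- summary(lm.fit4)$coefficients["y", "t value"]

# Compare up to floating-point tolerance
all.equal(t.stat3, t.stat4)
```

With an intercept, the slope t-statistic depends on the data only through the sample correlation between \(x\) and \(y\) and the sample size, so it does not change when the two variables are exchanged.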