We will now consider the Boston housing data set, from the MASS library.

Questions

Some of the exercises are not tested by Dodona (for example the plots), but it is still useful to try them.

Based on this data set, provide an estimate for the population mean of medv, referred to as $\hat{\mu}$. Store this estimate in mu.hat.
Provide an estimate of the standard error of $\hat{\mu}$. Store this in est.sd. Interpret the result.
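For reference, the usual plug-in estimate of the standard error of a sample mean is the sample standard deviation divided by the square root of the sample size. The following is a minimal sketch on simulated data; the vector x and the names x.mean and x.se are hypothetical stand-ins, not part of the exercise.

```r
# Hypothetical stand-in data; replace with the variable of interest.
set.seed(1)
x <- rnorm(200, mean = 20, sd = 9)

# Sample mean and its estimated standard error, s / sqrt(n).
x.mean <- mean(x)
x.se <- sd(x) / sqrt(length(x))
```

A smaller standard error means the sample mean would vary less from one sample to the next.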
Now estimate the standard error of $\hat{\mu}$ using the bootstrap.
1. Create a function boot.fn.mean() that fits a linear regression model and returns its coefficients with coef(). The model is fitted on the rows of a dataset data selected by the indices index. Note that an intercept-only model (no other independent variables) is sufficient, because its estimated intercept equals the mean.
2. Use the boot() function together with your boot.fn.mean() function to estimate the standard error of the mean. Don’t forget to load the library boot in your R session. Set a seed of 1 before running boot() and specify that we want 10 bootstrap samples. Store the result in boot.se.mean. (Note: this command takes a few seconds to run.)
3. Inspect the output in the console and manually copy the estimated standard error of the mean into the variable boot.se.mean2.
  
  It is not good practice to hardcode quantities in variables, but the object returned by boot() unfortunately has no component that holds the standard error directly.
4. How does this compare to your answer from question 2?
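The steps above can be sketched on simulated data as follows. The data frame toy, its column y, the function toy.fn.mean and the choice R = 100 are all hypothetical and chosen only for illustration. boot() calls the statistic function with the data and a vector of resampled row indices. The returned object stores the bootstrap replicates in its component t, so their standard deviation can serve as a cross-check of the standard error printed in the console.

```r
library(boot)

# Hypothetical stand-in data frame with one numeric column y.
set.seed(1)
toy <- data.frame(y = rnorm(200, mean = 20, sd = 9))

# Statistic function: intercept-only regression on the resampled rows.
toy.fn.mean <- function(data, index) {
  coef(lm(y ~ 1, data = data, subset = index))
}

set.seed(1)
toy.boot <- boot(toy, toy.fn.mean, R = 100)

toy.boot$t0               # statistic computed on the original data
toy.se <- sd(toy.boot$t)  # standard deviation of the bootstrap replicates
```

Printing toy.boot shows the original statistic, the estimated bias and the standard error in a single table.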
Based on your bootstrap estimate from question 3, provide a 95% confidence interval for the mean of medv.

Hint: you can approximate a 95% confidence interval using the formula $[\hat{\mu} - 2SE(\hat{\mu}), \hat{\mu} + 2SE(\hat{\mu})]$.

Compare it to the results obtained using t.test(Boston$medv).
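As an illustration of the hint, the sketch below builds the approximate interval and the t-based interval on simulated data. The vector x and all variable names are hypothetical, and the formula-based standard error is used only to keep the example free of package dependencies; a bootstrap estimate can be plugged in the same way.

```r
# Hypothetical stand-in data.
set.seed(1)
x <- rnorm(200, mean = 20, sd = 9)

x.mean <- mean(x)
x.se <- sd(x) / sqrt(length(x))  # any standard-error estimate can be used here

# Approximate 95% interval: estimate plus or minus two standard errors.
ci.approx <- c(x.mean - 2 * x.se, x.mean + 2 * x.se)

# t-based interval reported by t.test(), for comparison.
ci.t <- as.numeric(t.test(x)$conf.int)
```

For a sample of this size the two intervals should be close, since the t quantile used by t.test() is only slightly below 2.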
Based on this data set, provide an estimate, $\hat{\mu}_{med}$, for the median value of medv in the population. Store the result in medv.median.
We now would like to estimate the standard error of $\hat{\mu}_{med}$. Unfortunately, there is no simple formula for computing the standard error of the median. Instead, estimate the standard error of the median using the bootstrap.
1. Create a function boot.fn.median() that returns the median value for medv in a dataset data with indices index.
2. Use the boot() function together with your boot.fn.median() function to estimate the standard error of the median. Set a seed of 1 before running boot() and specify that we want 10 bootstrap samples. Store the result in boot.se.median. (Note: this command takes a few seconds to run.)
3. Inspect the output in the console and manually copy the estimated standard error of the median into the variable boot.se.median2.
Based on this data set, provide an estimate for the tenth percentile of medv in Boston suburbs, referred to as $\hat{\mu}_{0.1}$. Store the result in perc10.
Use the bootstrap to estimate the standard error of $\hat{\mu}_{0.1}$. Comment on your findings.
1. Create a function boot.fn.perc() that returns the tenth percentile for medv in a dataset data with indices index.
2. Use the boot() function together with your boot.fn.perc() function to estimate the standard error of the tenth percentile. Set a seed of 1 before running boot() and specify that we want 10 bootstrap samples. Store the result in boot.se.perc. (Note: this command takes a few seconds to run.)
3. Inspect the output in the console and manually copy the estimated standard error of the tenth percentile into the variable boot.se.perc2.
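Statistics without a simple standard-error formula, such as the median or a percentile, follow the same bootstrap pattern: only the statistic function changes. A minimal sketch on simulated data, where toy, y, toy.fn.q10 and R = 100 are hypothetical choices for illustration:

```r
library(boot)

# Hypothetical stand-in data frame with one numeric column y.
set.seed(1)
toy <- data.frame(y = rnorm(200, mean = 20, sd = 9))

# Statistic function: tenth percentile of y on the resampled rows.
# Using probs = 0.5 instead would give the median.
toy.fn.q10 <- function(data, index) {
  quantile(data$y[index], probs = 0.1)
}

set.seed(1)
toy.boot.q <- boot(toy, toy.fn.q10, R = 100)

toy.se.q <- sd(toy.boot.q$t)  # standard deviation of the bootstrap replicates
```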
Inspect the variables boot.se.mean2, boot.se.median2, and boot.se.perc2.

MC1:
Which of the following statements about the three estimated standard errors is correct?
- 1: The estimated standard error for the tenth percentile is larger than those of the mean and the median.
- 2: The estimated standard error for the mean is larger than those of the tenth percentile and the median.
- 3: The estimated standard error for the median is larger than those of the tenth percentile and the mean.

Assume that:

The MASS library has been loaded
The Boston dataset has been loaded and attached
The boot library has been loaded