One of the great advantages of the bootstrap approach is that it can be applied in almost all situations. No complicated mathematical calculations are required. Performing a bootstrap analysis in R entails only two steps. First, we must create a function that computes the statistic of interest. Second, we use the boot() function, which is part of the boot library, to perform the bootstrap by repeatedly sampling observations from the data set with replacement.
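The two steps described above can be sketched on a small example. This is only an illustration under stated assumptions: the data frame `toy` and the statistic function `mean.fn()` are hypothetical names invented here, and the statistic is a simple mean rather than the one studied in this lab.

```r
library(boot)  # recommended package that ships with standard R installations

# Step 1: a function that computes the statistic of interest.
# boot() expects it to accept the data and a vector of row indices.
mean.fn <- function(data, index) {
  mean(data$x[index])
}

# Hypothetical toy data, simulated purely for illustration.
set.seed(1)
toy <- data.frame(x = rnorm(50, mean = 10, sd = 2))

# Step 2: boot() resamples the rows with replacement R times
# and applies mean.fn() to each resample.
boot.out <- boot(data = toy, statistic = mean.fn, R = 1000)

boot.out$t0          # statistic computed on the original data
sd(boot.out$t[, 1])  # bootstrap estimate of its standard error
```

Printing `boot.out` shows the original statistic together with the bootstrap bias and standard error estimates.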

Case

Suppose that we wish to invest a fixed sum of money in two financial assets that yield returns of \(X\) and \(Y\), respectively, where \(X\) and \(Y\) are random quantities. We will invest a fraction \(\alpha\) of our money in \(X\) and the remaining \(1 - \alpha\) in \(Y\). Since there is variability associated with the returns on these two assets, we wish to choose \(\alpha\) to minimize the total risk, or variance, of our investment. In other words, we want to minimize \(Var(\alpha X + (1 - \alpha)Y)\). One can show that the value that minimizes the risk is given by

\[\begin{align} \alpha = \frac{\sigma_Y^2 - \sigma_{XY}}{\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}} \end{align}\]

where \(\sigma_X^2 = Var(X)\), \(\sigma_Y^2 = Var(Y)\), and \(\sigma_{XY} = Cov(X,Y)\). In reality, the quantities \(\sigma_X^2\), \(\sigma_Y^2\), and \(\sigma_{XY}\) are unknown. We can compute estimates for these quantities, \(\hat\sigma_X^2\), \(\hat\sigma_Y^2\), and \(\hat\sigma_{XY}\), using a data set that contains past measurements for \(X\) and \(Y\). We can then estimate the value of \(\alpha\) that minimizes the variance of our investment using

\[\begin{align} \hat\alpha = \frac{\hat\sigma_Y^2 - \hat\sigma_{XY}}{\hat\sigma_X^2 + \hat\sigma_Y^2 - 2\hat\sigma_{XY}} \end{align}\]
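For readers who want to see where the formula for \(\alpha\) comes from, here is a short sketch of the calculus argument, using the same symbols as above. Expand the variance, differentiate with respect to \(\alpha\), and set the derivative to zero:

\[\begin{align} Var(\alpha X + (1-\alpha)Y) &= \alpha^2\sigma_X^2 + (1-\alpha)^2\sigma_Y^2 + 2\alpha(1-\alpha)\sigma_{XY} \\ \frac{d}{d\alpha} Var(\alpha X + (1-\alpha)Y) &= 2\alpha\sigma_X^2 - 2(1-\alpha)\sigma_Y^2 + 2(1-2\alpha)\sigma_{XY} = 0 \\ \alpha\,(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}) &= \sigma_Y^2 - \sigma_{XY} \end{align}\]

Dividing by \(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}\) gives the expression for \(\alpha\). The second derivative is \(2(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}) = 2\,Var(X - Y)\), which is nonnegative, so the stationary point is a minimum whenever \(Var(X - Y) > 0\).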

The Portfolio data set in the ISLR2 package gives us information about past measurements of \(X\) and \(Y\). To illustrate the use of the bootstrap on this data, we must first create a function, alpha.fn(), which takes as input the \((X,Y)\) data as well as a vector indicating which observations should be used to estimate \(\alpha\). The function then outputs the estimate for \(\alpha\) based on the selected observations.

alpha.fn = function(data, index) {
  X = data$X[index]
  Y = data$Y[index]
  return((var(Y) - cov(X, Y)) / (var(X) + var(Y) - 2 * cov(X, Y)))
}

This function returns, or outputs, an estimate for \(\alpha\) based on applying the formula for \(\hat\alpha\) to the observations indexed by the argument index. For instance, the following command tells R to estimate \(\alpha\) using all 100 observations.

> alpha.fn(Portfolio, 1:100)
[1] 0.5758321
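As an optional sanity check, the closed-form estimate can be compared with a direct numerical minimization of the sample variance of the portfolio return. The sketch below is an assumption-laden illustration: it uses simulated data in a hypothetical data frame `sim` rather than Portfolio, and the helper `risk()` is a name introduced here. Because sample variances and covariances satisfy the same identity as their population counterparts, the two answers should agree up to the tolerance of optimize().

```r
# Same estimator as above, repeated so the example is self-contained.
alpha.fn <- function(data, index) {
  X <- data$X[index]
  Y <- data$Y[index]
  (var(Y) - cov(X, Y)) / (var(X) + var(Y) - 2 * cov(X, Y))
}

# Hypothetical simulated returns (not the Portfolio data).
set.seed(1)
n <- 100
X <- rnorm(n)
sim <- data.frame(X = X, Y = 0.5 * X + rnorm(n))

# Sample variance of the portfolio return for a given fraction a.
risk <- function(a) var(a * sim$X + (1 - a) * sim$Y)

a.hat <- alpha.fn(sim, 1:n)                           # closed-form estimate
a.num <- optimize(risk, interval = c(-5, 5))$minimum  # numerical minimizer

c(formula = a.hat, numeric = a.num)
```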

The next command uses the sample() function to randomly select 100 observations from the range 1 to 100, with replacement. This is equivalent to constructing a new bootstrap data set and recomputing \(\hat\alpha\) based on the new data set.

> set.seed(1)
> alpha.fn(Portfolio, sample(100, 100, replace = TRUE))
[1] 0.7368375
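To see why sampling indices with replacement amounts to building a new bootstrap data set, it can help to inspect the sampled indices themselves. The following is a small sketch; the variable name `idx` is introduced here for illustration. Some observations appear several times and others not at all; on average, roughly 63% of the original observations show up in a bootstrap sample of this kind.

```r
set.seed(1)
idx <- sample(100, 100, replace = TRUE)

length(idx)             # always 100 draws
length(unique(idx))     # number of distinct observations drawn
sum(!(1:100 %in% idx))  # observations left out of this bootstrap sample

# Indices that were drawn most often.
head(sort(table(idx), decreasing = TRUE))
```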

Question

Use the alpha.fn() and sample() functions to generate three estimates of \(\alpha\) and store them in a vector alpha.hat. For each estimate, sample 100 observations from the range 1 to 100 with replacement.

Use the code below as a starting point.

Assume that:

The ISLR2 library has been loaded
The Portfolio dataset has been loaded and attached