Start by loading the NHANES data and dplyr package.
library(dplyr)
library(NHANES)
data(NHANES)
1. Compute a summary with the mean and standard deviation of the BPSysAve variable for females, but this time for each age group separately rather than for one selected decade. The age groups are defined by the AgeDecade variable. Store the result in a data frame summary_female with a mean and an sd variable.
Hint
Rather than filtering by AgeDecade and Gender, filter by Gender and then group by AgeDecade before summarizing.
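The filter, group, summarize pattern from the hint can be sketched on a small made-up data frame. This is only an illustration, not the exercise solution: the data frame toy and its columns group, sex, and value are hypothetical stand-ins, and na.rm = TRUE is included on the assumption that the measured column may contain missing values.

```r
library(dplyr)

# Hypothetical toy data; the column names are invented for illustration.
toy <- data.frame(
  group = c("a", "a", "b", "b", "b", "a", "b"),
  sex   = c("female", "female", "female", "female", "female", "male", "male"),
  value = c(10, 14, 20, 24, NA, 30, 40),
  stringsAsFactors = FALSE
)

toy_summary <- toy %>%
  filter(sex == "female") %>%                 # keep one level of the first variable
  group_by(group) %>%                         # then split the rows by the second
  summarize(mean = mean(value, na.rm = TRUE), # one summary row per group
            sd   = sd(value, na.rm = TRUE))

print(toy_summary)
```

The result has one row per value of group, with the columns mean and sd computed only from the rows that passed the filter.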
2. Repeat exercise 1 for males. Store the result in summary_male.
3. We can actually combine both summaries for exercises 1 and 2 into one summary. This is because group_by permits us to group by multiple variables (group_by(dataframe, var1, var2, ..., varn)). Obtain one big summary with the blood pressure mean (mean) and standard deviation (sd) per age category (AgeDecade) per gender (Gender). Make sure the order of the variables in the group_by function is correct. Store the result in a data frame summary_complete.
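As a rough sketch of grouping by more than one variable, the example below again uses a hypothetical toy data frame (the names toy, group, sex, and value are not part of NHANES). It is meant to show how the order of the variables in group_by shapes the result, under the assumption of a dplyr version that supports the .groups argument of summarize (1.0.0 or later).

```r
library(dplyr)

# Hypothetical toy data; the column names are invented for illustration.
toy <- data.frame(
  group = c("a", "a", "b", "b", "a", "a", "b", "b"),
  sex   = c("female", "female", "female", "female",
            "male", "male", "male", "male"),
  value = c(10, 14, 20, 24, 30, 34, 40, 44),
  stringsAsFactors = FALSE
)

toy_complete <- toy %>%
  group_by(group, sex) %>%        # first variable varies slowest in the output
  summarize(mean = mean(value),
            sd   = sd(value),
            .groups = "drop")     # return an ungrouped data frame

print(toy_complete)
```

The output has one row per combination of group and sex, sorted first by group and then by sex within each group. Swapping the two names in group_by would give the same summaries but with the rows and leading columns arranged by sex first.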