The physical functioning dataset
This chapter uses data from the National Health and Nutrition Examination Survey (NHANES). The survey collects data in two-year cycles for about 10,000 civilian, community-dwelling Americans, and the data are available on the CDC website at http://www.cdc.gov/nchs/nhanes/nhanes_questionnaires.htm. The cleaned dataset used in this chapter is available at http://scholar.harvard.edu/gerrard/mastering-scientific-computation-r. Please see the terms of use on the CDC's website. We will start by loading the dataset and creating a matrix from the data frame:
phys.func <- read.csv('phys_func.txt')[, -1]
phys.func.mat <- as.matrix(phys.func)
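To see what the loading step does without downloading the file, the same pattern can be tried on a small stand-in. This is only a sketch: the toy data frame, its column names (`id`, `item1`, `item2`), and its values are made up for illustration and are not taken from the NHANES data. It also assumes that the first column of the file is an identifier or row index that should be dropped, which is what the `[, -1]` indexing implies.

```r
# Toy stand-in for phys_func.txt; column names and values are hypothetical.
toy <- data.frame(id = 1:3, item1 = c(1, 2, 4), item2 = c(3, 1, 2))
tmp <- tempfile(fileext = ".txt")
write.csv(toy, tmp, row.names = FALSE)

# Same pattern as the chapter's code: read the file, drop the first column,
# then convert the remaining data frame to a matrix.
toy.func <- read.csv(tmp)[, -1]
toy.func.mat <- as.matrix(toy.func)

# Quick sanity checks that are also worth running on the real data.
dim(toy.func.mat)         # number of rows, number of remaining columns
is.numeric(toy.func.mat)  # TRUE only if every column was read as numeric
anyNA(toy.func.mat)       # FALSE if no missing values remain

unlink(tmp)               # remove the temporary file
```

Running the same three checks on `phys.func.mat` is a quick way to confirm that the file loaded as expected before any modeling begins.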
This dataset includes cleaned data from the years 2003 to 2010 with all missing values removed. Responses recorded as unknown or refused have also been removed. Survey respondents are asked how much difficulty they have with the following 20 items (questions are paraphrased here...