Using DataFrames
If you measure n
variables (each of a different type) of a single object, then you get a table with n
columns for each object row. If there are m
observations, then we have m
rows of data. For example, given the student grades as data, you might want to know compute the average grade for each socioeconomic group
, where grade
and socioeconomic group
are both columns in the table, and there is one row per student.
DataFrame
is the most natural representation to work with such a (m x n) table of data. They are similar to Pandas DataFrames
in Python or data.frame
in R. DataFrame
is a more specialized tool than a normal array for working with tabular and statistical data, and it is defined in the DataFrames
package, a popular Julia library for statistical work. Install it in your environment by typing in add DataFrames
in the REPL. Then, import it into your current workspace with using DataFrames
. Do the same for the DataArrays
and RDatasets
packages (which contain a collection...