I have already introduced the statistical terminology that is also used in data science: you analyze cases by using their variables. In a relational database management system (RDBMS), a case is represented as a row in a table and a variable as a column in the same table. In Python and R, you analyze data frames, which are similar to tables but also let you access rows and columns by position.
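To make the row-and-column analogy concrete, here is a minimal sketch in Python using pandas. The customer data and the column names (`customer_id`, `age`, `income`) are invented for illustration and do not come from any real data set.

```python
import pandas as pd

# Hypothetical data: each row is one case (a customer),
# each column is one variable describing that case.
customers = pd.DataFrame(
    {
        "customer_id": [101, 102, 103],
        "age": [34, 51, 28],
        "income": [42000, 58000, 39000],
    }
)

# Access by name, comparable to selecting a column in a table.
ages = customers["age"]

# Positional access: the case at row position 1 (the second row).
second_case = customers.iloc[1]

# Positional access to a single value: row position 0, column position 2.
first_income = customers.iloc[0, 2]

print(ages.tolist())
print(second_case["age"], first_income)
```

Access by name works much like selecting a column in a query, whereas `iloc` relies on the order of rows and columns, something a relational table does not guarantee.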
The first thing you need to decide is what your case is, that is, which entity you want to analyze. Defining the case precisely is not always easy. For example, if you're performing a credit risk analysis, you might define a family as a case rather than a single customer.
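As a rough sketch of what changing the case can mean in practice, the following Python example rolls customer-level rows up to one row per family. The `family_id` column, the other column names, and all values are assumptions made up for this illustration; a real credit risk data set would need its own aggregation rules.

```python
import pandas as pd

# Hypothetical customer-level data: one row per customer.
customers = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104],
        "family_id": [1, 1, 2, 2],
        "income": [42000, 58000, 39000, 0],
        "open_loans": [1, 0, 2, 1],
    }
)

# Redefine the case: aggregate so that each row describes one family.
families = (
    customers.groupby("family_id")
    .agg(
        members=("customer_id", "count"),
        total_income=("income", "sum"),
        total_open_loans=("open_loans", "sum"),
    )
    .reset_index()
)

print(families)
```

After the aggregation, the variables describe the family as a whole, so any later analysis treats the family, not the individual customer, as the case.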
The next thing you have to understand is how data values are measured in your data set. A typical data science team should include a subject matter expert who can explain the meaning of the variable values. There are several different types of variables...