Manipulating feature columns of the dataframe
Most of your data processing work will involve manipulating the columns of a dataframe. In particular, the type of the values in a column and the order of those values play a major role in model training.
H2O provides functionality to help you with this. The following are some of the operations you can use to manipulate the columns of your dataframe:
- Sorting of columns
- Changing the type of the column
Let’s first understand how we can sort a column using H2O.
Sorting columns
Ideally, you want the data in a dataframe to be shuffled before passing it to model training. However, in some scenarios you might want to reorder the dataframe based on the values in a column.
H2O provides a function called sort() that sorts a dataframe based on the values in a column. It has the following parameters:
- by: The column to sort...