Exploring H2O Sparkling Water
Sparkling Water is an H2O product that combines H2O's fast, scalable ML with the analytics capabilities of Apache Spark. Together, the two technologies let users run SQL queries for data munging, feed the results to H2O for model training, build and deploy models to production, and then use those models for predictions.
H2O Sparkling Water is designed so that you can run H2O inside regular Spark applications. It can run the H2O server inside the Spark executors, which gives the H2O server access to all the data stored in those executors for any ML-based computations.
The transparent integration between H2O and Spark provides the following benefits:
- H2O algorithms, including AutoML, can be used in Spark workflows
- Application-specific data structures can be converted between their H2O and Spark representations
- You can use Spark RDDs as datasets in H2O ML algorithms
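To make these benefits concrete, here is a minimal sketch of the workflow using PySparkling, the Python API of Sparkling Water. It assumes that pyspark and the h2o_pysparkling package are installed and that a Spark session is already running. The table name (customers), the column names, the helper name train_automl_from_sql, and all parameter values are hypothetical and chosen only for illustration.

```python
# Sketch only: pyspark and h2o_pysparkling are needed when the function is called.
# The table, columns, and parameter values below are hypothetical.

MUNGING_QUERY = """
SELECT age, income, churned
FROM customers
WHERE age IS NOT NULL AND income IS NOT NULL
"""


def train_automl_from_sql(spark, query=MUNGING_QUERY, label_col="churned", max_models=5):
    """Munge data with Spark SQL, train with H2O AutoML, and score in Spark."""
    # Imported lazily so this module loads even where Sparkling Water is absent.
    from pysparkling import H2OContext
    from pysparkling.ml import H2OAutoML

    # Start (or reuse) the H2O cluster attached to this Spark application.
    hc = H2OContext.getOrCreate()

    # Step 1: data munging with a Spark SQL query.
    prepared = spark.sql(query)
    train_df, test_df = prepared.randomSplit([0.8, 0.2], seed=42)

    # Step 2: H2O AutoML is used like any other Spark ML estimator.
    automl = H2OAutoML(labelCol=label_col, maxModels=max_models, seed=42)
    model = automl.fit(train_df)

    # Step 3: predictions are returned as a Spark DataFrame.
    predictions = model.transform(test_df)

    # Explicit conversion between Spark and H2O frames, in both directions.
    h2o_frame = hc.asH2OFrame(test_df)
    round_trip = hc.asSparkFrame(h2o_frame)

    return model, predictions, round_trip
```

Calling train_automl_from_sql(spark) from an existing Spark application would run all three steps. Whether AutoML treats the label as a classification or a regression target depends on the column's type, so a real pipeline may need to cast the label column first.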
Sparkling Water supports two...