Summary
In this chapter, we used the AWS ecosystem to build a real-time classification pipeline for streaming data. The pipeline classified streaming tweets using an Amazon ML classification endpoint. The AWS data ecosystem is diverse and complex, and a given problem often has several possible solutions and architectures. The Kinesis-Lambda-Redshift-MachineLearning architecture we built is simple, yet very powerful.
The true strength of the Amazon ML service lies in its ease of use and simplicity. Training and evaluating a model from scratch takes a few minutes and a few clicks, and it can yield very good performance. With the AWS CLI and the SDK, more complex data flows and model explorations can easily be implemented. The service is agile enough to become part of a wider data flow by providing real-time classification and regression.
Underneath the simple interface, the machine learning expertise of Amazon shines at many levels. From...