Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Hands-On Deep Learning with Apache Spark

You're reading from   Hands-On Deep Learning with Apache Spark Build and deploy distributed deep learning applications on Apache Spark

Arrow left icon
Product type Paperback
Published in Jan 2019
Publisher Packt
ISBN-13 9781788994613
Length 322 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Guglielmo Iozzia Guglielmo Iozzia
Author Profile Icon Guglielmo Iozzia
Guglielmo Iozzia
Arrow right icon
View More author details
Toc

Table of Contents (19) Chapters Close

Preface 1. The Apache Spark Ecosystem FREE CHAPTER 2. Deep Learning Basics 3. Extract, Transform, Load 4. Streaming 5. Convolutional Neural Networks 6. Recurrent Neural Networks 7. Training Neural Networks with Spark 8. Monitoring and Debugging Neural Network Training 9. Interpreting Neural Network Output 10. Deploying on a Distributed System 11. NLP Basics 12. Textual Analysis and Deep Learning 13. Convolution 14. Image Classification 15. What's Next for Deep Learning? 16. Other Books You May Enjoy Appendix A: Functional Programming in Scala 1. Appendix B: Image Data Preparation for Spark

Raw data transformation with Spark

Data coming from a source is often raw data. When we talk about raw data we mean data that is in a format that can't be used as is for the training or testing purposes of our models. So, before using, we need to make it tidy. The cleanup process is done through one or more transformations before giving the data as input for a given model.

For data transformation purposes, the DL4J DataVec library and Spark provide several facilities. Some of the concepts described in this section have been explored in the Data ingestion through DataVec and transformation through Spark section, but now we are going to add a more complex use case.

To understand how to use Datavec for transformation purposes, let's build a Spark application for web traffic log analysis. The dataset used is generally available for download at the MonitorWare website (http...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image