About the Book
As machine learning algorithms become more popular, new tools that optimize these algorithms are also being developed. Machine Learning Fundamentals explains the API of scikit-learn, a package created to facilitate the process of building machine learning applications. You will learn to explain the differences between supervised and unsupervised models and to apply some popular algorithms to real-life datasets.
You'll begin by learning how to use the syntax of scikit-learn. You'll study the differences between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply an unsupervised clustering algorithm to real-world datasets to discover patterns and profiles, and explore the process of solving an unsupervised machine learning problem. Then, the focus of the book shifts to supervised learning algorithms. You'll learn how to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to analyze results systematically and improve the performance of an algorithm by tuning its hyperparameters. By the end of this book, you will have the skills and confidence to start programming machine learning algorithms.
About the Author
After graduating from college as a business administrator, Hyatt Saleh discovered the importance of data analysis to understand and solve real-life problems. Since then, as a self-taught person, she has not only worked as a freelancer for many companies around the world in the field of machine learning, but has also founded an artificial intelligence company that aims to optimize everyday processes.
Objectives
- Understand the importance of data representation
- Gain insights into the differences between supervised and unsupervised models
- Explore data using the Matplotlib library
- Study popular clustering algorithms, such as K-means, Mean-Shift, and DBSCAN
- Measure model performance through different metrics
- Study popular supervised algorithms, such as Naïve Bayes, Decision Tree, and SVM
- Perform error analysis to improve the performance of the model
- Learn to build a comprehensive machine learning program
Audience
Machine Learning Fundamentals is designed for developers who are new to the field of machine learning and want to learn how to use the scikit-learn library to develop machine learning algorithms. You must have some knowledge and experience with Python programming, but you do not need any prior knowledge of scikit-learn or machine learning algorithms.
Approach
Machine Learning Fundamentals takes a hands-on approach to introduce beginners to the world of machine learning. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.
Minimum Hardware Requirements
For the optimal student experience, we recommend the following hardware configuration:
- Processor: Intel Core i5 or equivalent
- Memory: 4 GB RAM or higher
Software Requirements
You'll also need the following software installed in advance:
- Sublime Text (latest version), Atom IDE (latest version), or other similar text editor applications
- Python 3
- The following Python libraries: NumPy, SciPy, scikit-learn, Matplotlib, pandas, pickle (part of the Python standard library, so it needs no separate installation), Jupyter, and seaborn
Installation and Setup
Before you start this book, you'll need to install Python 3.6, pip, scikit-learn, and the other libraries used in this book. You will find the steps to install these here:
Installing Python
Install Python 3.6 by following the instructions at this link: https://realpython.com/installing-python/.
Installing pip
- To install pip, go to the following link and download the get-pip.py file: https://pip.pypa.io/en/stable/installing/.
- Then, use the following command to install it:
python get-pip.py
You might need to use the python3 get-pip.py command instead if a previous version of Python on your computer already uses the python command.
Installing libraries
Using the pip command, install the following libraries:
python -m pip install --user numpy scipy matplotlib jupyter pandas seaborn
Installing scikit-learn
Install scikit-learn using the following command:
pip install -U scikit-learn
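Once the installation commands above have finished, a quick check can confirm that the libraries import correctly. The following is a minimal sketch, not part of the book's code bundle: the check_libraries function and the REQUIRED_MODULES list are hypothetical names introduced here for illustration. It assumes the usual import names (scikit-learn is imported as sklearn) and leaves out jupyter, which is mainly used from the command line.

```python
import importlib
import sys

# Import names for the libraries listed in this preface (assumed here;
# note that scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["numpy", "scipy", "sklearn", "matplotlib",
                    "pandas", "seaborn", "pickle"]


def check_libraries(module_names):
    """Map each module name to its version string, to "unknown" if it
    imports but exposes no __version__, or to None if the import fails."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The library is missing or not visible to this interpreter.
            results[name] = None
        else:
            results[name] = str(getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in check_libraries(REQUIRED_MODULES).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Save the sketch under any filename and run it with the same python (or python3) command you used for pip. Any library reported as NOT INSTALLED can be installed again with the pip commands shown above.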
Installing the Code Bundle
Copy the code bundle for the class to the C:/Code folder.
Additional Resources
The code bundle for this book is also hosted on GitHub at: https://github.com/TrainingByPackt/Machine-Learning-Fundamentals.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Import the iris toy dataset using scikit-learn's datasets package and store it in a variable named iris_data."
A block of code is set as follows:
from sklearn.datasets import load_iris
iris_data = load_iris()
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Below the dataset's title, find the download section and click on Data Folder."