Book Image

Python Data Analysis - Third Edition

By : Avinash Navlani, Ivan Idris
5 (1)
Book Image

Python Data Analysis - Third Edition

5 (1)
By: Avinash Navlani, Ivan Idris

Overview of this book

Data analysis enables you to generate value from small and big data by discovering new patterns and trends, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you’ll get up and running using Python for data analysis by exploring the different phases and methodologies used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines. Starting with the essential statistical and data analysis fundamentals using Python, you’ll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You’ll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you’ll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. In the concluding chapters, you’ll work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask. By the end of this data analysis book, you’ll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.
Table of Contents (20 chapters)
1
Section 1: Foundation for Data Analysis
6
Section 2: Exploratory Data Analysis and Data Cleaning
11
Section 3: Deep Dive into Machine Learning
15
Section 4: NLP, Image Analytics, and Parallel Computing

Installing Python 3

The installer file for installing Python 3 can easily be downloaded from the official website (https://www.python.org/downloads/) for Windows, Linux, and Mac 32-bit or 64-bit systems. The installer can be installed by double-clicking on it. This installer also has an IDE named "IDLE" that can be used for development. We will dive deeper into each of the operating systems in the next few sections.

Python installation and setup on Windows

This book is based on the latest Python 3 version. All the code that will be used in this book is written in Python 3, so we need to install Python 3 before we can start coding. Python is an open source, distributed, and freely available language. It is also licensed for commercial use. There are many implementations of Python, including commercial implementations and distributions. In this book, we will focus on the standard Python implementation, which is guaranteed to be compatible with NumPy.

You can download Python 3.9.x from the Python official website: https://www.python.org/downloads/. Here, you can find installation files for Windows, Linux, Mac OS X, and other OS platforms. You can find instructions for installing and using Python for various operating systems at https://docs.python.org/3.7/using/index.html.

You need to have Python 3.5.x or above installed on your system. The sunset date for Python 2.7 was moved from 2015 to 2020, but at the time of writing, Python 2.7 will not be supported and maintained by the Python community.

At the time of writing this book, we had Python 3.8.3 installed as a prerequisite on our Windows 10 virtual machine: https://www.python.org/ftp/python/3.8.3/python-3.8.3.exe.

Python installation and setup on Linux

Installing Python on Linux is significantly easier compared to the other OSes. To install the foundational libraries, run the following command-line instruction:

$ pip3 install numpy scipy pandas matplotlib jupyter notebook

It may be essential to run the sudo command before the preceding command if you don't have sufficient rights on the machine that you are using.

Python installation and setup on Mac OS X with a GUI installer

Python can be installed via the installation file from the Python official website. The installer file can be downloaded from its official web page (https://www.python.org/downloads/mac-osx/) for macOS. This installer also has an IDE named "IDLE" that can be used for development.

Python installation and setup on Mac OS X with brew

For Mac systems, you can use the Homebrew package manager to install Python. It will make it easier to install the required applications for developers, researchers, and scientists. The brew install command is used to install another application, such as installing python3 or any other Python package, such as NLTK or SpaCy.

To install the most recent version of Python, you need to execute the following command in a Terminal:

$ brew install python3

After installation, you can confirm the version of Python you've installed by running the following command:

$ python3 --version
Python 3.7.4

You can also open the Python Shell from the command line by running the following command:

$ python3

Now that we know how to install Python on our system, let's dive into the actual tools that we will need to start data analysis.