Python Data Analysis - Third Edition

By : Avinash Navlani, Ivan Idris

Overview of this book

Data analysis enables you to generate value from small and big data by discovering new patterns and trends, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you’ll get up and running using Python for data analysis by exploring the different phases and methodologies used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines. Starting with the essential statistical and data analysis fundamentals using Python, you’ll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You’ll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you’ll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. In the concluding chapters, you’ll work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask. By the end of this data analysis book, you’ll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.
Table of Contents (20 chapters)
Section 1: Foundation for Data Analysis
Section 2: Exploratory Data Analysis and Data Cleaning
Section 3: Deep Dive into Machine Learning
Section 4: NLP, Image Analytics, and Parallel Computing

Using IPython as a shell

IPython is an interactive shell comparable to interactive computing environments such as MATLAB or Mathematica. It was created for quick experimentation and is a very useful tool for data professionals who are running small experiments.

The IPython shell offers the following features:

  • Easy access to system commands.
  • Easy editing of inline commands.
  • Tab completion, which helps you find commands and speeds up your work.
  • Command history, which lets you view previously used commands.
  • Easy execution of external Python scripts.
  • Easy debugging with the Python debugger.

Now, let's execute some commands in IPython. To start IPython, run the following command on the command line (on some installations, the command is ipython rather than ipython3):

$ ipython3

Running the preceding command starts the shell and displays the IPython prompt.

Now, let's understand and execute some commands that the IPython shell provides:

  • History command: The history command lists previously used commands. Typing history at the IPython prompt prints the commands entered so far in the session.
  • System commands: We can also run system commands from IPython by prefixing them with an exclamation mark (!). Everything after the exclamation mark is treated as a system command. For example, !date displays the current system date, while !pwd shows the current working directory.
  • Writing functions: We can write functions as we would in any other Python environment, such as Jupyter Notebook, Python IDLE, PyCharm, or Spyder.
  • Quitting the IPython shell: You can exit the IPython shell using quit(), exit(), or Ctrl + D.
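The commands described above can be sketched as a short session. This is only an illustration: the IPython-specific lines (those starting with % or !) are shown as comments so that the block also runs as plain Python, and rectangle_area is a hypothetical function introduced for the example, not one taken from the book.

```python
# Lines beginning with % or ! are IPython-only syntax, so they appear here
# as comments; the rest is ordinary Python that can be typed at the prompt.

# %history    -> list the commands entered so far in this session
# !date       -> run the system's date command
# !pwd        -> print the current working directory


def rectangle_area(length, width):
    """Return the area of a rectangle (hypothetical example function)."""
    return length * width


area = rectangle_area(4, 5)
print(area)  # prints 20
```

At the IPython prompt, the function definition is entered line by line, and a blank line ends the block.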

In this subsection, we have looked at a few basic commands we can use on the IPython shell. Now, let's discuss how we can use the help command in the IPython shell.

Reading manual pages

In the IPython shell, we can read the documentation for a function using the help command. It is not necessary to type the full name of the function: type the first few characters and press the Tab key, and IPython will suggest matching names. For example, let's look up NumPy's arange() function. There are two ways to get help on a function:

  • Use the help function: Type help followed by the first few characters of the function name. Then press the Tab key, select a function using the arrow keys, and press the Enter key.
  • Use a question mark: We can also put a question mark after the name of the function, for example np.arange? if NumPy has been imported as np, to display its documentation.
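As a rough sketch of what these two approaches retrieve, the following block reads a function's docstring in plain Python. It assumes math.sqrt as a stand-in for arange() so that the example runs without NumPy installed, and the np alias in the comments is likewise an assumption.

```python
import inspect
import math

# In IPython you would type one of the following:
#   help(np.arange)
#   np.arange?
# Both display the documentation of the object. Outside IPython, the same
# docstring can be read with inspect.getdoc(). math.sqrt is only a stand-in
# here (an assumption of this sketch) so the block runs without NumPy.
doc = inspect.getdoc(math.sqrt)

# The first line of a docstring is usually a one-line summary.
summary = doc.splitlines()[0]
print(summary)
```

The built-in help() function prints the same docstring along with the function signature, which makes it convenient for interactive use.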

In this subsection, we looked at the help and question mark support that's provided for module functions. We can also get help from library documentation. Let's discuss where to find documentation for Python data analysis libraries.

Where to find help and references to Python data analysis libraries

The following table lists the documentation websites for the Python data analysis libraries we have discussed in this chapter:

Package/Software    Documentation
NumPy               https://numpy.org/doc/
SciPy               https://docs.scipy.org/doc/
Pandas              https://pandas.pydata.org/docs/
Matplotlib          https://matplotlib.org/3.2.1/contents.html
Seaborn             https://seaborn.pydata.org/
Scikit-learn        https://scikit-learn.org/stable/
Anaconda            https://www.anaconda.com/distribution/

You can also find answers to Python programming questions about NumPy, SciPy, Pandas, Matplotlib, Seaborn, and Scikit-learn on Stack Overflow, and you can raise issues with these libraries in their GitHub repositories.