Book Image

Python Data Analysis - Third Edition

By : Avinash Navlani, Ivan Idris
5 (1)
Book Image

Python Data Analysis - Third Edition

5 (1)
By: Avinash Navlani, Ivan Idris

Overview of this book

Data analysis enables you to generate value from small and big data by discovering new patterns and trends, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you’ll get up and running using Python for data analysis by exploring the different phases and methodologies used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines. Starting with the essential statistical and data analysis fundamentals using Python, you’ll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You’ll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you’ll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. In the concluding chapters, you’ll work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask. By the end of this data analysis book, you’ll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.
Table of Contents (20 chapters)
1
Section 1: Foundation for Data Analysis
6
Section 2: Exploratory Data Analysis and Data Cleaning
11
Section 3: Deep Dive into Machine Learning
15
Section 4: NLP, Image Analytics, and Parallel Computing

Advanced features of Jupyter Notebooks

Jupyter Notebook offers various advanced features, such as keyboard shortcuts, installing other kernels, executing shell commands, and using various extensions for faster data analysis operations. Let's get started and understand these features one by one.

Keyboard shortcuts

Users can find all the shortcut commands that can be used inside Jupyter Notebook by selecting the Keyboard Shortcuts option in the Help menu or by using the Cmd + Shift + P shortcut key. This will make the quick select bar appear, which contains all the shortcuts commands, along with a brief description of each. It is easy to use the bar and users can use it when they forget something:

Installing other kernels

Jupyter has the ability to run multiple kernels for different languages. It is very easy to set up an environment for a particular language in Anaconda. For example, an R kernel can be set by using the following command in Anaconda:

$ conda install -c r r-essentials

The R kernel should then appear, as shown in the following screenshot:

Running shell commands

In Jupyter Notebook, users can run shell commands for Unix and Windows. The shell offers a communication interface for talking with the computer. The user needs to put ! (an exclamation sign) before running any command:

Extensions for Notebook

Notebook extensions (or nbextensions) add more features compared to basic Jupyter Notebooks. These extensions improve the user's experience and interface. Users can easily select any of the extensions by selecting the NBextensions tab.

To install nbextension in Jupyter Notebook using conda, run the following command:

conda install -c conda-forge jupyter_nbextensions_configurator

To install nbextension in Jupyter Notebook using pip, run the following command:

pip install jupyter_contrib_nbextensions && jupyter contrib nbextension install

If you get permission errors on macOS, just run the following command:

pip install jupyter_contrib_nbextensions && jupyter contrib nbextension install --user

All the configurable nbextensions will be shown in a different tab, as shown in the following screenshot:

Now, let's explore a few useful features of Notebook extensions:

  • Hinterland: This provides an autocompleting menu for each keypress that's made in cells and behaves like PyCharm:
  • Table of Contents: This extension shows all the headings in the sidebar or navigation menu. It is resizable, draggable, collapsible, and dockable:

  • Execute Time: This extension shows when the cells were executed and how much time it will take to complete the cell code:
  • Spellchecker: Spellchecker checks and verifies the spellings that are written in each cell and highlights any incorrectly written words.
  • Variable Selector: This extension keeps track of the user's workspace. It shows the names of all the variables that the user created, along with their type, size, shape, and value.
  • Slideshow: Notebook results can be communicated via Slideshow. This is a great tool for telling stories. Users can easily convert Jupyter Notebooks into slides without the use of PowerPoint. As shown in the following screenshot, Slideshow can be started using the Slideshow option in the cell toolbar of the view menu:

Jupyter Notebook also allows you to show or hide any cell in Slideshow. After adding the Slideshow option to the cell toolbar of the view menu, you can use a Slide Type drop-down list in each cell and select various options, as shown in the following screenshot:

  • Embedding PDF documents: Jupyter Notebook users can easily add PDF documents. The following syntax needs to be run for PDf documents:
from IPython.display import IFrame
IFrame('https://arxiv.org/pdf/1811.02141.pdf', width=700, height=400)

This results in the following output:

  • Embedding Youtube Videos: Jupyter Notebook users can easily add YouTube videos. The following syntax needs to be run for adding YouTube videos:
from IPython.display import YouTubeVideo
YouTubeVideo('ukzFI9rgwfU', width=700, height=400)

This results in the following output:

With that, you now understand data analysis, the process that's undertaken by it, and the roles that it entails. You have also learned how to install Python and use Jupyter Lab and Jupyter Notebook. You will learn more about various Python libraries and data analysis techniques in the upcoming chapters.