Hands-On Data Analysis with NumPy and pandas: Implement Python packages from data manipulation to processing

Setting Up a Python Data Analysis Environment

In this chapter, we will cover the following topics:

  • Installing Anaconda
  • Exploring Jupyter Notebooks
  • Exploring alternatives to Jupyter
  • Managing the Anaconda package
  • Setting up a database

In this chapter, we'll discuss installing Anaconda and managing it. Anaconda is a software package we will use in the following chapters of this book.

What is Anaconda?

In this section, we will discuss what Anaconda is and why we use it. We'll provide a link to show where to download Anaconda from the website of its sponsor, Continuum Analytics, and discuss how to install Anaconda. Anaconda is an open source distribution of the Python and R programming languages.

In this book, we'll focus on the portion of Anaconda devoted to Python. Anaconda helps us use these languages for data analysis applications, including large-scale data processing, predictive analytics, and scientific and statistical computing. Continuum Analytics provides enterprise support for Anaconda, including versions that help teams collaborate and boost the performance of their systems, along with providing a means for deploying models developed using Anaconda. Thus, Anaconda appears in enterprise settings, and aspiring analysts should be familiar with its use. Many of the packages used in this book, including Jupyter, NumPy, pandas, and many others common in data analysis, are included with Anaconda. This alone may explain its popularity.

An Anaconda installation includes most of what you need for data analysis out of the box. The Conda package manager can be used to download and install new packages as well.

Why use Anaconda? Anaconda packages Python specifically for data analysis. The most important packages for your project are included with an Anaconda installation. With the addition of some performance boosts provided by Anaconda and Continuum Analytics' enterprise support of the package, one should not be surprised by its popularity.

Installing Anaconda

Anaconda can be downloaded for free from the Continuum Analytics website. The main download page is https://www.anaconda.com/download/. Be sure to choose the installer appropriate for your operating system, and be aware that Anaconda comes in 32-bit and 64-bit versions. The 64-bit version provides the best performance on 64-bit systems.

The Python community is in a slow transition from Python 2.7 to Python 3.6, which is not fully backward compatible. If you need to use Python 2.7, perhaps because of legacy code or a package that has not yet been updated to work with Python 3.6, choose the Python 2.7 version of Anaconda. Otherwise, we will be using Python 3.6.

The following screenshot is from the Anaconda website, from where analysts can download Anaconda:

Anaconda website

As you can see, we can choose the Anaconda install appropriate for the OS (including Windows, macOS, and Linux), the processor, and the version of Python. Navigate to the correct OS and processor, and decide between Python 2.7 and Python 3.6.

Here, we will be using Python 3.6. Installation on Windows and macOS ultimately amounts to using an install wizard that usually chooses the best options for your system, though it does allow some choices that depend on your preferences.

The Linux install must be done via the command line, but it should not be too complicated for those who are familiar with Linux installation. It ultimately amounts to running a Bash script. Throughout this book, we will be using Windows.

Exploring Jupyter Notebooks

In this section, we will be exploring Jupyter Notebooks, the primary tool with which we will do data analysis with Python. We will see what Jupyter Notebooks are, and we will also talk about Markdown, which is what we use to create formatted text in Jupyter Notebooks. In a Jupyter Notebook, there are two types of blocks. There are blocks of Python code that are executable, and then there are formatted, human-readable text blocks.

Users execute the Python code blocks, and the results are inserted directly into the document. Code blocks can be rerun in any order; rerunning a block does not affect later blocks unless they are also rerun. Since a Jupyter Notebook is based on IPython, there's some additional functionality, for example, magic functions.

Jupyter Notebooks are included with Anaconda. They allow plain text to be intermixed with code, and that text can be formatted with a language called Markdown, which is itself written in plain text. Common Markdown syntax includes headings, paragraphs, bold and italic text, and monospaced text.

The following screenshot shows a Jupyter Notebook:

As you can see, it runs in a web browser, such as Chrome or Firefox; in this case, Chrome. When we begin the Jupyter Notebook, we are in a file browser. We are in a newly created directory called Untitled Folder. In Jupyter Notebook there are options for creating new Notebooks, text files, and folders. As seen in the preceding screenshot, currently there is no Notebook saved. We will need a Python Notebook, which can be created by selecting the Python option in the New drop-down menu shown in the following screenshot:

When the Notebook has started, we begin with a code block. We can change this code block to a Markdown block, and we can now start entering text.

For example, we can enter a heading. We can also enter plain text along with bold and italics, as shown in the next screenshot:

As you can see, there is some hint of how the rendering will look at the end, but we can actually see the rendering by clicking on the run cell button. If we want to change this, we can double-click on the same cell. Now we're back to plain text editing. Here we add monotype and then click on Run cell again, shown as follows:

On pressing Enter, a new cell is immediately created afterwards. This cell is a Python cell, where we can enter Python code. For example, we can create a variable. We print Hello, world! multiple times, as shown in the next screenshot:

To see what happens when the cell is executed, we simply click on the run cell; also, when we pressed Enter, a new cell block was created. Let's make this cell block a Markdown block. If we want to insert an additional cell, we can press Insert cell below. In this first cell, we're going to enter some code, and in the second cell, we can enter code that is dependent on code in the first cell. Notice what happens when we try to execute the code in the second cell before executing the code in the first. An error will be produced, shown as follows:

The complaint is that the variable trigger has not been defined. In order for the second cell to work, we need to run the first cell. Then, when we run the second cell, we get the expected output. Now let's suppose we were to change the code in the first cell; say, instead of trigger = False, we have trigger = True. The second cell will not be aware of the change. If we run the second cell again, we get the same output. So we need to run the first cell again, thus effecting the change; then we can run the second cell and get the expected output.
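The behavior described above can be imitated outside a Notebook. The following is only a rough sketch: a plain dictionary stands in for the kernel's memory, and the cell contents are hypothetical apart from the variable name trigger, which comes from the text.

```python
# A plain dict stands in for the kernel's memory of variables. This is an
# assumption made for illustration; a real kernel is a full Python session.
kernel_namespace = {}

# Hypothetical cell contents; only the name `trigger` comes from the text.
cell_1 = "trigger = False"
cell_2 = "message = 'triggered' if trigger else 'not triggered'"


def run_cell(source):
    """Execute a cell's source code against the shared kernel namespace."""
    exec(source, kernel_namespace)


# Running the second cell first fails because `trigger` does not exist yet.
try:
    run_cell(cell_2)
    error_name = None
except NameError as err:
    error_name = type(err).__name__

# Running the cells in order works.
run_cell(cell_1)
run_cell(cell_2)
first_message = kernel_namespace["message"]

# Editing the first cell without rerunning it leaves the old value in memory.
cell_1 = "trigger = True"
run_cell(cell_2)
stale_message = kernel_namespace["message"]

# Rerunning the first cell, then the second, picks up the change.
run_cell(cell_1)
run_cell(cell_2)
fresh_message = kernel_namespace["message"]

print(error_name, first_message, stale_message, fresh_message)
```

Clearing the dictionary would play the role of restarting the kernel: every variable is lost until the cells are run again.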

What has happened in the background? What's going on is that there is a kernel, which is basically a running session of Python, tracking all of our variables and everything that has happened up to this point. If we click on Kernel, we can see an option to restart the kernel; this will basically restart our session of Python. We are initially warned that by restarting the kernel, all variables will be lost.

When the kernel has been restarted, it doesn't appear as if anything has changed, but if we run the second cell, an error will be produced because the variable trigger does not exist. We will need to run the previous cell first in order for this cell to work. If we want not merely to restart the kernel but also to rerun all cells, we need to click on Restart & Run All. After restarting the kernel, all cell blocks will be rerun. It may not appear as if anything has happened, but we have started from the first cell, run it, then run the second cell, and then run the third cell, shown as follows:

We can also import libraries. For example, we can import a module from Matplotlib. In this case, in order for Matplotlib to work interactively in a Jupyter Notebook, we will need to use what's called a magic function, which is written as a %, followed by the name of the magic function and any parameters we need to pass to it. We'll cover these in more detail later, but first let's run that cell block. plt has now been loaded, and now we can use it. For example, in this last cell, we will type in the following code:

Notice that the output from this cell is inserted directly into the document. We can immediately see the plot that was created. Returning to magic functions, this is not the only function that we have available. Let's see some other functions:

  • The magic function, magic, will print info about the magic system, as shown in the following screenshot:
Output of "magic" command
  • Another useful function is timeit, which we can use to profile code. We first type in timeit and then the code that we wish to profile, shown as follows:
  • The magic function pwd can be used to see what the working directory is, shown as follows:
  • The magic function cd can be used to change the working directory, shown as follows:
  • The magic function pylab is useful if we wish to start both Matplotlib and NumPy in interactive mode, shown as follows:
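Magic functions only work inside IPython-based tools, but the idea behind timeit also exists in Python's standard library. As a rough sketch of what the profiling step does, the timeit module can be used from an ordinary script; the statement being timed here is a hypothetical example, not one taken from the book.

```python
import timeit

# Hypothetical statement to profile; any short expression would do.
statement = "sum(range(1000))"

# Run the statement 1,000 times per measurement, take 3 measurements,
# and keep the fastest one as the most representative.
timings = timeit.repeat(stmt=statement, repeat=3, number=1000)
best_per_call = min(timings) / 1000

print(f"best of 3: {best_per_call:.2e} seconds per call")
```

The reported time will differ from machine to machine, so no particular output should be expected.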

If we wish to see a list of available magic functions, we can type lsmagic, shown as follows:

And if we wish for a quick reference sheet, we can use the magic function quickref, shown as follows:

Now that we're done with this Notebook, let's give it a name. Let's simply call it My Notebook. This is done by clicking on the name of the Notebook at the top of the editor pane. Finally, you can save, and after saving, you can close and halt the Notebook. So this will close the Notebook and halt the Notebook's kernel. That would be the clean way to leave the Notebook. Notice now, in our tree, we can see the directory where the Notebook was saved, and we can see that the Notebook exists in that directory. It is an ipynb document.

Exploring alternatives to Jupyter

Now we will consider alternatives to Jupyter Notebooks. We will look at:

  • Jupyter QT Console
  • Spyder
  • Rodeo
  • Python interpreter
  • ptpython

The first alternative we will consider is the Jupyter QT Console; this is a Python interpreter with added functionality, aimed specifically at data analysis.

The following screenshot shows the Jupyter QT Console:

It is very similar to the Jupyter Notebook. In fact, it is effectively the Console version of the Jupyter Notebook. Notice here that we have some interesting syntax. We have In [1], and then let's suppose you were to type in a command, for example:

print("Hello, world!")

We see some output and then we see In [2].

Now let's try something else:

1 + 1

Right after In [2], we see Out[2]. What does this mean? This is a way to track historical commands and their outputs in a session. To access, say, the command for In [42], we type _i42. So, in this case, if we want to see the input for command 2, we type _i2. Notice that it gives us a string, 1 + 1. In fact, we can run this string.

If we type eval(_i2), notice that it gives us the same output as the original command, In [2], did. Now, how about Out[2]? How can we access the actual output? In this case, we type _ followed by the number of the output, say _2. This should give us 2. This gives us a convenient way to access historical commands and their outputs.

Another advantage of the Jupyter QT Console is that it can display images. For example, let's get Matplotlib running. First we're going to import Matplotlib with the following command:

import matplotlib.pyplot as plt
  

After we've imported Matplotlib, recall that we need to run a certain magic, the Matplotlib magic:

%matplotlib inline
  

We need to give it the inline parameter, and now we can create a Matplotlib figure. Notice that the image shows up right below the command. When we type in _8, it shows that a Matplotlib object was created, but it does not actually show the plot itself. As you can see, we can use the Jupyter console in a more advanced way than the typical Python console. For example, let's work with a dataset called Iris; import it using the following line:

from sklearn.datasets import load_iris
  

This is a very common dataset used in data analysis. It's often used as a way to evaluate training models. We will also use k-means clustering on this:

from sklearn.cluster import KMeans
  

The load_iris function isn't actually the Iris dataset; it is a function that we can use to get the Iris dataset. The following command will actually give us access to that dataset:

iris = load_iris()
  

Now we will train a k-means clustering scheme on this dataset:

iris_clusters = KMeans(n_clusters=3, init="random").fit(iris.data)
  

We can see the documentation right away when we're typing in a function. For example, I can see what the n_clusters parameter means; what is shown is the original docstring from the function. Here, I want the number of clusters to be 3, because I know that there are actually three real clusters in this dataset. Now that a clustering scheme has been trained, we can plot it using the following code:

plt.scatter(iris.data[:, 0], iris.data[:, 1], c = iris_clusters.labels_)
  

Spyder

Spyder is an IDE, unlike the Jupyter Notebook or the Jupyter QT Console. It integrates NumPy, SciPy, Matplotlib, and IPython. It is extensible with plugins, and it is included with Anaconda.

The following screenshot shows Spyder, an actual IDE intended for data analysis and scientific computing:

Spyder Python 3.6

On the right, you can go to File explorer to search for new files to load. Here, we want to open up iris_kmeans.py. This is a file that contains all the commands that we used before in the Jupyter QT Console. Notice on the right that the editor has a console; that is in fact the IPython console, which you saw as the Jupyter QT Console. We can run this entire file by clicking on the Run tab. It will run in the console, shown as follows:

The following screenshot will be the output:

Notice that at the end we see the result of the clustering that we saw before. We can type in commands interactively as well; for example, we can make our computer say Hello, world!.

In the editor, let's type in a new variable, let's say n = 5. Now let's run this file in the editor. Notice that n is a variable that the editor is aware of. Now let's make a change, say n = 6. Unless we were to actually run this file again, the console will be unaware of the change. So if I were to type n in the console again, nothing changes, and it's still 5. You would need to run this line in order to actually see a change.

We also have a variable explorer where we can see the values of variables and change them. For example, I can change the value of n from 6 to 10, shown as follows:

The following screenshot shows the output:

Then, when I go to the console and ask what n is, it will say 10:

n
10
  

That concludes our discussion of Spyder.

Rodeo

Rodeo is a Python IDE developed by Yhat and intended exclusively for data analysis applications. It is intended to emulate the RStudio IDE, which is popular among R users, and it can be downloaded from Rodeo's website.

The base Python interpreter is another alternative. Its only advantage is that every Python installation includes it.

ptpython

A lesser-known console-based Python REPL is ptpython, an independent project developed by Jonathan Slenders. It exists only in the console, and you can find it on GitHub. It is lightweight, yet it also includes syntax highlighting, autocompletion, and even IPython integration. It can be installed with the following command:

pip install ptpython
  

That concludes our discussion of alternatives to Jupyter Notebooks.

Package management with Conda

We will now discuss package management with Conda. In this section, we're going to take a look at the following topics:

  • What is Conda?
  • Managing Conda environments
  • Managing Python with Conda
  • Managing packages with Conda

What is Conda?

So what is Conda? Conda is the Anaconda package manager. Conda allows us to create and manage multiple environments, allowing multiple versions of Python, R, and their relevant packages to exist. This can be very useful if you need to develop for different systems with different versions of Python and their packages. Conda allows you to manage Python and R versions, and it also facilitates installation and management of packages.

Conda environment management

A Conda environment allows developers to use and manage different versions of Python and its packages. This can be useful for testing and development on legacy systems. Environments can be saved, cloned, and exported so that others can replicate results.

Here are some common environment management commands.

For environment creation:

conda create --name env_name prog1 prog2
conda create --name env_name python=3 prog3
  

For listing environments:

conda env list
  

To verify the environment:

conda info --envs
  

To clone the environment:

conda create --name new_env --clone old_env
  

To remove environments:

conda remove --name env_name --all
  

Users can share environments by creating a YAML file, which recipients can use to construct an identical environment. You can do this by hand, where you effectively replicate what Anaconda would make, but it is much easier to have Anaconda create a YAML file for you.

After you have created such a file, or if you've received this file from another user, it is very easy to create a new environment.
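The commands for this workflow are not shown above, so the following is a hedged sketch of what they typically look like, written as Python helpers that only build the command lines without running them. The environment name analysis and the file name environment.yml are placeholders, and the helper functions themselves are hypothetical.

```python
def export_environment_command(env_name):
    """Command that prints the named environment as YAML on standard output."""
    return ["conda", "env", "export", "--name", env_name]


def create_from_file_command(yaml_path):
    """Command that builds an environment from a YAML file."""
    return ["conda", "env", "create", "--file", yaml_path]


# `analysis` and `environment.yml` are placeholder names.
export_cmd = export_environment_command("analysis")
create_cmd = create_from_file_command("environment.yml")

# In a shell, the export output would typically be redirected into the file.
print(" ".join(export_cmd), "> environment.yml")
print(" ".join(create_cmd))
```

Building the commands as lists keeps them ready to hand to subprocess.run if you prefer to drive Conda from a script rather than type the commands yourself.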

Managing Python

As mentioned earlier, Anaconda allows you to manage multiple versions of Python. It is possible to search and see which versions of Python are available for installation. You can verify which version of Python is in an environment, and you can even create environments for Python 2.7. You can also update the version of Python that is in a current environment.
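One simple way to verify the Python version from inside an activated environment is to ask the interpreter itself. This is a minimal sketch using only the standard library; the Python 3 check is a hypothetical guard added for illustration.

```python
import sys

# sys.version_info describes the interpreter that is running this code.
major, minor = sys.version_info[:2]
version_label = f"{major}.{minor}"

# Hypothetical guard: flag whether the environment provides Python 3.
is_python3 = major >= 3

print(f"Running Python {version_label}; Python 3 or later: {is_python3}")
```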

Package management

Let's suppose that we're interested in installing the package selenium, which is a package that is used for web scraping and also web testing. We can list the packages that are currently installed, and we can give the command to install a new package.

First, we should search to see whether the package is available from the Conda system. Not all packages that are available on pip are available from Conda. That said, it is still possible to install a package that is only available from pip. Ideally, though, the package is available from Conda, in which case we can install it with the following command:

conda install selenium
  

If selenium is the package we're interested in, it can be downloaded automatically from the internet, unless you have a file that Anaconda can install directly from your system.

To install packages via pip, use the following:

pip install package_name
  

Packages, of course, can be removed as follows:

conda remove selenium
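After installing or removing a package, it can be useful to confirm the result from Python itself. The following sketch checks whether a module can be found in the current environment; the helper name is_installed and the made-up module name are assumptions for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# `json` ships with Python, so it should always be found.
print(is_installed("json"))

# A made-up module name that should not exist anywhere.
print(is_installed("no_such_package_for_this_example"))

# The result here depends on whether selenium has been installed.
print(is_installed("selenium"))
```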
  

Setting up a database

We'll now begin discussing setting up a database for you to use. In this section, we're going to look at the following topics:

  • Installing MySQL
  • Installing MySQL connector for Python
  • Creating, using, and deleting databases

MySQL connector is necessary in order to use MySQL with Python. There are many SQL database implementations in existence, and while MySQL may not be the simplest database management system, it is full-featured, it is industrial-strength, it is commonly seen in real world situations, and furthermore, it is free and open source, which means it's an excellent tool to learn on. You can obtain the MySQL Community Edition, which is the free and open source version, from MySQL's website (go to https://dev.mysql.com/downloads/).

Installing MySQL

For Linux systems, if possible, I recommend that you install MySQL using whatever package management system is available to you: YUM if you're using a Red Hat-based distribution, APT if you're using a Debian-based distribution, or SUSE's repository system. If you do not have a package management system, you may need to install MySQL from source.

Windows users can install MySQL directly from the MySQL website. You should also be aware that MySQL comes in 32-bit and 64-bit binaries, but whatever program you download will likely install the correct version for your system.

Here is the web page from where you can download MySQL for Windows:

I recommend that you use the MySQL Installer. Scroll down, and when you're looking for which binary to download, be aware that this first binary says web community. This is going to be an installer that downloads MySQL from the internet as you're doing the installation. Notice that it's much smaller than the other binary. It basically includes everything you need in order to be able to install MySQL. This would be the one I would recommend you download if you're following along.

There are generally available releases; these should be stable. Next to the generally available releases tab are the development releases; I recommend that you do not download these unless you know what you're doing.

MySQL connectors

MySQL functions like a driver on your system, and other applications interact with MySQL as if it were a driver. So, you will need to download a MySQL connector in order to be able to use MySQL with Python. This will allow Python to communicate with MySQL. What you will end up doing is loading in a package, and you will start up a connection with MySQL. The Python connector can be downloaded from MySQL's website (go to https://dev.mysql.com/downloads/connector/).

This web page is the same for every operating system, so you will need to select the appropriate platform, such as Linux, OS X, or Windows. You'll need to select and download the installer that best matches your system's architecture (32-bit or 64-bit) and your version of Python. You will then use the install wizard to install it on your system.

Here is the page for downloading and installing the connector:

Notice that we can choose here which platform is appropriate. We even have platform-independent and source code versions. It may also be possible to install this using a package management system, such as APT if you're using a Debian-based system such as Ubuntu, or YUM if you're using a Red Hat-based system, and so on. We have many different installers, so we will need to be aware of which version of Python we're using. It is recommended that you use the version that is closest to the one that is actually being used in your project. You'll also need to choose between 32-bit and 64-bit. Then you click on download and follow the instructions of the installer.

So, database management is a major topic; to go into everything about database management would take us well beyond the scope of this book. We're not going to talk about how a good database is designed; I recommend that you go to another resource, perhaps another Packt product that would explain these topics, because they are important. Regarding SQL, we will tell you only the commands that you need to use SQL at a basic level. There's also no discussion on permissions, so we're going to assume that your database gives full permission to whichever user is using it, and there's only one user at a time.

Creating a database

After installing MySQL, we can create a database from the MySQL command line with the following command, followed by the name of the database:

create database database_name;

Every command must be ended by a semicolon; otherwise, MySQL will keep waiting for more input until the command is terminated.

You can see all available databases with this command:

show databases;

We can specify which database we want to use with the following command:

use database_name;

If we wish to delete a database, we can do so with the following command:

drop database database_name;

Here is the MySQL command line:

Let's practice managing databases. We can create a database with the following command:

create database mydb;

To see all databases, we can use this command:

show databases;

There are multiple databases here, some of which are from other projects, but as you can see, the database mydb, which we just created, is shown as follows:

If we want to use this database, the command use mydb can be used. MySQL says the database has been changed. What this means is that when I issue commands such as creating tables, reading from tables, or adding new data, all of this will be done with the database mydb.

Let's say we want to delete the database mydb; we can do so with the following command:

drop database mydb;

This will delete the database.
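When these statements are later issued from Python rather than typed by hand, it can help to generate them in one place. The following is only a sketch: the helper database_statements is hypothetical, and it merely builds the statement text without connecting to MySQL.

```python
import re


def database_statements(name):
    """Build the create/show/use/drop statements for one database name."""
    # Allow only letters, digits, and underscores so the name is safe to embed.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsupported database name: {name!r}")
    return [
        f"CREATE DATABASE {name};",
        "SHOW DATABASES;",
        f"USE {name};",
        f"DROP DATABASE {name};",
    ]


statements = database_statements("mydb")
for statement in statements:
    print(statement)
```

Restricting the name to a small set of characters is a deliberate choice here, since database names cannot be passed as ordinary query parameters.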

Summary

In this chapter, we were introduced to Anaconda, learned why it is a useful starting point, downloaded it, and installed it. We explored Jupyter Notebooks and some alternatives to Jupyter, covered package management with Conda, and learned how to set up a MySQL database. Throughout the rest of the book, we'll presume Anaconda has been installed. In the next chapter, we will talk about using NumPy, a useful package in data analysis. Without this package, data analysis with Python would be all but impossible.


Key benefits

  • Explore the tools you need to become a data analyst
  • Discover practical examples to help you grasp data processing concepts
  • Walk through hierarchical indexing and grouping for data analysis

Description

Python, a multi-paradigm programming language, has become the language of choice for data scientists for visualization, data analysis, and machine learning. Hands-On Data Analysis with NumPy and Pandas starts by guiding you in setting up the right environment for data analysis with Python, along with helping you install the correct Python distribution. In addition to this, you will work with the Jupyter notebook and set up a database. Once you have covered Jupyter, you will dig deep into Python’s NumPy package, a powerful extension with advanced mathematical functions. You will then move on to creating NumPy arrays and employing different array methods and functions. You will explore Python’s pandas extension, which will help you get to grips with data mining and learn to subset your data. Last but not least, you will grasp how to manage your datasets by sorting and ranking them. By the end of this book, you will have learned to index and group your data for sophisticated data analysis and manipulation.

Who is this book for?

Hands-On Data Analysis with NumPy and Pandas is for you if you are a Python developer and want to take your first steps into the world of data analysis. No previous experience of data analysis is required to enjoy this book.

What you will learn

  • Understand how to install and manage Anaconda
  • Read, sort, and map data using NumPy and pandas
  • Find out how to create and slice data arrays using NumPy
  • Discover how to subset your DataFrames using pandas
  • Handle missing data in a pandas DataFrame
  • Explore hierarchical indexing and plotting with pandas

Product Details

Publication date : Jun 29, 2018
Length: 168 pages
Edition : 1st
Language : English
ISBN-13 : 9781789530797


Table of Contents

7 Chapters
Setting Up a Python Data Analysis Environment
Diving into NumPY
Operations on NumPy Arrays
pandas are Fun! What is pandas?
Arithmetic, Function Application, and Mapping with pandas
Managing, Indexing, and Plotting
Other Books You May Enjoy

Customer reviews

Rating distribution
2.9 out of 5 (7 ratings)
5 star: 28.6%
4 star: 14.3%
3 star: 0%
2 star: 28.6%
1 star: 28.6%

Top Reviews

Akshay, Jan 02, 2024
Rating: 5 out of 5
Excellent book that gets down to the basics!
Subscriber review (Packt)

S. Sankara Subramanian, Sep 10, 2018
Rating: 5 out of 5
no specific comments
Amazon Verified review

Amazon Customer, Aug 27, 2018
Rating: 4 out of 5
Would recommend this book to those who have a background in data analysis and are untrained in using Python.
Pros - This book delivers exactly what is written in the title, no more, no less. The writing style is introductory and there are plenty of examples. The book addresses how to clean data using Python, which is mandatory when performing data analysis. Examples discussed in this book could be used to supplement references which are less practical.
Cons - The editing uses incorrect fonts on words that refer to technical terms. For example, some Python functions in this book are type-font, but the editor frequently omits this formatting. Many screenshots include cursors. Some sections, such as the linear algebra section, explain how to implement code but do not explain the context or give references.
Amazon Verified review

BBCReview, Sep 30, 2021
Rating: 2 out of 5
The book has good content on NumPy and pandas, but you can't read the code snippets without a magnifying glass, or worse yet, zooming each one. Not the fault of the author, but it's darn hard to follow when it takes 10 seconds to read each snippet.
Amazon Verified review

Philip H, Sep 15, 2018
Rating: 2 out of 5
The explanations are reasonable, although the book could have been written much more concisely. The examples are written in tiny fonts.
Amazon Verified review

FAQs

What is included in a Packt subscription?

A subscription gives you full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. There you will see the ‘cancel subscription’ button in the grey box that contains your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle, which is a month starting from the day of the subscription payment. You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage (subscription.packtpub.com) by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often-changing code base of new versions and new technologies as up to date as possible. However, there will be rare cases when we cannot make downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid subscription or an active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way for us to get our content to you more quickly, but the method of buying an Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: there are always new versions, new frameworks, and new techniques. This feature gives you a head start on our content as it's being created. With Early Access, you'll receive each chapter as it's written and get regular updates throughout the product's development, as well as the final course as soon as it's ready.

We created Early Access as a means of giving you the information you need as soon as it's available. As we go through the process of developing a course, 99% of it can be ready, but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.