Python: End-to-end Data Analysis

Leverage the power of Python to clean, scrape, analyze, and visualize your data

Product type: Course
Published: May 2017
Publisher: Packt
ISBN-13: 9781788394697
Length: 931 pages
Edition: 1st Edition
Authors (5): Luiz Felipe Martins, Ivan Idris, Phuong Vo.T.H, Martin Czygan, Magnus Vilhelm Persson

What you need for this learning path

Module 1:

There are not too many requirements to get started. You will need a Python programming environment installed on your system. Under Linux and Mac OS X, Python is usually installed by default. Installation on Windows is supported by an excellent installer provided and maintained by the community. This book uses a recent Python 2, but many examples will work with Python 3 as well.

The versions of the libraries used in this book are the following: NumPy 1.9.2, Pandas 0.16.2, matplotlib 1.4.3, tables 3.2.2, pymongo 3.0.3, redis 2.10.3, and scikit-learn 0.16.1. As these packages are all hosted on PyPI, the Python Package Index, they can be easily installed with pip. To install NumPy, you would write:

$ pip install numpy
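To reproduce the exact versions listed above rather than the latest releases, the pins can be written in pip's name==version requirement syntax. The following is a minimal sketch: the MODULE_1_VERSIONS dictionary and the requirement_lines helper are illustrative names that do not come from the book, and requirements.txt is only the conventional file name. These releases are old, so they may not install on a current interpreter.

```python
# Versions quoted in the text for Module 1, keyed by PyPI package name.
MODULE_1_VERSIONS = {
    "numpy": "1.9.2",
    "pandas": "0.16.2",
    "matplotlib": "1.4.3",
    "tables": "3.2.2",
    "pymongo": "3.0.3",
    "redis": "2.10.3",
    "scikit-learn": "0.16.1",
}


def requirement_lines(versions):
    """Return pip requirement specifiers such as 'numpy==1.9.2', sorted by name."""
    return ["{}=={}".format(name, version)
            for name, version in sorted(versions.items())]


if __name__ == "__main__":
    # Print the pins, one per line. Saving this output as requirements.txt
    # lets `pip install -r requirements.txt` install all of them at once.
    print("\n".join(requirement_lines(MODULE_1_VERSIONS)))
```

A single package can be pinned the same way on the command line, for example pip install numpy==1.9.2.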

If you are not using them already, we suggest you take a look at virtual environments for managing isolated Python environments on your computer. For Python 2, there are two packages of interest: virtualenv and virtualenvwrapper. Since Python 3.3, the standard library has included a tool that serves the same purpose: the venv module (https://docs.python.org/3/library/venv.html), originally invoked through the pyvenv script.
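On Python 3, an isolated environment can also be created from Python code through the standard library's venv module. The sketch below is illustrative only: the create_environment helper and the analysis-env directory name are assumptions, and the more common route is simply running python -m venv followed by a directory name in a terminal.

```python
import os
import tempfile
import venv


def create_environment(path):
    """Create a virtual environment at `path` using the standard library."""
    # with_pip=False keeps this sketch fast and offline; pass with_pip=True
    # to bootstrap pip into the new environment.
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(path)
    return path


if __name__ == "__main__":
    # A throwaway directory is used here; in practice pick a project folder.
    target = os.path.join(tempfile.mkdtemp(), "analysis-env")
    create_environment(target)
    # pyvenv.cfg marks the directory as a virtual environment.
    print(os.path.isfile(os.path.join(target, "pyvenv.cfg")))
```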

Most libraries will have an attribute for the version, so if you already have a library installed, you can quickly check its version:

>>> import redis
>>> redis.__version__
'2.10.3'

This works well for most libraries. A few, such as pymongo, use a different attribute (pymongo uses just version, without the underscores).

While all the examples can be run interactively in a Python shell, we recommend using IPython. IPython started as a more versatile Python shell, but has since evolved into a powerful tool for exploration and sharing. We used IPython 4.0.0 with Python 2.7.10. IPython is a great way to work interactively with Python, be it in the terminal or in the browser.
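Returning to the version check: because the attribute name varies between libraries, a small helper can try both spellings. This is a sketch, not something the book provides. The get_version function is a hypothetical name, and the demo uses stand-in module objects so that it runs without redis or pymongo installed.

```python
import types


def get_version(module):
    """Return a module's version string, or None if neither attribute is set."""
    # Most libraries expose __version__; a few (pymongo, per the text)
    # expose version without the underscores.
    for attribute in ("__version__", "version"):
        value = getattr(module, attribute, None)
        if isinstance(value, str):
            return value
    return None


if __name__ == "__main__":
    # Stand-in modules that mimic the two attribute styles.
    fake_redis = types.ModuleType("fake_redis")
    fake_redis.__version__ = "2.10.3"
    fake_pymongo = types.ModuleType("fake_pymongo")
    fake_pymongo.version = "3.0.3"
    print(get_version(fake_redis))
    print(get_version(fake_pymongo))
```

With the real libraries installed, the same call would be get_version(redis) after import redis.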

Module 2:

First, you need a Python 3 distribution. I recommend the full Anaconda distribution as it comes with the majority of the software we need. I tested the code with Python 3.4 and the following packages:

• joblib 0.8.4

• IPython 3.2.1

• NetworkX 1.9.1

• NLTK 3.0.2

• Numexpr 2.3.1

• pandas 0.16.2

• SciPy 0.16.0

• seaborn 0.6.0

• sqlalchemy 0.9.9

• statsmodels 0.6.1

• matplotlib 1.5.0

• NumPy 1.10.1

• scikit-learn 0.17

• dautil 0.0.1a29

For some recipes, you need to install extra software, but this is explained whenever the software is required.
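One way to see how an existing installation compares with the tested versions listed above is to query the installed package metadata. The sketch below assumes Python 3.8 or later, where importlib.metadata is part of the standard library; that is newer than the Python 3.4 this module was tested with. The TESTED_VERSIONS dictionary and compare_versions helper are illustrative names, and the lowercase distribution names are an assumption about how each package is registered.

```python
from importlib import metadata  # standard library since Python 3.8

# Versions quoted in the text for Module 2.
TESTED_VERSIONS = {
    "joblib": "0.8.4",
    "ipython": "3.2.1",
    "networkx": "1.9.1",
    "nltk": "3.0.2",
    "numexpr": "2.3.1",
    "pandas": "0.16.2",
    "scipy": "0.16.0",
    "seaborn": "0.6.0",
    "sqlalchemy": "0.9.9",
    "statsmodels": "0.6.1",
    "matplotlib": "1.5.0",
    "numpy": "1.10.1",
    "scikit-learn": "0.17",
    "dautil": "0.0.1a29",
}


def compare_versions(tested, lookup=metadata.version):
    """Map each package to (tested, installed); installed is None if absent."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in sorted(compare_versions(TESTED_VERSIONS).items()):
        print("{:<14} tested {:<10} installed {}".format(name, wanted, installed))
```

A mismatch does not necessarily mean a recipe will fail; the report only shows where an environment differs from the one the code was tested in.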

Module 3:

All you need to follow through the examples in this book is a computer running any recent version of Python. While the examples use Python 3, they can easily be adapted to work with Python 2, with only minor changes. The packages used in the examples are NumPy, SciPy, matplotlib, Pandas, statsmodels, PyMC, and scikit-learn. Optionally, the packages basemap and cartopy are used to plot coordinate points on maps. The easiest way to obtain and maintain a Python environment that meets all the requirements of this book is to download a prepackaged Python distribution. In this book, we have checked all the code against Continuum Analytics' Anaconda Python distribution and Ubuntu Xenial Xerus (16.04) running Python 3.

To download the example data and code, an Internet connection is needed.
