Hands-On Exploratory Data Analysis with Python

Perform EDA techniques to understand, summarize, and investigate your data

Product type Paperback
Published in Mar 2020
Publisher Packt
ISBN-13 9781789537253
Length 352 pages
Edition 1st Edition
Authors (2): Suresh Kumar Mukhiya, Usman Ahmed
Table of Contents (17 chapters)

Preface
1. Section 1: The Fundamentals of EDA
2. Exploratory Data Analysis Fundamentals
3. Visual Aids for EDA
4. EDA with Personal Email
5. Data Transformation
6. Section 2: Descriptive Statistics
7. Descriptive Statistics
8. Grouping Datasets
9. Correlation
10. Time Series Analysis
11. Section 3: Model Development and Evaluation
12. Hypothesis Testing and Regression
13. Model Development and Evaluation
14. EDA on Wine Quality Data Analysis
15. Other Books You May Enjoy
Appendix

Using pandas vectorized string functions

String formatting is easier to demonstrate on a dataset that is a little messy. We will use a dataset that I collected during my Ph.D. research while writing a review paper. It can be found here: https://raw.githubusercontent.com/sureshHARDIYA/phd-resources/master/Data/Review%20Paper/preprocessed.csv

  1. Let's start by importing the libraries we need, as follows:
import numpy as np
import pandas as pd
import os

  2. Next, let's read the CSV file, keep only the TITLE column, check its structure, and display the last 10 items, as follows:
# Load the dataset directly from its URL
text = pd.read_csv("https://raw.githubusercontent.com/sureshHARDIYA/phd-resources/master/Data/Review%20Paper/preprocessed.csv")
# Keep only the TITLE column, which gives a pandas Series of strings
text = text["TITLE"]
# Number of titles, followed by the last 10 entries
print(text.shape)
print(text.tail(10))
  3. The output...
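Once the titles are held in a pandas Series, the vectorized string functions this section is named after are reached through the `.str` accessor. The following is a minimal sketch of that idea, not the book's own continuation of these steps. So that it runs without network access, it uses a small `titles` Series whose entries are invented for illustration and are not rows from the dataset above. The variable names `cleaned`, `is_review`, and `word_counts` are likewise assumptions.

```python
import pandas as pd

# Hypothetical titles, invented for illustration only;
# they are NOT taken from the preprocessed.csv dataset.
titles = pd.Series([
    "  Internet-based CBT for depression: a REVIEW ",
    "mobile apps for Mental Health",
    None,  # a missing title, to show how .str handles gaps
    "A Systematic Review of eHealth Interventions",
])

# Each .str method works element-wise on the whole Series
# and leaves missing values as missing instead of raising.
cleaned = titles.str.strip().str.lower()

# Boolean mask of titles mentioning "review";
# na=False treats missing titles as non-matches.
is_review = cleaned.str.contains("review", na=False)

# Number of whitespace-separated words in each title.
word_counts = cleaned.str.split().str.len()

print(cleaned)
print(is_review.tolist())
print(word_counts.tolist())
```

The same calls should work on the `text` Series loaded in step 2, since it is also a Series of strings. Chaining `.str` methods in this way avoids writing an explicit Python loop over the entries.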