Python Real-World Projects

You're reading from Python Real-World Projects: Craft your Python portfolio with deployable applications

Product type Paperback
Published in Sep 2023
Publisher Packt
ISBN-13 9781803246765
Length 478 pages
Edition 1st Edition
Author (1):
Steven F. Lott


2.2 Acquisition via Extract

Since data formats are in a constant state of flux, it’s helpful to understand how to add and modify data formats. These projects will all build on Project 1.1 by adding features to the base application. The following projects are designed around alternative sources for data:

  • Project 1.2: “Acquire Web Data from an API”. This project will acquire data from web services using JSON format.

  • Project 1.3: “Acquire Web Data from HTML”. This project will acquire data from a web page by scraping the HTML.

  • Two separate projects are part of gathering data from a SQL database:

    • Project 1.4: “Build a Local Database”. This sidebar project builds a local SQL database. It is necessary because publicly accessible SQL databases are a rarity, and it’s more secure to build our own demonstration database.

    • Project 1.5: “Acquire Data from a Local Database”. Once a database is available, we can acquire data from a SQL extract.
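The two database projects can be pictured with a small sketch using Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the `samples` table, its columns, the sample values, and the function names `build_demo_database` and `extract_as_text` are hypothetical and are not taken from the book's projects.

```python
import sqlite3

# Hypothetical demonstration rows; the table and values are assumptions.
DEMO_ROWS = [
    ("I", 10.0, 8.04),
    ("I", 8.0, 6.95),
    ("II", 10.0, 9.14),
]


def build_demo_database(connection):
    """Create and populate a small local table (the sidebar step)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS samples (series TEXT, x REAL, y REAL)"
    )
    connection.executemany(
        "INSERT INTO samples (series, x, y) VALUES (?, ?, ?)", DEMO_ROWS
    )
    connection.commit()


def extract_as_text(connection):
    """Run a SQL extract and serialize every column value as text."""
    cursor = connection.execute(
        "SELECT series, x, y FROM samples ORDER BY series, x"
    )
    names = [column[0] for column in cursor.description]
    return [
        {name: str(value) for name, value in zip(names, row)}
        for row in cursor
    ]


# An in-memory database keeps the sketch self-contained.
connection = sqlite3.connect(":memory:")
build_demo_database(connection)
rows = extract_as_text(connection)
connection.close()

for row in rows:
    print(row)
```

An in-memory database stands in for a local database file here; a real project would more likely connect to a file path or a separate database server.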

These projects will focus on data represented as text. For CSV files, the data is text; an application must convert it to a more useful Python type. HTML pages are also pure text, although additional attributes are sometimes provided to suggest the text should be treated as a number. A SQL database, by contrast, is often populated with non-text data. To be consistent with the other sources, the SQL data should be serialized as text. The acquisition applications all share a common approach of working with text.
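As a minimal sketch of the text-to-Python conversion described above, the following reads CSV text with the standard `csv` module and converts each field to `float`. The `Sample` class, the column names `x` and `y`, and the sample text are assumptions made for illustration only.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical target type for one row of source data."""
    x: float
    y: float


# Assumed source text; every CSV field arrives as a string.
SOURCE_TEXT = "x,y\n10.0,8.04\n8.0,6.95\n"


def read_samples(source):
    """Convert each text row into a Sample with numeric attributes."""
    reader = csv.DictReader(source)
    return [Sample(x=float(row["x"]), y=float(row["y"])) for row in reader]


samples = read_samples(io.StringIO(SOURCE_TEXT))
print(samples)
```

Keeping the conversion in one small function makes it easier to see which fields are still raw text and which have become Python numbers.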

These applications will also minimize the transformations applied to the source data. To process the data consistently, it’s helpful to shift to a common format. As we’ll see in Chapter 3, Project 1.1: Data Acquisition Base Application, the NDJSON format provides a useful structure that can often be mapped back to source files.
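NDJSON (newline-delimited JSON) stores one JSON document per line. The sketch below, using only the standard `json` module, writes text-valued rows in that form and reads them back. The helper names `write_ndjson` and `read_ndjson` and the sample rows are hypothetical, not the book's implementation.

```python
import io
import json


def write_ndjson(rows, target):
    """Write one JSON document per line."""
    for row in rows:
        target.write(json.dumps(row) + "\n")


def read_ndjson(source):
    """Parse each non-blank line as an independent JSON document."""
    return [json.loads(line) for line in source if line.strip()]


# Assumed sample rows; values stay as text, mirroring the source data.
rows = [{"x": "10.0", "y": "8.04"}, {"x": "8.0", "y": "6.95"}]

buffer = io.StringIO()
write_ndjson(rows, buffer)
text = buffer.getvalue()
restored = read_ndjson(io.StringIO(text))

print(text, end="")
```

Because each line is a complete document, a line in the output can usually be traced back to a single row of the source file.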

After acquiring new data, it’s prudent to do a manual inspection. This is often done a few times at the start of application development. After that, inspection is only done to diagnose problems with the source data. The next few chapters will cover projects to inspect data.
