Biostatistics with Python

You're reading from Biostatistics with Python: Apply Python for biostatistics with hands-on biomedical and biotechnology projects

Product type Paperback
Published in Nov 2024
Publisher Packt
ISBN-13 9781837630967
Length 374 pages
Edition 1st Edition
Author (1):
Darko Medin

Table of Contents (24 chapters)

Preface
1. Part 1: Introduction to Biostatistics and Getting Started with Python
2. Chapter 1: Introduction to Biostatistics
3. Chapter 2: Getting Started with Python for Biostatistics
4. Chapter 3: Exercise 1 – Cleaning and Describing Data Using Python
5. Chapter 4: Part 1 Exemplar Project – Load, Clean, and Describe Diabetes Data in Python
6. Part 2: Introduction to Python for Biostatistics – Methodology and Examples
7. Chapter 5: Introduction to Python for Biostatistics
8. Chapter 6: Biostatistical Inference Using Hypothesis Tests and Effect Sizes
9. Chapter 7: Predictive Biostatistics Using Python
10. Chapter 8: Part 2 Exercise – T-Test, ANOVA, and Linear and Logistic Regression
11. Chapter 9: Biostatistical Inference and Predictive Analytics Using Cardiovascular Study Data
12. Part 3: Clinical Study Design, Analysis, and Synthesizing Evidence
13. Chapter 10: Clinical Study Design
14. Chapter 11: Survival Analysis in Biomedical Research
15. Chapter 12: Meta-Analysis – Synthesizing Evidence from Multiple Studies
16. Chapter 13: Survival Predictive Analysis and Meta-Analysis Practice
17. Chapter 14: Part 3 Exemplar Project – Meta-Analysis of Survival Data in Clinical Research
18. Part 4: Biological and Statistical Variables and Frameworks, and a Final Practical Project from the Field of Biology
19. Chapter 15: Understanding Biological Variables
20. Chapter 16: Data Analysis Frameworks and Performance for Life Sciences Research
21. Chapter 17: Part 4 Exercise – Performing Statistics for Biology Studies in Python
22. Index
23. Other Books You May Enjoy

Validating and describing the Diabetes dataset

After loading and examining the dataset, the next step is to validate and describe it. This involves several procedures: checking for missing values (NaN), simplifying the dataset structure by removing unnecessary variables, fixing any incorrect names in the classes (the CLASS variable) and categories, and making sure the structure of the dataset matches the description on the official website.

We will perform each of these procedures here.

First, we check for missing values as follows:

# isna() flags each empty cell; .sum() then counts the empty cells in each column
data.isna().sum()
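
To make the shape of this output concrete, here is a minimal sketch on a small made-up table. The DataFrame `toy`, its values, and the AGE and BMI column names are illustrative assumptions, not taken from the Diabetes dataset. It shows that `isna().sum()` returns one count per column, and that summing those counts gives a single total for the whole table.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two missing cells; not the real Diabetes dataset
toy = pd.DataFrame({
    "AGE": [50, np.nan, 45],
    "BMI": [24.0, 23.0, np.nan],
    "CLASS": ["N", "Y", "P"],
})

# Per-column count of missing cells (a Series indexed by column name)
missing_per_column = toy.isna().sum()

# Total number of missing cells across the whole table
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing:", total_missing)
```

On a dataset with no missing values, every per-column count, and therefore the total, would be 0.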

As there are no missing values (data.isna().sum() returns 0 for every column), you can proceed to simplify the dataset structure and remove the unnecessary variables. For this project, the ID and No_Pation variables (which are just unique identifiers for samples) are not needed, so they are removed to give a simpler structure...
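
One way to remove the two identifier columns is pandas' `DataFrame.drop`. The sketch below is an assumption about how this step could look, not necessarily the code the chapter goes on to use. The small `data` frame is a made-up stand-in for the loaded dataset, and `data_simple` is a hypothetical name.

```python
import pandas as pd

# Hypothetical stand-in for the loaded dataset; the values are made up
data = pd.DataFrame({
    "ID": [1, 2, 3],
    "No_Pation": [101, 102, 103],
    "AGE": [50, 26, 45],
    "CLASS": ["N", "Y", "P"],
})

# Drop the two identifier columns; drop() returns a new DataFrame
# and leaves the original `data` unchanged
data_simple = data.drop(columns=["ID", "No_Pation"])

print(list(data_simple.columns))
```

Assigning the result to a new name keeps the original table available in case the identifiers are needed later, for example to trace a sample back to its source record.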
