You're reading from Biostatistics with Python Apply Python for biostatistics with hands-on biomedical and biotechnology projects

Product type Paperback

Published in Nov 2024

Publisher Packt

ISBN-13 9781837630967

Length 374 pages

Edition 1st Edition

Languages

Python

Tools

NetApp ONTAP

Concepts

Statistics

Author (1):

Darko Medin


Table of Contents (24) Chapters

Preface

1. Part 1: Introduction to Biostatistics and Getting Started with Python

2. Chapter 1: Introduction to Biostatistics

3. Chapter 2: Getting Started with Python for Biostatistics

4. Chapter 3: Exercise 1 – Cleaning and Describing Data Using Python

5. Chapter 4: Part 1 Exemplar Project – Load, Clean, and Describe Diabetes Data in Python

6. Part 2: Introduction to Python for Biostatistics – Methodology and Examples

7. Chapter 5: Introduction to Python for Biostatistics

8. Chapter 6: Biostatistical Inference Using Hypothesis Tests and Effect Sizes

9. Chapter 7: Predictive Biostatistics Using Python

10. Chapter 8: Part 2 Exercise – T-Test, ANOVA, and Linear and Logistic Regression

11. Chapter 9: Biostatistical Inference and Predictive Analytics Using Cardiovascular Study Data

12. Part 3: Clinical Study Design, Analysis, and Synthesizing Evidence

13. Chapter 10: Clinical Study Design

14. Chapter 11: Survival Analysis in Biomedical Research

15. Chapter 12: Meta-Analysis – Synthesizing Evidence from Multiple Studies

16. Chapter 13: Survival Predictive Analysis and Meta-Analysis Practice

17. Chapter 14: Part 3 Exemplar Project – Meta-Analysis of Survival Data in Clinical Research

18. Part 4: Biological and Statistical Variables and Frameworks, and a Final Practical Project from the Field of Biology

19. Chapter 15: Understanding Biological Variables

20. Chapter 16: Data Analysis Frameworks and Performance for Life Sciences Research

21. Chapter 17: Part 4 Exercise – Performing Statistics for Biology Studies in Python

22. Index


23. Other Books You May Enjoy

Validating and describing the Diabetes dataset

After loading and examining the dataset, the next step is to validate and describe it. This involves several procedures: checking for missing values (NaN), simplifying the dataset structure by removing unnecessary variables, fixing any incorrect names in the classes (the CLASS variable) and categories, and making sure the structure of the dataset matches the description on the official website.

We will perform each of these procedures here.

First, we check for missing values as follows:

# .isna() flags each empty cell; .sum() then counts the empty cells per column
data.isna().sum()
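
To make the shape of that output concrete, here is a minimal sketch on a small invented DataFrame (the `toy` frame, its column names, and its values are assumptions for illustration only, not the Diabetes data). It shows that `isna().sum()` returns one count per column, and that summing again gives a single total for the whole table.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two deliberately missing cells;
# this is NOT the Diabetes dataset used in the chapter.
toy = pd.DataFrame({
    "AGE": [50, 26, np.nan, 45],
    "BMI": [24.0, np.nan, 30.1, 27.5],
    "CLASS": ["N", "N", "Y", "P"],
})

# One count of missing cells per column (a pandas Series).
missing_per_column = toy.isna().sum()

# Summing the per-column counts gives the total for the whole table.
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing cells:", total_missing)
```

A column with a count of 0 has no missing cells; a table is free of missing values only when every column reports 0.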

As there are no missing values (data.isna().sum() reports 0 for every column), you can proceed to simplify the dataset structure and remove the unnecessary variables. For this project, the ID and No_Pation variables (which are just unique identifiers for samples) are not needed, so they are removed to have a simpler structure...
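
The removal of the identifier columns and the clean-up of class names mentioned earlier could look roughly like the following sketch. It runs on an invented `toy` frame: the ID and No_Pation column names come from the text, but the values, the extra AGE column, and the particular label problems (trailing whitespace, lowercase) are assumptions, since the excerpt does not show what the real CLASS labels look like.

```python
import pandas as pd

# Hypothetical toy data; values and label inconsistencies are invented.
toy = pd.DataFrame({
    "ID": [101, 102, 103],
    "No_Pation": [9001, 9002, 9003],
    "AGE": [50, 26, 45],
    "CLASS": ["N", "Y ", "n"],
})

# Drop the identifier columns; drop() returns a new frame
# and leaves the original untouched.
simplified = toy.drop(columns=["ID", "No_Pation"])

# Normalise class labels: remove surrounding whitespace, unify the case.
simplified["CLASS"] = simplified["CLASS"].str.strip().str.upper()

print(simplified.columns.tolist())
print(simplified["CLASS"].value_counts())
```

Checking `value_counts()` before and after the clean-up is a quick way to spot labels that differ only by whitespace or case.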
