You're reading from Biostatistics with Python Apply Python for biostatistics with hands-on biomedical and biotechnology projects

Product type Paperback

Published in Nov 2024

Publisher Packt

ISBN-13 9781837630967

Length 374 pages

Edition 1st Edition

Languages

Python

Tools

NetApp ONTAP

Concepts

Statistics

Author (1):

Darko Medin

Table of Contents (24) Chapters

Preface

1. Part 1: Introduction to Biostatistics and Getting Started with Python

2. Chapter 1: Introduction to Biostatistics

3. Chapter 2: Getting Started with Python for Biostatistics

4. Chapter 3: Exercise 1 – Cleaning and Describing Data Using Python

5. Chapter 4: Part 1 Exemplar Project – Load, Clean, and Describe Diabetes Data in Python

6. Part 2: Introduction to Python for Biostatistics – Methodology and Examples

7. Chapter 5: Introduction to Python for Biostatistics

8. Chapter 6: Biostatistical Inference Using Hypothesis Tests and Effect Sizes

9. Chapter 7: Predictive Biostatistics Using Python

10. Chapter 8: Part 2 Exercise – T-Test, ANOVA, and Linear and Logistic Regression

11. Chapter 9: Biostatistical Inference and Predictive Analytics Using Cardiovascular Study Data

12. Part 3: Clinical Study Design, Analysis, and Synthesizing Evidence

13. Chapter 10: Clinical Study Design

14. Chapter 11: Survival Analysis in Biomedical Research

15. Chapter 12: Meta-Analysis – Synthesizing Evidence from Multiple Studies

16. Chapter 13: Survival Predictive Analysis and Meta-Analysis Practice

17. Chapter 14: Part 3 Exemplar Project – Meta-Analysis of Survival Data in Clinical Research

18. Part 4: Biological and Statistical Variables and Frameworks, and a Final Practical Project from the Field of Biology

19. Chapter 15: Understanding Biological Variables

20. Chapter 16: Data Analysis Frameworks and Performance for Life Sciences Research

21. Chapter 17: Part 4 Exercise – Performing Statistics for Biology Studies in Python

22. Index

23. Other Books You May Enjoy

Cleaning missing values and invalid data

By default, the pandas read_csv() function reads a variable as non-numeric (string) if it contains at least one text value that it cannot interpret as a number or as a missing value. So, what's the difference between the nan instances in the Petal_width column and those in the Sepal_width column? pandas converts empty cells into nan values but keeps the numeric nature of the variable, as is the case for the Petal_width variable.
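The difference between the two kinds of column can be checked directly. The following is a minimal sketch, not the book's own code: the tiny CSV is a made-up stand-in for the dataset, and only the column names Sepal_width and Petal_width come from the text.

```python
import io
import pandas as pd

# Hypothetical miniature of the data; the values are invented for illustration.
csv_text = """Sepal_width,Petal_width
3.5,0.2
missing,
3.2,0.2
"""

df = pd.read_csv(io.StringIO(csv_text))

# Sepal_width contains the text "missing", so the column is read as strings.
sepal_is_numeric = pd.api.types.is_numeric_dtype(df["Sepal_width"])
# Petal_width only has an empty cell, which becomes NaN in a numeric column.
petal_is_numeric = pd.api.types.is_numeric_dtype(df["Petal_width"])

print(df.dtypes)
print(sepal_is_numeric, petal_is_numeric)  # False True
```

Checking df.dtypes right after loading is a quick way to spot columns that were silently read as text.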

In biostatistics, experimenters might use different words to mark a missing value, such as Nan or NA (short for not applicable), or even whole words such as missing or not applicable. Remember that these markers are still strings, so if a cell contains a marker that pandas does not recognize as missing, pandas will read it as a string and coerce the whole variable into a string variable. By default, read_csv() recognizes a fixed set of markers, including NA and NaN, but not variants such as Nan, missing, or not applicable. This wasn't the case for Petal_width since pandas read empty cells as nan and didn't coerce the variable into a string, instead keeping it numeric. In this case, Python read nan as the valid...
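Two common ways of handling custom missing-value markers are sketched below. This is an illustration under stated assumptions rather than the chapter's own solution: the CSV content and the markers list are hypothetical, while na_values and pd.to_numeric(..., errors="coerce") are standard pandas features.

```python
import io
import pandas as pd

# Hypothetical data with two custom markers and one empty cell.
csv_text = """Sepal_width,Petal_width
3.5,0.2
Nan,
not applicable,1.4
3.2,0.2
"""

# Option 1: declare the extra markers when reading the file.
# na_values adds to the default markers (such as NA and NaN).
markers = ["Nan", "missing", "not applicable"]
df_declared = pd.read_csv(io.StringIO(csv_text), na_values=markers)

# Option 2: read the file as-is, then coerce non-numeric text to NaN.
df_raw = pd.read_csv(io.StringIO(csv_text))
sepal_clean = pd.to_numeric(df_raw["Sepal_width"], errors="coerce")

print(df_declared.dtypes)
print(sepal_clean.isna().sum())  # 2
```

Option 1 keeps the list of accepted markers explicit, whereas option 2 turns any unparseable text into NaN, including typos in real measurements, so it is worth inspecting the affected cells before coercing.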
