Book

Hands-On Data Preprocessing in Python

Whether you're a data analyst who is new to programming or one who already has experience with it, this book teaches effective data preprocessing techniques from both technical and analytical perspectives. You'll explore advanced data manipulation and preprocessing techniques for building successful data analytics solutions.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

20h4m

Language

English

About Book

Who Is This Book For?

This book is for junior and senior data analysts, business intelligence professionals, engineering undergraduates, and data enthusiasts who want to preprocess and clean large amounts of data. You don’t need any prior experience with data preprocessing to get started. However, basic programming skills (working with variables, conditionals, and loops), beginner-level knowledge of Python, and some simple analytics experience are prerequisites.

Book content

18 chapters, 20h4m total length

Review of the Core Modules of NumPy and Pandas

Review of Another Core Module - Matplotlib

Data – What Is It Really?

Databases

Data Visualization

Prediction

Classification

Clustering Analysis

Data Cleaning Level I - Cleaning Up the Table

Data Cleaning Level II - Unpacking, Restructuring, and Reformulating the Table

Data Cleaning Level III - Missing Values, Outliers, and Errors

Data Fusion and Data Integration

Data Reduction

Data Transformation and Massaging

Case Study 1 - Mental Health in Tech

Case Study 2 - Predicting COVID-19 Hospitalizations

Case Study 3 - United States Counties Clustering Analysis

Summary, Practice Case Studies, and Conclusions
