Book

Cleaning Data for Effective Data Science

Data in its raw state is rarely ready for productive analysis. This book teaches you not only how to prepare data, but also which questions to ask of it. It focuses as much on the thought processes necessary for successful data cleaning as on concise, precise code examples that express those thoughts.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

16h36m

Language

English

About Book

Who Is This Book For?

This book is designed to benefit software developers, data scientists, aspiring data scientists, teachers, and students who work with data. If you want to improve your rigor in data hygiene or are looking for a refresher, this book is for you. Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful.

Book content

Chapters: 16h36m total length

Data Ingestion - Tabular Formats

Data Ingestion - Hierarchical Formats

Data Ingestion - Repurposing Data Sources

The Vicissitudes of Error - Anomaly Detection

The Vicissitudes of Error - Data Quality

Rectification and Creation - Value Imputation

Rectification and Creation - Feature Engineering

Ancillary Matters - Closure/Glossary
