Data Cleaning and Exploration with Machine Learning
Data scientists are often said to spend about 80% of their time cleaning and manipulating data and only 20% analyzing it. That effort is crucial, since analyzing dirty data can lead to inaccurate decisions. This timely book will help you identify, diagnose, and treat data cleaning problems in Python using advanced ML techniques.
Difficulty Level
Intermediate
Completion Time
18h4m
Language
English
About Book
Who Is This Book For?
This book is for professional data scientists, particularly those in the first few years of their career, or more experienced analysts who are relatively new to machine learning. Readers should have prior knowledge of concepts in statistics typically taught in an undergraduate introductory course as well as beginner-level experience in manipulating data programmatically.
Book Content
16 chapters • 18h4m total length
Examining the Distribution of Features and Targets
Examining Bivariate and Multivariate Relationships between Features and Targets
Identifying and Fixing Missing Values
Encoding, Transforming, and Scaling Features
Feature Selection
Preparing for Model Evaluation
Linear Regression Models
Support Vector Regression
K-Nearest Neighbor, Decision Tree, Random Forest and Gradient Boosted Regression
Logistic Regression
Decision Trees and Random Forest Classification
K-Nearest Neighbors for Classification
Support Vector Machine Classification
Naive Bayes Classification
Principal Component Analysis
K-Means and DBSCAN Clustering