Book

Mastering Machine Learning with Spark 2.x

The purpose of machine learning is to build systems that learn from data. With the meteoric rise of machine learning, developers are keen to find out how they can make their Spark applications smarter. The book begins by defining machine learning primitives with the MLlib and H2O libraries. You will learn how to use binary classification to detect the Higgs boson particle in the huge amount of data produced by the CERN particle collider, and how to classify daily health activities using ensemble methods for multi-class classification. Finally, you will build different pattern mining models using MLlib, perform complex manipulation of DataFrames using Spark and Spark SQL, and deploy your app in a Spark Streaming environment.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

11h20m

Language

English

About Book

Who Is This Book For?

Are you a developer with a background in machine learning and statistics who feels limited by today's slow, “small data” machine learning tools? Then this is the book for you! In this book, you will create scalable machine learning applications to power a modern data-driven business using Spark. We assume that you already know machine learning concepts and algorithms, have Spark up and running (whether on a cluster or locally), and have a basic knowledge of the various libraries contained in Spark.

Book content

8 chapters, 11h20m total length

Introduction to Large Scale Machine Learning

Detecting Dark Matter: The Higgs-Boson Particle

Ensemble Methods for Multi-Class Classification

Predicting Movie Reviews using NLP and Spark Streaming

Online Learning with Word2Vec

Extracting Patterns from Clickstream Data

Graph Analytics with GraphX

Lending Club Loan Prediction
