Book

PySpark Cookbook

This cookbook presents recipes for using Python within the Apache Spark ecosystem. By the end of this book, you will be able to tackle common problems in building effective, data-intensive applications and in performing machine learning and structured streaming with PySpark.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

11h

Language

English

About Book

Who Is This Book For?

The PySpark Cookbook is for you if you are a Python developer looking for hands-on recipes for using the Apache Spark 2.x ecosystem in the best possible way. A thorough understanding of Python (and some familiarity with Spark) will help you get the best out of the book.

Book content

8 chapters, 11h total length

Spark installation and configuration

Abstracting data with RDDs

Abstracting data with DataFrames

Preparing data for modeling

Machine Learning with MLlib

Machine Learning with ML module

Structured streaming with PySpark

GraphFrames - Graph Theory with PySpark
