Book

Learning PySpark

This book will help you get to grips with the Spark Python API. You’ll explore how Python can be used with Spark to build scalable and reliable data-intensive applications.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

9h8m

Language

English

About Book

Who Is This Book For?

If you are a Python developer who wants to learn about the Apache Spark 2.0 ecosystem, this book is for you. A firm understanding of Python is expected to get the best out of the book. Familiarity with Spark would be useful, but is not mandatory.

Book content

chapters 9h8m total length

Understanding Spark

Installing Spark

Resilient Distributed Datasets

DataFrames

Prepare Data for Modeling

Introducing MLlib

Introducing the ML Package

GraphFrames

TensorFrames

Polyglot Persistence with Blaze

Structured Streaming

Free Spark Cloud Offering

Packaging Spark Applications
