Book

Mastering Apache Spark 2.x

Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality, including graph processing, machine learning, and stream processing. This book familiarizes you with the newest features in Apache Spark 2.x and takes you through complex Big Data processing, analytics, streaming analytics, and advanced machine learning with Apache Spark. Over the course of the book, you will use Apache Spark modules such as Spark SQL, Spark MLlib, Spark Streaming, and SparkML to build efficient data processing solutions. By the end of this book, you will have the knowledge needed to use Apache Spark effectively in your day-to-day tasks.

Difficulty Level

Intermediate

Completion Time

11h48m approx.

Language

English

Certification

Not available

Book Content

14 chapters • 11h48m total length

1. A first taste and what’s new in Apache Spark V2

2. Apache Spark SQL

3. The Catalyst Optimizer

4. Project Tungsten

5. Apache Spark Streaming

6. Structured Streaming

7. Apache Spark MLlib

8. Apache SparkML

9. Apache SystemML

10. Deep Learning on Apache Spark with DeepLearning4J, Apache SystemML, and H2O

11. Apache Spark GraphX

12. Apache Spark GraphFrames

13. Apache Spark with Jupyter Notebooks on IBM DataScience Experience

14. Apache Spark on Kubernetes
