Book

Apache Spark 2.x Cookbook

Apache Spark has become one of the hottest platforms and most sought-after skill sets in Big Data, analytics, and data science. Apache Spark 2.x comes with a series of improvements in performance, scalability, and operational and production readiness for structured processing of massive datasets. This book offers a systematic, practical, hands-on approach to using Spark's improved programming APIs and expanded SQL functionality, and to implementing distributed machine learning applications with Spark ML. Over the course of the chapters, you will explore the power of Spark DataFrames/Datasets, harness MLlib for data mining, and analyze complex problems with iterative or multi-stage Spark scripts and associated toolsets such as Spark SQL, Spark Streaming, and GraphX.

Difficulty Level

Intermediate

Completion Time

9h48m approx.

Language

English

Certification

Not available

Book Content

12 chapters • 9h48m total length

1. Getting Started with Apache Spark

2. Developing Applications with Spark

3. Spark SQL

4. Working with External Data Sources

5. Spark Streaming

6. Getting Started with Machine Learning

7. Supervised Learning with MLlib – Regression

8. Supervised Learning with MLlib – Classification

9. Unsupervised Learning

10. Recommendations Using Collaborative Filtering

11. Graph Processing Using GraphX and GraphFrames

12. Optimizations and Performance Tuning
