Apache Spark 2.x Cookbook
Apache Spark has become one of the most popular platforms, and one of the most sought-after skill sets, in Big Data, analytics, and data science. Apache Spark 2.x comes with a series of improvements in performance, scalability, and operational and production readiness for structured processing of massive datasets. This book offers a systematic, hands-on way to use its improved programming APIs and expanded SQL functionality, and to implement distributed machine learning applications with Spark ML. Over the course of the chapters, you will explore the power of Spark DataFrames/Datasets, harness MLlib for data mining, analyze complex problems with iterative or multi-stage Spark scripts, and work with associated toolsets such as Spark SQL, Spark Streaming, and GraphX.
Difficulty Level
Intermediate
Completion Time
9h48m
Language
English
About Book
Who Is This Book For?
This book is for data engineers, data scientists, and Big Data professionals who want to leverage the power of Apache Spark 2.x for real-time Big Data processing. It will also help if you are looking for quick solutions to common problems you run into when working with Spark 2.x. The book assumes you have a basic knowledge of Scala as a programming language.
Book content
12 chapters • 9h48m total length
Getting Started with Apache Spark
Developing Applications with Spark
Spark SQL
Working with External Data Sources
Spark Streaming
Getting Started with Machine Learning
Supervised Learning with MLlib – Regression
Supervised Learning with MLlib – Classification
Unsupervised Learning
Recommendations Using Collaborative Filtering
Graph Processing Using GraphX and GraphFrames
Optimizations and Performance Tuning