Mastering Apache Spark 2.x
Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality, including graph processing, machine learning, and stream processing. This book familiarizes you with the newest features in Apache Spark 2.x and takes you through complex Big Data processing, analytics, streaming analytics, and advanced machine learning with Apache Spark. Over the course of the book, you will use Apache Spark modules such as Spark SQL, Spark MLlib, Spark Streaming, and SparkML to build efficient data processing solutions. By the end of this book, you will have the knowledge you need to use Apache Spark effectively in your day-to-day tasks.
Difficulty Level
Intermediate
Completion Time
11h48m
Language
English
About Book
Who Is This Book For?
If you are an intermediate-level Spark developer looking to master the advanced capabilities and use cases of Apache Spark 2.x, this book is for you. Big Data professionals who want to learn how to integrate and use the features of Apache Spark to build a strong Big Data pipeline will also find this book useful. A fundamental knowledge of Apache Spark and the Scala programming language is assumed.
Book content
chapters • 11h48m total length
A first taste and what’s new in Apache Spark V2
Apache Spark SQL
The Catalyst Optimizer
Project Tungsten
Apache Spark Streaming
Structured Streaming
Apache Spark MLlib
Apache SparkML
Apache SystemML
Deep Learning on Apache Spark with DeepLearning4J, Apache SystemML, H2O
Apache Spark GraphX
Apache Spark GraphFrames
Apache Spark with Jupyter Notebooks on IBM DataScience Experience
Apache Spark on Kubernetes