
Scala and Spark for Big Data Analytics

Over the last few years, Scala has seen growing adoption, especially in data science and analytics, alongside Apache Spark, which is built on Scala and widely used for analytics. With this book, you’ll learn how to leverage the power of both Scala and Spark to make sense of big data.

Offered by Packt


About Book

Who Is This Book For?

Anyone who wants to learn how to perform data analysis with Spark will find this book useful. No knowledge of Spark or Scala is assumed, although prior programming experience (especially with other JVM languages) will help you pick up the concepts more quickly.

Book content

chapters 26h32m total length

Introduction to Scala

Object-oriented Scala

Functional programming concepts

Collections API

Tackle Big Data: Spark Comes to the Party

Start Working with Spark: REPL and RDDs

Special RDD Operations

Introduce a Little Structure: Spark SQL

Stream Me Up, Scotty: Spark Streaming

Everything is Connected: GraphX

Learning Machine Learning: Spark MLlib

Advanced Machine Learning Best Practices

My Name is Bayes, Naive Bayes

Time to Put Some Order: Cluster Your Data with Spark MLlib

Text Analytics using Spark ML

Spark Tuning

Time to Go to ClusterLand: Deploy Spark on a Cluster

Testing and Debugging Spark

PySpark and SparkR

Appendix A - Accelerating Spark with Alluxio

Appendix B - Interactive Data Analytics with Apache Zeppelin
