Book

Mastering Spark for Data Science

Placing the reader in the position of a commercial data scientist, this book covers the key techniques needed to solve real-world problems in areas such as music, financial markets, and global news. It introduces advanced techniques in Spark and explores the surrounding ecosystem in depth, with innovative and scalable solutions throughout.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

18h40m

Language

English

About Book

Who Is This Book For?

This book is for those who have beginner-level familiarity with the Spark architecture and data science applications, especially those who are looking for a challenge and want to learn cutting-edge techniques. It assumes working knowledge of data science, common machine learning methods, and popular data science tools, and that you have previously run proof-of-concept studies and built prototypes.

Book content

14 chapters, 18h40m total length

The Big Data Science Ecosystem

Data Acquisition

Input Formats and Schema

Exploratory Data Analysis

Spark for Geographic Analysis

Scraping Link-Based External Data

Building Communities

Building a Recommendation System

News Dictionary and Real-Time Tagging System

Story Deduplication and Mutation

Anomaly Detection on Sentiment Analysis

TrendCalculus

Secure Data

Scalable Algorithms
