Book

Data Engineering with Python

This book is a comprehensive introduction to building data pipelines that will have you moving and transforming data in no time. You'll learn how to build pipelines, clean and transform data, and deliver it in a form that provides value to users. You'll also learn to deploy production data pipelines that include logging, monitoring, and version control.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

11h52m

Language

English

About Book

Who Is This Book For?

This book is for data analysts, ETL developers, and anyone who wants to get started in data engineering, transition into the field, or refresh their knowledge using Python. It will also be useful for students planning a career in data engineering and for IT professionals preparing for a transition. No previous knowledge of data engineering is required.

Book content

Chapters: 11h52m total length

What is Data Engineering?

Building Our Data Engineering Infrastructure

Reading and Writing Files

Working with Databases

Cleaning, Transforming, and Enriching Data

Building a 311 Data Pipeline

Features of a Production Pipeline

Version Control Using the NiFi Registry

Monitoring and Logging Pipelines

Deploying your Pipelines

Building a Production Data Pipeline

Building a Kafka Cluster

Streaming Data with Apache Kafka

Data Processing with Apache Spark

Real-Time Edge Data with MiNiFi, Kafka, and Spark

Appendix
