Data Engineering with Python
This book is a comprehensive introduction to building data pipelines that will have you moving and transforming data in no time. You'll learn how to build data pipelines, transform and clean data, and deliver it to provide value to users. You'll also learn to deploy production data pipelines that include logging, monitoring, and version control.
Difficulty Level: Intermediate
Completion Time: 11h52m
Language: English
About Book
Who Is This Book For?
This book is for data analysts, ETL developers, and anyone looking to get started in data engineering, transition into the field, or refresh their knowledge of data engineering using Python. It will also be useful for students planning a career in data engineering and for IT professionals preparing for a transition. No previous knowledge of data engineering is required.
Book content
chapters • 11h52m total length
What is Data Engineering?
Building Our Data Engineering Infrastructure
Reading and Writing Files
Working with Databases
Cleaning, Transforming, and Enriching Data
Building a 311 Data Pipeline
Features of a Production Pipeline
Version Control Using the NiFi Registry
Monitoring and Logging Pipelines
Deploying your Pipelines
Building a Production Data Pipeline
Building a Kafka Cluster
Streaming Data with Apache Kafka
Data Processing with Apache Spark
Real-Time Edge Data with MiNiFi, Kafka, and Spark
Appendix