Book

Data Ingestion with Python Cookbook

The Data Ingestion with Python Cookbook presents a collection of practical recipes to help you get started with ingesting data from various sources and file formats. It offers code recipes and solutions for building efficient data ingestion pipelines and for handling the common problems that arise in them.

Offered by Packt

Difficulty Level

Intermediate

Completion Time

13h48m

Language

English

About Book

Who Is This Book For?

This book is for data engineers and data enthusiasts seeking a comprehensive understanding of the data ingestion process using popular tools in the open source community. For more advanced learners, this book takes on the theoretical pillars of data governance while providing practical examples of real-world scenarios commonly encountered by data engineers.

Book content

12 chapters, 13h48m total length

Introduction to Data Ingestion

Principles of Data Access – Accessing Your Data

Data Discovery – Understanding Our Data Before Ingesting It

Reading CSV and JSON Files and Solving Problems

Ingesting Data from Structured and Unstructured Databases

Using PySpark with Defined and Non-Defined Schemas

Ingesting Analytical Data

Designing Monitored Data Workflows

Putting Everything Together with Airflow

Logging and Monitoring Your Data Ingest in Airflow

Automating Your Data Ingestion Pipelines

Using Data Observability for Debugging, Error Handling, and Preventing Downtime
