Book Content
12 chapters • 13h48m total length
1. Introduction to Data Ingestion
2. Principles of Data Access – Accessing Your Data
3. Data Discovery – Understanding Our Data Before Ingesting It
4. Reading CSV and JSON Files and Solving Problems
5. Ingesting Data from Structured and Unstructured Databases
6. Using PySpark with Defined and Non-Defined Schemas
7. Ingesting Analytical Data
8. Designing Monitored Data Workflows
9. Putting Everything Together with Airflow
10. Logging and Monitoring Your Data Ingest in Airflow
11. Automating Your Data Ingestion Pipelines
12. Using Data Observability for Debugging, Error Handling, and Preventing Downtime