In this talk, I’ll walk you through the tricks and best practices to take your data pipeline game to the next level. No boring theory here: we’ll be talking about real-world use cases.
We’ll explore patterns for data pipelines with Airflow+Spark, Airflow+DBT, and Airflow+Polars, how to avoid dependency management in Airflow, and how to reuse DAG templates across our organization.
We’ll define the fundamental concepts of a data pipeline (data lineage, data observability, metadata, data quality, and data auditing) and how to integrate them into a pipeline.
We’ll see how to write clean code in our data pipelines using the Factory design pattern with spark-submit, Airflow, and KubernetesPodOperator.
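As a taste of that idea, here is a minimal sketch of a factory that turns one job description into either a spark-submit command or a set of keyword arguments that could be passed to a KubernetesPodOperator. It is an illustration under assumptions, not the code from the talk: `JobConfig`, `TaskBuilder`, `make_builder`, the returned dictionary layout, and the image name are all hypothetical, and the sketch deliberately avoids importing Airflow so it runs on its own.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class JobConfig:
    """Hypothetical job description shared by every runner."""
    name: str
    app: str                                  # path to the Spark application
    args: list[str] = field(default_factory=list)
    image: str = "example/spark:latest"       # placeholder image, an assumption


class TaskBuilder(ABC):
    """Common interface: each builder turns a JobConfig into a task spec."""

    @abstractmethod
    def build(self, cfg: JobConfig) -> dict:
        ...


class SparkSubmitBuilder(TaskBuilder):
    def build(self, cfg: JobConfig) -> dict:
        # Plain spark-submit invocation, e.g. for a shell-based task.
        command = ["spark-submit", "--name", cfg.name, cfg.app, *cfg.args]
        return {"kind": "spark_submit", "task_id": cfg.name, "command": command}


class KubernetesPodBuilder(TaskBuilder):
    def build(self, cfg: JobConfig) -> dict:
        # Keyword arguments one could hand to a KubernetesPodOperator.
        kwargs = {
            "name": cfg.name,
            "image": cfg.image,
            "cmds": ["spark-submit"],
            "arguments": [cfg.app, *cfg.args],
        }
        return {"kind": "kubernetes_pod", "task_id": cfg.name, "kwargs": kwargs}


# Registry that the factory function looks up; new runners are added here.
_BUILDERS: dict[str, type[TaskBuilder]] = {
    "spark_submit": SparkSubmitBuilder,
    "kubernetes_pod": KubernetesPodBuilder,
}


def make_builder(kind: str) -> TaskBuilder:
    """Factory: return the builder registered for the given runner kind."""
    try:
        return _BUILDERS[kind]()
    except KeyError:
        raise ValueError(f"unknown runner kind: {kind!r}") from None
```

The DAG code then only calls `make_builder(kind).build(cfg)` and never branches on the runner type, so adding a new runner means registering one more builder class.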
Finally, we’ll discover alternatives to Airflow for our data architecture, such as Dagster and Mage.