Zero2Dataengineer

Zero to DE Daily Lessons

Scheduling vs Triggering

How Workflows Actually Run in Production

Avantikka_Penumarty
May 22, 2025

Most people set schedule_interval='@daily' and move on.
But in production, nothing is that simple.

Data arrives late.
APIs fail.
Files drop into S3 at random.
And your pipeline has to wait, trigger, or backfill — not just run on a timer.

Today, we’re digging into how scheduling really works — and how you should answer when interviewers ask:

“How do you schedule and trigger your Airflow DAGs?”


First: Understand the Two Types of Runs

1. Scheduled Runs
   You tell Airflow to run a DAG on a fixed interval (see the first sketch after this list):

   • Every hour, day, week, etc.

   • Use cases: batch ETL, daily reporting, metrics updates

2. Triggered Runs
   Airflow runs the DAG when something happens (see the second sketch after this list):

   • A file lands in S3

   • An upstream DAG completes

   • An API returns a signal
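
Here is a minimal sketch of a scheduled run, assuming Airflow 2.x; the DAG id, task id, and callable are made up for illustration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def build_daily_report():
    # Stand-in for the real ETL logic (hypothetical).
    print("Building the daily report...")


# Scheduled run: the scheduler creates one DAG run per interval,
# after that interval closes.
with DAG(
    dag_id="daily_report",            # hypothetical DAG id
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",       # renamed to `schedule` in Airflow 2.4+
    catchup=False,                    # don't backfill old intervals on first deploy
) as dag:
    PythonOperator(
        task_id="build_report",
        python_callable=build_daily_report,
    )
```

Note that catchup is what decides whether Airflow backfills every missed interval since start_date; leaving it at the default True is a classic source of surprise runs on a freshly deployed DAG.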
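
For triggered runs, one common pattern is a sensor that holds the run until the event actually happens. A minimal sketch for the "file lands in S3" case, assuming the apache-airflow-providers-amazon package is installed; the bucket, key, and DAG id are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor


def process_file():
    # Stand-in for the real processing logic (hypothetical).
    print("Processing the landed file...")


with DAG(
    dag_id="s3_event_pipeline",       # hypothetical DAG id
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # The run starts on schedule but blocks here until the file exists.
    wait_for_file = S3KeySensor(
        task_id="wait_for_file",
        bucket_name="my-landing-bucket",          # hypothetical bucket
        bucket_key="exports/{{ ds }}/data.csv",   # key templated per run date
        poke_interval=300,      # check every 5 minutes
        timeout=6 * 60 * 60,    # fail the sensor after 6 hours of waiting
        mode="reschedule",      # release the worker slot between checks
    )

    process = PythonOperator(
        task_id="process_file",
        python_callable=process_file,
    )

    wait_for_file >> process
```

For the "upstream DAG completes" case, TriggerDagRunOperator (push from upstream) or ExternalTaskSensor (pull from downstream) plays the same role; Thursday's bonus post digs into those patterns.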


Elite Bonus Drop Coming:
This Thursday, I’ll walk through Sensors, ExternalTask dependency patterns, and DAG chaining in a real multi-DAG system.

This is where your interview answers start sounding like a Staff Data Engineer.
