Day 29/30 SQL, Python, ETL, Data Modelling Challenge FREE Solutions
April 3rd CHALLENGE – unlock solutions + reasoning
👋 Hey Data Engineers!
You’ve made it to Day 29 – and today’s lineup is a strong one. We’re diving into:
Recursive CTEs in SQL
Python’s zip() for iterable pairing
Task dependencies in Airflow
Type 2 Slowly Changing Dimensions in Data Warehousing
SQL Challenge – Recursive CTEs
Challenge Recap:
❓ What is a key use case for a recursive CTE in SQL?
✅ Answer: C - Generating a sequence or hierarchy
Explanation:
Recursive CTEs allow SQL queries to loop through hierarchical data structures like employee-manager relationships or category trees. They include a base case and a recursive clause to build up results row-by-row.
Best Practices:
Always define a clear base condition
Use a
LEVELorDEPTHcolumn to track recursionLimit recursion to avoid infinite loops
Python Challenge – zip() Function
Challenge Recap:
a = [1, 2, 3]
b = ['x', 'y', 'z']
print(list(zip(a, b)))✅ Answer: A - [(1, 'x'), (2, 'y'), (3, 'z')]
Explanation:
The zip() function pairs items from multiple iterables into tuples. It’s especially useful when merging values from two separate lists into one structured format.
Best Practices:
Use
zip()to merge lists of equal lengthConvert zipped results to dicts with
dict(zip(keys, values))Use
itertools.zip_longest()if lists are uneven
ETL Challenge – Task Dependencies in Airflow
Challenge Recap:
❓ In Apache Airflow, which operator sets task execution order?
✅ Answer: C - >> or set_downstream()
Explanation:
In Airflow, >> and << are symbolic operators used to define task execution order. They form the dependency structure that determines how tasks run within a DAG.
Best Practices:
Use
>>and<<for clean and readable DAGsAvoid circular dependencies
Use TaskGroups to organize complex DAGs
Data Modeling Challenge – SCD Type 2
Challenge Recap:
❓ Which SCD type maintains both current and historical data?
✅ Answer: C - Type 2
Explanation:
SCD Type 2 keeps a full history of data changes by creating a new record for each update, typically using effective_from and effective_to date columns.
Best Practices:
Use surrogate keys (not business keys) as primary keys
Ensure only one current record has
effective_to IS NULLUse SCD2 when tracking slowly changing attributes like location, title, etc.
📌 Want access to full deep dives, live runnable code, interview patterns, and bonus mock prep?
💬 Drop your score in the comments. Leaderboard ends tomorrow!

