Job Description Generator
Developer-approved job descriptions that attract top talent. Select your hiring need below to get a customizable template.
Job Description for: Apache Airflow Experience
Customize the template below, then copy or download
# Data Engineer - Apache Airflow
Location: Austin, TX (Hybrid) · Employment Type: Full-time · Level: Mid-Senior
About [Company]
[Company] is building the analytics infrastructure for subscription businesses. Our platform helps 400+ SaaS companies understand revenue metrics, churn patterns, and customer lifetime value. We process 50TB of transaction data monthly, transforming raw payment events into actionable business intelligence.
We're a 52-person team backed by $35M in Series B funding from Bessemer and Index Ventures. Our data pipelines power dashboards used by finance teams and executives at companies ranging from seed-stage startups to public companies. When our pipelines are late, CFOs notice.
Why join [Company]?
- Build data infrastructure that directly impacts business decisions
- Work with a modern data stack (Airflow, Snowflake, dbt, Fivetran)
- Solve real orchestration challenges at scale (200+ DAGs, strict SLAs)
- Competitive compensation with meaningful equity
The Role
We're looking for a Data Engineer to own and evolve our Airflow-based data platform. You'll design DAGs that orchestrate complex transformation workflows, ensure data freshness for time-sensitive reports, and build the reliability patterns that let our customers trust their metrics.
You'll report to our Director of Data Engineering and work alongside 4 other data engineers, 2 analytics engineers, and 3 data analysts. Your focus will be on pipeline reliability, performance optimization, and scaling our orchestration platform as we grow.
The problem you'll solve:
Our Airflow platform runs 200+ DAGs processing data from 400+ customer accounts. As we've scaled, pipeline failures have increased to 4% of daily runs, and some critical DAGs take 3+ hours when they should complete in under an hour. We need someone to redesign our DAG architecture for reliability and performance.
What This Role IS
- DAG development and optimization—building efficient, reliable workflows for complex data transformations
- Pipeline architecture—designing patterns for dependency management, error handling, and retries
- Performance tuning—optimizing slow pipelines and reducing resource costs
- Monitoring and alerting—building observability into every pipeline
- Platform improvement—enhancing our Airflow deployment for better developer experience
- Cross-team collaboration—working with analysts and product to understand data needs
What This Role is NOT
- Analytics engineering—we have dedicated dbt developers for transformation logic
- Data science or ML—no model training or statistical analysis
- Pure infrastructure/DevOps—we use Cloud Composer (managed Airflow)
- Data visualization—our analysts handle dashboards and reporting
If you want to focus on orchestration, reliability, and pipeline architecture rather than SQL transformations or ML, this is the role.
Objectives of This Role
- Reduce DAG failure rate from 4% to under 1%
- Cut P95 execution time for critical DAGs from 3 hours to under 45 minutes
- Implement SLA monitoring that alerts before deadlines are missed (see the sketch after this list)
- Build custom operators for common integration patterns (reducing boilerplate by 40%)
- Establish DAG development standards and documentation for the team
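The SLA objective maps onto Airflow's built-in task SLAs: setting the task-level SLA tighter than the business deadline gives early warning. Below is a minimal sketch assuming Airflow 2.x; the DAG name and the print stand-in for the Slack/PagerDuty hand-off are illustrative, not our actual code.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.empty import EmptyOperator


def sla_miss_alert(dag, task_list, blocking_task_list, slas, blocking_tis):
    # A real callback would notify Slack or PagerDuty; print is a stand-in.
    print(f"SLA missed in {dag.dag_id}: {task_list}")


with DAG(
    dag_id="revenue_rollup",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    sla_miss_callback=sla_miss_alert,   # fires when any task misses its SLA
    catchup=False,
):
    # The SLA clock starts at the beginning of the scheduled data interval,
    # so 45 minutes here means "alert well before the hourly deadline".
    EmptyOperator(task_id="rollup", sla=timedelta(minutes=45))
```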
Responsibilities
- Design and implement complex DAGs using Apache Airflow best practices
- Build robust error handling with intelligent retries and graceful degradation (see the retry sketch after this list)
- Create custom operators and hooks for team-specific integration needs
- Optimize DAG performance through parallelization, resource tuning, and efficient scheduling
- Implement comprehensive monitoring using Datadog and custom alerting
- Participate in on-call rotation for data infrastructure (1 week every 6 weeks)
- Collaborate with analytics engineers on transformation orchestration
- Document DAG patterns, debugging procedures, and operational runbooks
- Review DAG code and mentor team members on Airflow best practices
- Evaluate new Airflow features and recommend adoption when appropriate
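To make the retry expectations concrete, here is a hedged sketch of the kind of configuration the error-handling bullet describes, assuming Airflow 2.x; the DAG, task, and callback names are hypothetical.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def page_on_final_failure(context):
    # Runs only after all retries are exhausted; a real version would
    # escalate through PagerDuty instead of printing.
    print(f"{context['task_instance'].task_id} failed permanently")


with DAG(
    dag_id="payments_ingest",           # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
):
    PythonOperator(
        task_id="load_payments",
        python_callable=lambda: None,    # stand-in for the real load step
        retries=3,
        retry_delay=timedelta(minutes=2),
        retry_exponential_backoff=True,  # roughly doubles the wait per attempt
        max_retry_delay=timedelta(minutes=30),
        on_failure_callback=page_on_final_failure,
    )
```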
Required Skills and Qualifications
Data Engineering Foundation (Required First):
- 3+ years of professional data engineering experience
- Strong Python skills with experience writing production-quality code
- Deep understanding of SQL and data modeling concepts
- Experience debugging complex distributed systems
- You think about failure modes, not just happy paths
Airflow Experience:
- 1+ years of hands-on Apache Airflow experience in production environments
- Understanding of DAG design patterns, dependencies, and scheduling
- Experience with error handling, retries, and alerting strategies
- Familiarity with custom operator development
Operational Mindset:
- Experience with monitoring and observability for data pipelines
- Comfortable with on-call responsibilities and incident response
- Ability to optimize for both reliability and cost
Preferred Skills and Qualifications
- Experience with managed Airflow (Cloud Composer, MWAA, Astronomer)
- Familiarity with dbt and transformation orchestration patterns
- Knowledge of data quality frameworks (Great Expectations, dbt tests)
- Experience with infrastructure as code (Terraform)
- Background in SaaS, fintech, or subscription business data
- Contributions to Airflow community (operators, documentation, issues)
Tech Stack
Orchestration:
- Apache Airflow 2.7 on Google Cloud Composer
- 200+ DAGs, 15,000+ task instances daily
- Custom operators for Snowflake, Fivetran, dbt (see the sketch at the end of this section)
Data Warehouse:
- Snowflake (Enterprise tier)
- 50TB raw data, 400+ customer accounts
- dbt for transformations (500+ models)
Data Integration:
- Fivetran for source data ingestion
- Custom Python scripts for proprietary sources
- Webhooks and event-driven triggers
Monitoring:
- Datadog for metrics and alerting
- Custom Slack integrations for DAG status
- PagerDuty for on-call escalation
Infrastructure:
- Google Cloud Platform (GCS, BigQuery, Pub/Sub)
- Terraform for infrastructure management
- GitHub Actions for CI/CD
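For a flavor of the custom operators mentioned above, here is a hedged sketch of a Snowflake freshness check, assuming Airflow 2.x and the apache-airflow-providers-snowflake package; the operator name and the loaded_at column are illustrative, not our actual code.

```python
from airflow.models.baseoperator import BaseOperator
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook


class SnowflakeFreshnessCheckOperator(BaseOperator):
    """Fail the task when a table's newest row is older than max_age_hours."""

    def __init__(self, *, table, max_age_hours=2,
                 snowflake_conn_id="snowflake_default", **kwargs):
        super().__init__(**kwargs)
        self.table = table
        self.max_age_hours = max_age_hours
        self.snowflake_conn_id = snowflake_conn_id

    def execute(self, context):
        hook = SnowflakeHook(snowflake_conn_id=self.snowflake_conn_id)
        # loaded_at is an assumed ingestion-timestamp column.
        age = hook.get_first(
            "SELECT DATEDIFF('hour', MAX(loaded_at), CURRENT_TIMESTAMP()) "
            f"FROM {self.table}"
        )[0]
        if age is None or age > self.max_age_hours:
            raise ValueError(f"{self.table} is stale ({age}h since last load)")
```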
Pipeline Metrics
- Daily DAG Runs: 2,500+
- Task Instances/Day: 15,000+
- Current Failure Rate: 4% (target: <1%)
- Critical DAG P95 Runtime: 3 hours (target: 45 minutes)
- SLA Compliance: 92% (target: 99%)
- Monthly Compute Costs: $28K (target: $22K)
Compensation and Benefits
Salary: $155,000 - $195,000 (based on experience)
Equity: 0.04% - 0.10% (4-year vest, 1-year cliff)
Benefits:
- Medical, dental, and vision insurance (100% covered for employees, 80% for dependents)
- Unlimited PTO, with a 15-day minimum encouraged
- $4,000 annual learning budget (conferences, courses, certifications)
- $2,000 home office setup allowance
- 401(k) with 4% company match
- 12 weeks paid parental leave
- Flexible hybrid work (2 days in Austin office)
Interview Process
Our interview process typically takes 2-3 weeks and focuses on real data engineering skills.
- Step 1: Application Review (3-5 days) — We review your resume and any relevant projects
- Step 2: Recruiter Screen (30 min) — Background, interests, and logistics
- Step 3: Technical Screen (45 min) — Airflow concepts, DAG design, past projects
- Step 4: Take-Home Exercise (2-3 hours) — Debug and improve a DAG with several issues
- Step 5: Exercise Review (45 min) — Discuss your approach, trade-offs, and alternatives
- Step 6: System Design (60 min) — Design orchestration for a realistic data pipeline scenario
- Step 7: Team Interviews (2 x 30 min) — Meet potential teammates
- Step 8: Hiring Manager (30 min) — Career goals and offer discussion
No LeetCode. We evaluate how you think about orchestration problems and build reliable systems.
How to Apply
Submit your resume. We'd especially love to see examples of DAGs you've built, blog posts about Airflow, or contributions to open-source data projects.
[Company] is an equal opportunity employer. We evaluate candidates based on skills and potential, not pedigree.
JD Tips
- Include specific metrics (DAG count, failure rates, SLAs) to show production reality
- Be honest about on-call expectations—data engineers expect it
- Accept Dagster/Prefect experience—orchestration concepts transfer
- Describe the take-home exercise clearly—debugging a DAG is realistic and respectful
- Mention your managed Airflow platform—it affects the day-to-day work significantly