
Job Description Generator

Developer-approved job descriptions that attract top talent. Select your hiring need below to get a customizable template.

Job Description for: Apache Airflow Experience

Customize the template below, then copy or download it.

Replace [PLACEHOLDERS] with your company's details.
[Company]

# Data Engineer - Apache Airflow

Location: Austin, TX (Hybrid) · Employment Type: Full-time · Level: Mid-Senior

About [Company]

[Company] is building the analytics infrastructure for subscription businesses. Our platform helps 400+ SaaS companies understand revenue metrics, churn patterns, and customer lifetime value. We process 50TB of transaction data monthly, transforming raw payment events into actionable business intelligence.

We're a 52-person team backed by $35M in Series B funding from Bessemer and Index Ventures. Our data pipelines power dashboards used by finance teams and executives at companies ranging from seed-stage startups to public companies. When our pipelines are late, CFOs notice.

Why join [Company]?

  • Build data infrastructure that directly impacts business decisions
  • Work with modern data stack (Airflow, Snowflake, dbt, Fivetran)
  • Solve real orchestration challenges at scale (200+ DAGs, strict SLAs)
  • Competitive compensation with meaningful equity

The Role

We're looking for a Data Engineer to own and evolve our Airflow-based data platform. You'll design DAGs that orchestrate complex transformation workflows, ensure data freshness for time-sensitive reports, and build the reliability patterns that let our customers trust their metrics.

You'll report to our Director of Data Engineering and work alongside 4 other data engineers, 2 analytics engineers, and 3 data analysts. Your focus will be on pipeline reliability, performance optimization, and scaling our orchestration platform as we grow.

The problem you'll solve:

Our Airflow platform runs 200+ DAGs processing data from 400+ customer accounts. As we've scaled, pipeline failures have increased to 4% of daily runs, and some critical DAGs take 3+ hours when they should complete in under one. We need someone to redesign our DAG architecture for reliability and performance.

What This Role IS

  • DAG development and optimization—building efficient, reliable workflows for complex data transformations
  • Pipeline architecture—designing patterns for dependency management, error handling, and retries
  • Performance tuning—optimizing slow pipelines and reducing resource costs
  • Monitoring and alerting—building observability into every pipeline
  • Platform improvement—enhancing our Airflow deployment for better developer experience
  • Cross-team collaboration—working with analysts and product to understand data needs

What This Role Is NOT

  • Analytics engineering—we have dedicated dbt developers for transformation logic
  • Data science or ML—no model training or statistical analysis
  • Pure infrastructure/DevOps—we use Cloud Composer (managed Airflow)
  • Data visualization—our analysts handle dashboards and reporting

*If you want to focus on orchestration, reliability, and pipeline architecture rather than SQL transformations or ML, this is the role.*

Objectives of This Role

  • Reduce DAG failure rate from 4% to under 1%
  • Cut P95 execution time for critical DAGs from 3 hours to under 45 minutes
  • Implement SLA monitoring that alerts before deadlines are missed (see the sketch after this list)
  • Build custom operators for common integration patterns (reducing boilerplate by 40%)
  • Establish DAG development standards and documentation for the team
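
To make the SLA objective concrete, here is a minimal, illustrative sketch of Airflow's SLA hooks (the DAG, schedule, and task names are placeholders, not our actual pipelines; a production callback would page through PagerDuty or post to Slack rather than log):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator


def notify_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    # Called by the scheduler when any task in this DAG misses its SLA.
    # A real implementation would open a PagerDuty incident or post to Slack.
    print(f"SLA missed in {dag.dag_id}: {task_list}")


with DAG(
    dag_id="revenue_rollup",              # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",                 # daily at 02:00 UTC
    catchup=False,
    sla_miss_callback=notify_sla_miss,
) as dag:
    BashOperator(
        task_id="load_payments",
        bash_command="echo 'load payments'",
        # Alert if this task has not finished within 45 minutes of the scheduled run.
        sla=timedelta(minutes=45),
    )
```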

Responsibilities

  • Design and implement complex DAGs using Apache Airflow best practices
  • Build robust error handling with intelligent retries and graceful degradation (illustrated in the sketch after this list)
  • Create custom operators and hooks for team-specific integration needs
  • Optimize DAG performance through parallelization, resource tuning, and efficient scheduling
  • Implement comprehensive monitoring using Datadog and custom alerting
  • Participate in on-call rotation for data infrastructure (1 week every 6 weeks)
  • Collaborate with analytics engineers on transformation orchestration
  • Document DAG patterns, debugging procedures, and operational runbooks
  • Review DAG code and mentor team members on Airflow best practices
  • Evaluate new Airflow features and recommend adoption when appropriate
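
As a rough illustration of the error-handling work above, a minimal retry-configuration sketch for Airflow 2.x (the DAG name and callable are placeholders; real retry budgets and callbacks depend on the pipeline):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Retry transient failures with exponential backoff before paging anyone.
default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "retry_exponential_backoff": True,
    "max_retry_delay": timedelta(minutes=30),
}


def alert_on_final_failure(context):
    # Runs only after all retries are exhausted; a real version would
    # escalate via PagerDuty and include run context in the alert.
    print(f"Task {context['task_instance'].task_id} failed after all retries")


def extract_accounts():
    # Placeholder for the actual extraction logic.
    pass


with DAG(
    dag_id="customer_extract",            # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
    default_args=default_args,
) as dag:
    PythonOperator(
        task_id="extract_accounts",
        python_callable=extract_accounts,
        on_failure_callback=alert_on_final_failure,
    )
```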

Required Skills and Qualifications

Data Engineering Foundation (Required First):

  • 3+ years of professional data engineering experience
  • Strong Python skills with experience writing production-quality code
  • Deep understanding of SQL and data modeling concepts
  • Experience debugging complex distributed systems
  • You think about failure modes, not just happy paths

Airflow Experience:

  • 1+ years of hands-on Apache Airflow experience in production environments
  • Understanding of DAG design patterns, dependencies, and scheduling
  • Experience with error handling, retries, and alerting strategies
  • Familiarity with custom operator development

Operational Mindset:

  • Experience with monitoring and observability for data pipelines
  • Comfortable with on-call responsibilities and incident response
  • Ability to optimize for both reliability and cost

Preferred Skills and Qualifications

  • Experience with managed Airflow (Cloud Composer, MWAA, Astronomer)
  • Familiarity with dbt and transformation orchestration patterns
  • Knowledge of data quality frameworks (Great Expectations, dbt tests)
  • Experience with infrastructure as code (Terraform)
  • Background in SaaS, fintech, or subscription business data
  • Contributions to Airflow community (operators, documentation, issues)

Tech Stack

Orchestration:

  • Apache Airflow 2.7 on Google Cloud Composer
  • 200+ DAGs, 15,000+ task instances daily
  • Custom operators for Snowflake, Fivetran, dbt (see the illustrative operator sketch after this section)

Data Warehouse:

  • Snowflake (Enterprise tier)
  • 50TB raw data, 400+ customer accounts
  • dbt for transformations (500+ models)

Data Integration:

  • Fivetran for source data ingestion
  • Custom Python scripts for proprietary sources
  • Webhooks and event-driven triggers

Monitoring:

  • Datadog for metrics and alerting
  • Custom Slack integrations for DAG status
  • PagerDuty for on-call escalation

Infrastructure:

  • Google Cloud Platform (GCS, BigQuery, Pub/Sub)
  • Terraform for infrastructure management
  • GitHub Actions for CI/CD
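
The custom operators listed under Orchestration follow the standard BaseOperator pattern. A hedged, self-contained sketch of that pattern (this Fivetran wrapper is hypothetical; the real operators call vendor APIs through hooks and poll for completion):

```python
from airflow.models.baseoperator import BaseOperator


class FivetranSyncOperator(BaseOperator):
    """Hypothetical operator that triggers a Fivetran connector sync.

    Shown only to illustrate the BaseOperator pattern; production
    operators would call the Fivetran REST API through a hook and
    poll for completion instead of just logging.
    """

    # Allow the connector id to be Jinja-templated in DAG definitions.
    template_fields = ("connector_id",)

    def __init__(self, connector_id: str, **kwargs):
        super().__init__(**kwargs)
        self.connector_id = connector_id

    def execute(self, context):
        self.log.info("Triggering sync for connector %s", self.connector_id)
        # Real logic: POST to the Fivetran API, then poll until the sync finishes.
```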

Pipeline Metrics

  • Daily DAG Runs: 2,500+
  • Task Instances/Day: 15,000+
  • Current Failure Rate: 4% (target: <1%)
  • Critical DAG P95 Runtime: 3 hours (target: 45 minutes)
  • SLA Compliance: 92% (target: 99%)
  • Monthly Compute Costs: $28K (target: $22K)

Compensation and Benefits

Salary: $155,000 - $195,000 (based on experience)

Equity: 0.04% - 0.10% (4-year vest, 1-year cliff)

Benefits:

  • Medical, dental, and vision insurance (100% of premiums covered for employees, 80% for dependents)
  • Unlimited PTO with an encouraged 15-day minimum
  • $4,000 annual learning budget (conferences, courses, certifications)
  • $2,000 home office setup allowance
  • 401(k) with 4% company match
  • 12 weeks paid parental leave
  • Flexible hybrid work (2 days in Austin office)

Interview Process

Our interview process typically takes 2-3 weeks and focuses on real data engineering skills.

  • Step 1: Application Review (3-5 days) — We review your resume and any relevant projects
  • Step 2: Recruiter Screen (30 min) — Background, interests, and logistics
  • Step 3: Technical Screen (45 min) — Airflow concepts, DAG design, past projects
  • Step 4: Take-Home Exercise (2-3 hours) — Debug and improve a DAG with several issues
  • Step 5: Exercise Review (45 min) — Discuss your approach, trade-offs, and alternatives
  • Step 6: System Design (60 min) — Design orchestration for a realistic data pipeline scenario
  • Step 7: Team Interviews (2 x 30 min) — Meet potential teammates
  • Step 8: Hiring Manager (30 min) — Career goals and offer discussion

No LeetCode. We evaluate how you think about orchestration problems and build reliable systems.

How to Apply

Submit your resume. We'd especially love to see examples of DAGs you've built, blog posts about Airflow, or contributions to open-source data projects.

[Company] is an equal opportunity employer. We evaluate candidates based on skills and potential, not pedigree.

JD Tips

  • Include specific metrics (DAG count, failure rates, SLAs) to show production reality
  • Be honest about on-call expectations—data engineers expect it
  • Accept Dagster/Prefect experience—orchestration concepts transfer
  • Describe the take-home exercise clearly—debugging a DAG is realistic and respectful
  • Mention your managed Airflow platform—it affects the day-to-day work significantly