Job Description Generator
Developer-approved job descriptions that attract top talent. Select your hiring need below to get a customizable template.
Job Description for: Apache Airflow Experience
Customize the template below, then copy or download
# Data Engineer - Apache Airflow
Location: Austin, TX (Hybrid) · Employment Type: Full-time · Level: Mid-Senior
About [Company]
[Company] is building the analytics infrastructure for subscription businesses. Our platform helps 400+ SaaS companies understand revenue metrics, churn patterns, and customer lifetime value. We process 50TB of transaction data monthly, transforming raw payment events into actionable business intelligence.
We're a 52-person team backed by $35M in Series B funding from Bessemer and Index Ventures. Our data pipelines power dashboards used by finance teams and executives at companies ranging from seed-stage startups to public companies. When our pipelines are late, CFOs notice.
Why join [Company]?
- Build data infrastructure that directly impacts business decisions
- Work with a modern data stack (Airflow, Snowflake, dbt, Fivetran)
- Solve real orchestration challenges at scale (200+ DAGs, strict SLAs)
- Competitive compensation with meaningful equity
The Role
We're looking for a Data Engineer to own and evolve our Airflow-based data platform. You'll design DAGs that orchestrate complex transformation workflows, ensure data freshness for time-sensitive reports, and build the reliability patterns that let our customers trust their metrics.
You'll report to our Director of Data Engineering and work alongside 4 other data engineers, 2 analytics engineers, and 3 data analysts. Your focus will be on pipeline reliability, performance optimization, and scaling our orchestration platform as we grow.
The problem you'll solve:
Our Airflow platform runs 200+ DAGs processing data from 400+ customer accounts. As we've scaled, pipeline failures have increased to 4% of daily runs, and some critical DAGs take 3+ hours when they should complete in under an hour. We need someone to redesign our DAG architecture for reliability and performance.
What This Role IS
- DAG development and optimization—building efficient, reliable workflows for complex data transformations
- Pipeline architecture—designing patterns for dependency management, error handling, and retries
- Performance tuning—optimizing slow pipelines and reducing resource costs
- Monitoring and alerting—building observability into every pipeline
- Platform improvement—enhancing our Airflow deployment for better developer experience
- Cross-team collaboration—working with analysts and product to understand data needs
What This Role is NOT
- Analytics engineering—we have dedicated dbt developers for transformation logic
- Data science or ML—no model training or statistical analysis
- Pure infrastructure/DevOps—we use Cloud Composer (managed Airflow)
- Data visualization—our analysts handle dashboards and reporting
If you want to focus on orchestration, reliability, and pipeline architecture rather than SQL transformations or ML, this is the role.
Objectives of This Role
- Reduce DAG failure rate from 4% to under 1%
- Cut P95 execution time for critical DAGs from 3 hours to under 45 minutes
- Implement SLA monitoring that alerts before deadlines are missed (see the sketch after this list)
- Build custom operators for common integration patterns (reducing boilerplate by 40%)
- Establish DAG development standards and documentation for the team
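The SLA objective maps onto Airflow's built-in task SLAs: setting the task-level SLA tighter than the business deadline gives early warning. Below is a minimal sketch assuming Airflow 2.x; the DAG name and the print stand-in for the Slack/PagerDuty hand-off are illustrative, not our actual code.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.empty import EmptyOperator


def sla_miss_alert(dag, task_list, blocking_task_list, slas, blocking_tis):
    # A real callback would notify Slack or PagerDuty; print is a stand-in.
    print(f"SLA missed in {dag.dag_id}: {task_list}")


with DAG(
    dag_id="revenue_rollup",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    sla_miss_callback=sla_miss_alert,   # fires when any task misses its SLA
    catchup=False,
):
    # The SLA clock starts at the beginning of the scheduled data interval,
    # so 45 minutes here means "alert well before the hourly deadline".
    EmptyOperator(task_id="rollup", sla=timedelta(minutes=45))
```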
Responsibilities
- Design and implement complex DAGs using Apache Airflow best practices
- Build robust error handling with intelligent retries and graceful degradation (see the retry sketch after this list)
- Create custom operators and hooks for team-specific integration needs
- Optimize DAG performance through parallelization, resource tuning, and efficient scheduling
- Implement comprehensive monitoring using Datadog and custom alerting
- Participate in on-call rotation for data infrastructure (1 week every 6 weeks)
- Collaborate with analytics engineers on transformation orchestration
- Document DAG patterns, debugging procedures, and operational runbooks
- Review DAG code and mentor team members on Airflow best practices
- Evaluate new Airflow features and recommend adoption when appropriate
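To make the retry expectations concrete, here is a hedged sketch of the kind of configuration the error-handling bullet describes, assuming Airflow 2.x; the DAG, task, and callback names are hypothetical.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def page_on_final_failure(context):
    # Runs only after all retries are exhausted; a real version would
    # escalate through PagerDuty instead of printing.
    print(f"{context['task_instance'].task_id} failed permanently")


with DAG(
    dag_id="payments_ingest",           # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
):
    PythonOperator(
        task_id="load_payments",
        python_callable=lambda: None,    # stand-in for the real load step
        retries=3,
        retry_delay=timedelta(minutes=2),
        retry_exponential_backoff=True,  # roughly doubles the wait per attempt
        max_retry_delay=timedelta(minutes=30),
        on_failure_callback=page_on_final_failure,
    )
```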
Required Skills and Qualifications
Data Engineering Foundation (Required First):
- 3+ years of professional data engineering experience
- Strong Python skills with experience writing production-quality code
- Deep understanding of SQL and data modeling concepts
- Experience debugging complex distributed systems
- You think about failure modes, not just happy paths
Airflow Experience:
- 1+ years of hands-on Apache Airflow experience in production environments
- Understanding of DAG design patterns, dependencies, and scheduling
- Experience with error handling, retries, and alerting strategies
- Familiarity with custom operator development
Operational Mindset:
- Experience with monitoring and observability for data pipelines
- Comfortable with on-call responsibilities and incident response
- Ability to optimize for both reliability and cost
Preferred Skills and Qualifications
- Experience with managed Airflow (Cloud Composer, MWAA, Astronomer)
- Familiarity with dbt and transformation orchestration patterns
- Knowledge of data quality frameworks (Great Expectations, dbt tests)
- Experience with infrastructure as code (Terraform)
- Background in SaaS, fintech, or subscription business data
- Contributions to Airflow community (operators, documentation, issues)
Tech Stack
Orchestration:
- Apache Airflow 2.7 on Google Cloud Composer
- 200+ DAGs, 15,000+ task instances daily
- Custom operators for Snowflake, Fivetran, dbt (see the sketch at the end of this section)
Data Warehouse:
- Snowflake (Enterprise tier)
- 50TB raw data, 400+ customer accounts
- dbt for transformations (500+ models)
Data Integration:
- Fivetran for source data ingestion
- Custom Python scripts for proprietary sources
- Webhooks and event-driven triggers
Monitoring:
- Datadog for metrics and alerting
- Custom Slack integrations for DAG status
- PagerDuty for on-call escalation
Infrastructure:
- Google Cloud Platform (GCS, BigQuery, Pub/Sub)
- Terraform for infrastructure management
- GitHub Actions for CI/CD
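For a flavor of the custom operators mentioned above, here is a hedged sketch of a Snowflake freshness check, assuming Airflow 2.x and the apache-airflow-providers-snowflake package; the operator name and the loaded_at column are illustrative, not our actual code.

```python
from airflow.models.baseoperator import BaseOperator
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook


class SnowflakeFreshnessCheckOperator(BaseOperator):
    """Fail the task when a table's newest row is older than max_age_hours."""

    def __init__(self, *, table, max_age_hours=2,
                 snowflake_conn_id="snowflake_default", **kwargs):
        super().__init__(**kwargs)
        self.table = table
        self.max_age_hours = max_age_hours
        self.snowflake_conn_id = snowflake_conn_id

    def execute(self, context):
        hook = SnowflakeHook(snowflake_conn_id=self.snowflake_conn_id)
        # loaded_at is an assumed ingestion-timestamp column.
        age = hook.get_first(
            "SELECT DATEDIFF('hour', MAX(loaded_at), CURRENT_TIMESTAMP()) "
            f"FROM {self.table}"
        )[0]
        if age is None or age > self.max_age_hours:
            raise ValueError(f"{self.table} is stale ({age}h since last load)")
```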
Pipeline Metrics
- Daily DAG Runs: 2,500+
- Task Instances/Day: 15,000+
- Current Failure Rate: 4% (target: <1%)
- Critical DAG P95 Runtime: 3 hours (target: 45 minutes)
- SLA Compliance: 92% (target: 99%)
- Monthly Compute Costs: $28K (target: $22K)
Compensation and Benefits
Salary: $155,000 - $195,000 (based on experience)
Equity: 0.04% - 0.10% (4-year vest, 1-year cliff)
Benefits:
- Medical, dental, and vision insurance (100% covered for employees, 80% for dependents)
- Unlimited PTO, with a 15-day minimum encouraged
- $4,000 annual learning budget (conferences, courses, certifications)
- $2,000 home office setup allowance
- 401(k) with 4% company match
- 12 weeks paid parental leave
- Flexible hybrid work (2 days in Austin office)
Interview Process
Our interview process typically takes 2-3 weeks and focuses on real data engineering skills.
- Step 1: Application Review (3-5 days) — We review your resume and any relevant projects
- Step 2: Recruiter Screen (30 min) — Background, interests, and logistics
- Step 3: Technical Screen (45 min) — Airflow concepts, DAG design, past projects
- Step 4: Take-Home Exercise (2-3 hours) — Debug and improve a DAG with several issues
- Step 5: Exercise Review (45 min) — Discuss your approach, trade-offs, and alternatives
- Step 6: System Design (60 min) — Design orchestration for a realistic data pipeline scenario
- Step 7: Team Interviews (2 x 30 min) — Meet potential teammates
- Step 8: Hiring Manager (30 min) — Career goals and offer discussion
No LeetCode. We evaluate how you think about orchestration problems and build reliable systems.
How to Apply
Submit your resume. We'd especially love to see examples of DAGs you've built, blog posts about Airflow, or contributions to open-source data projects.
[Company] is an equal opportunity employer. We evaluate candidates based on skills and potential, not pedigree.
JD Tips
- Include specific metrics (DAG count, failure rates, SLAs) to show production reality
- Be honest about on-call expectations—data engineers expect it
- Accept Dagster/Prefect experience—orchestration concepts transfer
- Describe the take-home exercise clearly—debugging a DAG is realistic and respectful
- Mention your managed Airflow platform—it affects the day-to-day work significantly