# Kubernetes Engineer
**Location:** San Francisco, CA (Hybrid) or Remote (US) · **Employment Type:** Full-time · **Level:** Senior
## About [Company]
[Company] is a cloud infrastructure company powering the next generation of SaaS applications. Our platform runs over 2,000 production workloads across multiple Kubernetes clusters, serving Fortune 500 enterprises and high-growth startups who depend on our 99.99% uptime guarantee.
We manage container orchestration for 150+ customers, processing millions of container deployments monthly. Our engineering team has deep expertise in distributed systems, and we're building the infrastructure layer that lets other companies scale without managing their own Kubernetes complexity.
**Why join [Company]?**
- Build and operate production Kubernetes clusters at significant scale
- Join a 40-person infrastructure team with deep container orchestration expertise
- Series C funded ($80M from Andreessen Horowitz and GV)
- Remote-friendly with offices in SF, NYC, and Austin
## The Role
We're looking for a Kubernetes Engineer to join our Platform Engineering team. You'll own the full lifecycle of our Kubernetes infrastructure, from cluster provisioning and upgrades to building the internal developer platform that makes deployments seamless for our engineering teams.
The ideal candidate has hands-on experience managing production Kubernetes clusters, understands the internals (networking, storage, security), and enjoys building automation that makes infrastructure self-service. You'll work closely with SREs, application teams, and security engineers to ensure our clusters are reliable, secure, and cost-efficient.
## Objectives of This Role
- Maintain 99.99% availability across production Kubernetes clusters
- Reduce developer friction with self-service deployment tooling
- Implement GitOps workflows for infrastructure and application deployments
- Build observability and alerting for cluster health and application performance
- Drive cost optimization across our container infrastructure
## Responsibilities
- Manage and upgrade production EKS clusters across multiple AWS regions
- Build and maintain Kubernetes operators for custom resource management
- Implement GitOps workflows using ArgoCD for application and infrastructure deployments
- Design and enforce RBAC policies, Pod Security Standards, and network policies
- Build Helm charts and Kustomize overlays for standardized application deployments
- Configure and tune cluster autoscaling (Karpenter, Cluster Autoscaler)
- Set up monitoring, alerting, and dashboards using Prometheus and Grafana
- Debug complex production issues across networking, storage, and workload layers
- Participate in on-call rotation (1 week every 4 weeks) with follow-the-sun coverage
- Document runbooks, architecture decisions, and operational procedures
- Mentor engineers on Kubernetes best practices and troubleshooting
## Required Skills and Qualifications
- 4+ years of experience managing production Kubernetes clusters
- Deep understanding of Kubernetes internals: control plane, etcd, kubelet, kube-proxy
- Expert-level knowledge of Kubernetes networking (CNI, Services, Ingress, Network Policies)
- Strong experience with Helm, Kustomize, and manifest management
- Hands-on experience with GitOps tools (ArgoCD or Flux)
- Proficiency in at least one programming or scripting language (Go, Python, or Bash) for automation
- Experience with Infrastructure as Code (Terraform, Pulumi, or CloudFormation)
- Solid understanding of container runtimes (containerd) and image management
- Experience with monitoring and observability (Prometheus, Grafana, Datadog)
- Comfortable with on-call responsibilities and incident response
## Preferred Skills and Qualifications
- Experience building or maintaining Kubernetes operators
- CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer) certification
- Hands-on experience with a service mesh (Istio, Linkerd, or Cilium Service Mesh)
- Background in multi-cluster management and federation
- Experience with Kubernetes security tools (Falco, OPA/Gatekeeper, Trivy)
- Familiarity with eBPF-based networking and observability
- Experience migrating workloads to Kubernetes from legacy infrastructure
- Contributions to CNCF projects or Kubernetes ecosystem tools
## Tech Stack
- **Kubernetes:** EKS (primary), self-managed clusters for specific workloads
- **Container Runtime:** containerd
- **GitOps:** ArgoCD, Kustomize, Helm
- **Networking:** Cilium CNI, AWS Load Balancer Controller, external-dns
- **Storage:** EBS CSI driver, EFS for shared storage
- **Observability:** Prometheus, Grafana, Loki, Jaeger, Datadog
- **Security:** OPA Gatekeeper, Falco, Trivy, AWS IAM Roles for Service Accounts
- **IaC:** Terraform, Terragrunt
- **CI/CD:** GitHub Actions, ArgoCD ApplicationSets
- **Cloud:** AWS (EKS, EC2, VPC, IAM, S3, ECR)
## Compensation and Benefits
**Salary:** $160,000 - $200,000 (based on experience and location)
**Equity:** 0.05% - 0.15% (4-year vest, 1-year cliff)
**Benefits:**
- Medical, dental, and vision insurance (100% covered for employees, 75% for dependents)
- Unlimited PTO, with a 15-day minimum encouraged
- $3,000 annual learning and development budget (conferences, certifications, courses)
- $2,000 home office setup allowance
- 401(k) with 4% company match
- 16 weeks paid parental leave
- On-call compensation: $500/week when primary on-call
- Flexible hybrid work (2 days in SF office, 3 remote) or fully remote for US-based candidates
## Interview Process
Our interview process typically takes 2-3 weeks. We provide feedback at every stage.
- **Step 1: Recruiter Screen** (30 min) - We'll discuss your background and Kubernetes experience, and answer your questions about the role.
- **Step 2: Technical Screen** (60 min) - A conversation about Kubernetes architecture, production experience, and troubleshooting approaches.
- **Step 3: System Design** (60 min) - Design a multi-tenant Kubernetes platform or similar infrastructure challenge with our engineers.
- **Step 4: Hands-On Session** (90 min) - Debug a broken deployment scenario and implement a small operator or Helm chart. Real-world Kubernetes work, not algorithm puzzles.
- **Step 5: Team Interviews** (2 x 30 min) - Meet potential teammates and discuss collaboration, on-call, and team culture.
- **Step 6: Hiring Manager** (30 min) - Discuss career goals and growth opportunities.
If any stage involves a take-home exercise that exceeds 3 hours, we compensate candidates $200.
## How to Apply
Submit your resume and optionally include links to your GitHub, any Kubernetes-related blog posts, or open-source contributions. If you've contributed to CNCF projects or built operators, we'd love to see them. We review every application and respond within 5 business days.
---
*[Company] is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We encourage applications from candidates who may not meet 100% of the qualifications—research shows underrepresented groups are less likely to apply unless they meet every requirement.*