What Cloud Engineers Actually Do
A Day in the Life
Cloud Engineering spans architecture, operations, security, and cost optimization. The role emerged as companies moved infrastructure from data centers to cloud providers and discovered that cloud expertise requires specialized skills beyond traditional operations.
Cloud Architecture and Design
Cloud Engineers design the foundation for applications and services:
- Solution architecture — Designing cloud infrastructure that meets application requirements
- Service selection — Choosing between managed services (RDS, DynamoDB) vs. self-hosted (EC2 + MySQL)
- Multi-region and multi-AZ — Designing for high availability and disaster recovery
- Serverless architecture — Lambda, Cloud Functions, Azure Functions patterns
- Container infrastructure — EKS, GKE, AKS for Kubernetes workloads
- Data architecture — S3/GCS for object storage, data lakes, analytics pipelines
- Networking design — VPCs, subnets, peering, Transit Gateway, Direct Connect
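To make the networking and multi-AZ items above concrete, here is a minimal sketch of one common planning step: carving a VPC address range into a public and a private subnet per availability zone. The CIDR block, the zone names, and the `plan_subnets` helper are illustrative assumptions, not a provider API; real designs also reserve space for future growth.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, zones: list[str], new_prefix: int) -> dict[str, dict[str, str]]:
    """Carve one public and one private subnet per zone out of a VPC CIDR block."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks in address order.
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    needed = 2 * len(zones)
    if needed > len(blocks):
        raise ValueError(
            f"{vpc_cidr} only holds {len(blocks)} /{new_prefix} blocks, need {needed}"
        )
    plan = {}
    for i, zone in enumerate(zones):
        plan[zone] = {"public": str(blocks[2 * i]), "private": str(blocks[2 * i + 1])}
    return plan


# Hypothetical VPC range and zone names, for illustration only.
plan = plan_subnets("10.0.0.0/16", ["zone-a", "zone-b", "zone-c"], 20)
for zone, subnets in plan.items():
    print(zone, subnets)
```

Spreading the same subnet layout across three zones is what lets a load balancer or database fail over when one zone is unavailable.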
Cloud Migration
Moving workloads to the cloud is a core Cloud Engineer responsibility:
- Assessment and planning — Analyzing workloads, dependencies, and migration strategies
- Lift-and-shift — Moving applications to the cloud with minimal changes (rehosting)
- Replatforming — Adapting applications to use some cloud-native services
- Refactoring — Rebuilding applications around cloud-native patterns (serverless, containers)
- Data migration — Moving databases, files, and data warehouses to the cloud
- Cutover planning — Minimizing downtime during migration with rollback strategies
Infrastructure as Code (IaC)
Modern Cloud Engineers define infrastructure in code:
- Terraform — The industry standard for multi-cloud IaC
- CloudFormation/CDK — AWS-native infrastructure definition
- Pulumi — Infrastructure as code using general-purpose languages
- Module development — Creating reusable, parameterized infrastructure components
- GitOps workflows — Version-controlled infrastructure with PR-based changes
- Drift detection — Identifying and reconciling configuration drift
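Drift detection, the last item above, comes down to comparing what the code declares with what is actually running. The sketch below shows that comparison on plain dictionaries. The attribute names, values, and the `detect_drift` helper are hypothetical; IaC tools perform this comparison against provider APIs rather than hand-built dictionaries.

```python
ABSENT = "<absent>"  # marker for an attribute that exists on only one side


def detect_drift(declared: dict, live: dict) -> dict:
    """Return every attribute whose live value differs from the declared value."""
    drift = {}
    for key in sorted(declared.keys() | live.keys()):
        want = declared.get(key, ABSENT)
        have = live.get(key, ABSENT)
        if want != have:
            drift[key] = {"declared": want, "live": have}
    return drift


# Hypothetical resource: what the code says versus what is deployed.
declared = {"instance_type": "t3.medium", "encrypted": True, "tags": {"env": "prod"}}
live = {
    "instance_type": "t3.large",   # resized by hand in the console
    "encrypted": True,
    "tags": {"env": "prod"},
    "public_ip": "203.0.113.10",   # attached outside of code
}

for attribute, values in detect_drift(declared, live).items():
    print(attribute, values)
```

Reconciling drift then means either updating the code to match reality or reapplying the code to overwrite the manual change.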
Cost Optimization (FinOps)
Cloud costs can spiral without active management:
- Right-sizing — Matching instance types to actual workload requirements
- Reserved Instances and Savings Plans — Committing to usage, typically for one- or three-year terms, in exchange for discounts of roughly 30-70%
- Spot/Preemptible instances — Using interruptible capacity for fault-tolerant workloads
- Resource tagging — Tracking costs by team, project, environment, and application
- Cost monitoring and alerting — Detecting anomalies before bills arrive
- Architecture optimization — Redesigning for cost efficiency (serverless, auto-scaling)
- Unused resource cleanup — Identifying orphaned EBS volumes, old snapshots, idle instances
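The trade-off behind commitment discounts can be worked through with simple arithmetic: a commitment is paid for every hour whether or not the instance runs, so it only wins when utilization is high enough. The sketch below assumes a hypothetical $0.10 hourly on-demand rate, a 40% discount, and roughly 730 hours per month; none of these figures are real price points.

```python
HOURS_PER_MONTH = 730  # approximation: 8,760 hours per year / 12


def monthly_costs(on_demand_hourly: float, discount: float, utilization: float) -> tuple[float, float]:
    """Return (on-demand cost, committed cost) for one instance over a month.

    utilization is the fraction of the month the instance actually runs (0.0-1.0).
    """
    on_demand = on_demand_hourly * HOURS_PER_MONTH * utilization
    # The commitment is billed for every hour, used or not.
    committed = on_demand_hourly * (1 - discount) * HOURS_PER_MONTH
    return on_demand, committed


def break_even_utilization(discount: float) -> float:
    """Utilization above which the commitment becomes cheaper than on-demand."""
    return 1 - discount


# Hypothetical numbers for illustration only.
for utilization in (0.5, 0.9):
    on_demand, committed = monthly_costs(0.10, 0.40, utilization)
    cheaper = "commitment" if committed < on_demand else "on-demand"
    print(f"utilization {utilization:.0%}: on-demand ${on_demand:.2f}, "
          f"committed ${committed:.2f} -> {cheaper}")
```

Under these assumptions the break-even point is 60% utilization, which is why right-sizing and usage analysis usually come before any commitment purchase.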
Security and Compliance
Cloud security requires specialized knowledge:
- IAM and identity — RBAC, least privilege, identity federation, service accounts
- Network security — VPCs, security groups, NACLs, WAF, DDoS protection
- Encryption — At-rest and in-transit encryption, KMS key management
- Secrets management — AWS Secrets Manager, Azure Key Vault, GCP Secret Manager
- Compliance frameworks — SOC 2, HIPAA, PCI-DSS, GDPR in cloud context
- Security monitoring — CloudTrail, Security Hub, GuardDuty, Cloud Audit Logs
- Vulnerability management — Container scanning, AMI hardening, patch management
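As an illustration of the least-privilege item above, the following sketch scans a policy document in the AWS IAM JSON layout (`Statement`, `Effect`, `Action`, `Resource`) and flags `Allow` statements that pair wildcard actions with a wildcard resource. The sample policy, the bucket name, and the `find_wildcard_statements` helper are hypothetical, and the check is deliberately simplistic: it ignores conditions, `NotAction`, and partial wildcards in resource ARNs.

```python
def _as_list(value):
    """IAM allows a single value or a list in most fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict) -> list[dict]:
    """Return Allow statements that grant wildcard actions on all resources."""
    findings = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        broad_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if broad_actions and "*" in resources:
            findings.append(statement)
    return findings


# Hypothetical policy for illustration only.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

for statement in find_wildcard_statements(policy):
    print("Overly broad statement:", statement)
```

A check like this is a starting point for review, not a substitute for the provider's own policy analysis tooling.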
Cloud Operations
Keeping cloud infrastructure running reliably:
- Monitoring and observability — CloudWatch, Azure Monitor, Cloud Monitoring
- Backup and DR — Snapshots, cross-region replication, recovery testing
- Performance tuning — Database optimization, caching, CDN configuration
- Incident response — Cloud-specific troubleshooting and recovery procedures
- Capacity planning — Forecasting resource needs and reservations
- Auto-scaling — Configuring horizontal and vertical scaling policies
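Horizontal auto-scaling policies often follow a proportional rule: scale the instance count by the ratio of the observed metric to its target, then clamp to configured bounds. The sketch below uses that rule, which matches the formula documented for the Kubernetes Horizontal Pod Autoscaler; the function name, the CPU figures, and the bounds are assumptions for illustration, and managed auto-scalers add cooldowns and smoothing on top.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Proportional scaling: ceil(current * metric / target), clamped to bounds."""
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical scenario: 4 instances, target of 60% average CPU.
print(desired_replicas(4, 90.0, 60.0, min_replicas=2, max_replicas=10))   # scale out
print(desired_replicas(4, 20.0, 60.0, min_replicas=2, max_replicas=10))   # scale in
print(desired_replicas(8, 120.0, 60.0, min_replicas=2, max_replicas=10))  # capped at max
```

The minimum and maximum bounds matter as much as the formula: the minimum protects availability, and the maximum caps cost if the metric misbehaves.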
Cloud Engineer vs DevOps vs Platform Engineer
These roles overlap but have distinct focuses:
Cloud Engineer
Focus: Cloud platform expertise, architecture, and optimization
Key metrics: Cost efficiency, cloud utilization, migration success
Typical work: Infrastructure design, cost optimization, cloud security
Platform depth: Deep expertise in 1-2 cloud providers
DevOps Engineer
Focus: CI/CD, deployment automation, bridging dev and ops
Key metrics: Deployment frequency, lead time, change failure rate
Typical work: Pipelines, automation, developer tooling
Platform depth: Broader but shallower across many tools
Platform Engineer
Focus: Internal developer platforms and self-service
Key metrics: Developer productivity, platform adoption
Typical work: Internal tooling, developer portals, abstractions
Platform depth: Building products for developers
When to Hire Which
| Your Need | Best Fit |
|---|---|
| Migrating to cloud | Cloud Engineer |
| Cloud cost reduction | Cloud Engineer |
| Multi-cloud strategy | Cloud Engineer |
| Faster deployments | DevOps Engineer |
| Internal developer platform | Platform Engineer |
| All of the above | One of each, or a senior generalist |
Skill Levels
Junior Cloud Engineer (0-2 years)
- Provisions cloud resources using Terraform/CloudFormation
- Follows established patterns and best practices
- Needs guidance on architecture decisions
- Basic understanding of core services (EC2, S3, VPC, IAM)
- Can troubleshoot common issues with guidance
- Typical salary: $100-130K
Mid-Level Cloud Engineer (2-5 years)
- Designs cloud solutions for new applications
- Optimizes costs and performance proactively
- Handles cloud incidents independently
- Understands trade-offs in service selection
- Can lead small migration projects
- Typical salary: $130-165K
Senior Cloud Engineer (5+ years)
- Architects enterprise cloud strategy
- Sets standards and best practices for the org
- Makes build vs. buy decisions for cloud services
- Mentors other engineers and reviews designs
- Leads complex migrations and transformations
- Typical salary: $165-200K
Staff/Principal Cloud Engineer (7+ years)
- Defines multi-year cloud strategy
- Influences industry best practices
- Negotiates enterprise agreements with cloud providers
- Drives organizational change around cloud adoption
- Typical salary: $200-250K+
Career Progression
- Junior: Curiosity & fundamentals
- Mid-Level: Independence & ownership
- Senior: Architecture & leadership
- Staff/Principal: Strategy & org impact
What to Look For by Cloud Provider
AWS-Focused Teams
AWS holds the largest share of the cloud infrastructure market and offers the broadest service catalog:
- Priority skills: EC2, S3, RDS, Lambda, VPC, IAM, CloudFormation/Terraform
- Differentiators: Cost optimization experience, Well-Architected Framework knowledge
- Interview signal: "Design a highly available application on AWS. Walk me through your choices."
- Certifications: Solutions Architect (Associate/Professional)—helpful but not required
Azure-Focused Teams
Common in enterprises with Microsoft ecosystems:
- Priority skills: Virtual Machines, Azure Functions, AKS, Azure AD (now Microsoft Entra ID), Virtual Networks
- Differentiators: Hybrid cloud experience, Active Directory integration
- Interview signal: "Design a hybrid solution connecting on-prem AD to Azure services."
- Certifications: Azure Solutions Architect—helpful for enterprise validation
GCP-Focused Teams
Growing in data-heavy and ML-focused organizations:
- Priority skills: Compute Engine, Cloud Functions, GKE, BigQuery, Cloud IAM
- Differentiators: Data platform experience, BigQuery/Dataflow expertise
- Interview signal: "Design a data pipeline on GCP that handles 1TB daily ingestion."
- Certifications: Professional Cloud Architect—valuable for data engineering roles
Multi-Cloud Teams
Increasingly common for redundancy, compliance, or vendor negotiation:
- Priority skills: Cloud-agnostic design, Terraform multi-provider, Kubernetes
- Differentiators: Experience with cloud abstraction patterns
- Interview signal: "How would you design an application that can run on AWS or GCP?"
- Tools: Terraform (multi-provider), Kubernetes (portable), Pulumi
Where to Find Cloud Engineers
Cloud Engineers congregate in specific communities and tend to be active learners given how quickly cloud services evolve.
Online Communities
- daily.dev — Cloud and infrastructure content is popular among platform engineers
- r/aws, r/googlecloud, r/azure — Active Reddit communities
- AWS/GCP/Azure Slack communities — Direct access to practitioners
- CNCF Slack — Cloud-native engineers on Kubernetes and related tech
- HashiCorp Community — Terraform practitioners
Events and Conferences
- AWS re:Invent — Largest cloud conference, excellent for networking
- Google Cloud Next — GCP practitioners and announcements
- Microsoft Ignite — Azure-focused enterprise engineers
- KubeCon — Cloud-native and Kubernetes engineers
- HashiConf — Infrastructure as Code community
Sourcing Signals
Look for engineers with:
- GitHub activity — Terraform modules, CloudFormation templates, cloud automation
- Blog posts — Architecture write-ups, migration stories, cost optimization tips
- Cloud certifications + production experience — Certifications alone aren't enough
- Open source contributions — Cloud-related projects, Terraform providers
- Conference talks — Speaking indicates depth and thought leadership
Hidden Talent Pools
- Consulting backgrounds — Big 4 cloud practices, boutique cloud consultancies
- Startups that scaled — Engineers who grew infrastructure from seed to scale
- Cloud provider alumni — AWS, GCP, Azure professional services teams
- Managed service providers — Engineers who've seen many cloud environments
Common Hiring Mistakes
1. Overweighting Certifications
Cloud certifications (AWS Solutions Architect, Azure Administrator) demonstrate theoretical knowledge but don't guarantee practical skills. Many certified engineers have never built production systems. Test for real experience: What have they architected? What cost problems have they solved? What incidents have they handled?
2. Requiring All Cloud Providers
"Must have AWS AND Azure AND GCP" is unrealistic. Most companies use one primary cloud, and strong cloud engineers typically pick up a new provider in 2-3 months because the core concepts carry over. Focus on depth with your primary provider and on solid cloud fundamentals.
3. Ignoring Cost Optimization Skills
Cloud costs spiral without optimization. A single misconfigured service can cost $10K/month. Look for candidates who naturally think about cost: right-sizing, reserved capacity, architecture efficiency. Ask for specific examples with numbers.
4. Not Testing Architecture Skills
Cloud Engineers design solutions, not just provision resources. A system design interview is essential: "Design a scalable application on [cloud provider]." Can they select appropriate services? Explain trade-offs? Design for cost and reliability?
5. Overlooking Migration Experience
If you're moving to cloud, migration experience is gold. It's different from greenfield cloud work—dealing with legacy constraints, data migration, cutover coordination, and hybrid periods. Ask about past migrations specifically.
6. Confusing Cloud Engineer with DevOps
Cloud Engineers specialize in cloud platforms. DevOps Engineers may use cloud but focus on CI/CD and developer workflows. Hiring a DevOps engineer for deep cloud architecture work (or vice versa) leads to mismatched expectations.
Developer Expectations vs. Trust Breakers
| Aspect | What Cloud Engineers Expect | What Breaks Trust |
|---|---|---|
| Cloud Strategy | Clear direction on cloud adoption and investment | "We're on AWS but might switch to Azure" with no plan |
| Infrastructure as Code | Everything in Terraform/CDK, no ClickOps | "Just use the console for quick changes" |
| Cost Visibility | Access to billing data, authority to optimize | Surprise about costs, no optimization budget |
| Architecture Ownership | Authority to make infrastructure decisions | Mandated services without engineering input |
| Modern Services | Freedom to use managed/serverless where appropriate | "We only use EC2" regardless of use case |
| On-call | Shared rotation, fair compensation, runbooks | Cloud team handles all infrastructure incidents |
Interview Approach
Technical Assessment
- Cloud architecture design — "Design a highly available e-commerce platform on AWS"
- Cost optimization — "This infrastructure costs $50K/month. Walk me through your optimization approach."
- IaC review — Review their Terraform/CloudFormation code for best practices
- Troubleshooting — "Application latency spiked after a deployment. How would you investigate in AWS?"
Experience Deep-Dive
- Past architectures — What have they designed? At what scale? What constraints?
- Migrations — Have they migrated workloads? What were the challenges?
- Cost optimization — Specific examples of reducing cloud spend with numbers
- Incidents — Cloud-specific incidents and how they resolved them
Cloud Provider Knowledge
- Service selection — "When would you use Aurora vs. DynamoDB vs. ElastiCache?"
- Networking — VPC design, Transit Gateway, hybrid connectivity
- Security — IAM best practices, encryption strategies, compliance approach
- Limits and quotas — Understanding cloud provider constraints and workarounds
Recruiter's Cheat Sheet
Resume Green Flags
- Terraform with module design experience (not just basic usage)
- Specific cloud services with production context (not just listed)
- Cost optimization examples with dollar amounts or percentages
- Migration projects with scope and outcomes
- Architecture design work (not just resource provisioning)
- Experience with multiple cloud providers (shows adaptability)
Resume Yellow Flags
- Only certifications, no production experience described
- Vague "cloud experience" without specific services
- No mention of cost optimization or efficiency
- Only one cloud provider after 5+ years (may lack breadth)
- "DevOps" title but only cloud work described (may be a mismatch)
Technical Terms to Know
| Term | What It Means |
|---|---|
| IaC | Infrastructure as Code—defining cloud resources in code (Terraform, CloudFormation) |
| VPC | Virtual Private Cloud—an isolated network environment in the cloud |
| IAM | Identity and Access Management—who can do what in the cloud |
| EC2/Compute Engine/VM | Virtual servers (AWS/GCP/Azure terminology) |
| S3/GCS/Blob Storage | Object storage for files and data |
| RDS/Cloud SQL/Azure SQL | Managed relational databases |
| Lambda/Cloud Functions/Azure Functions | Serverless compute—run code without managing servers |
| EKS/GKE/AKS | Managed Kubernetes services |
| Multi-AZ | Deploying across multiple availability zones (isolated data centers within a region) for availability |
| FinOps | Financial operations—managing cloud costs as an engineering concern |
| Well-Architected | AWS framework for evaluating cloud architecture quality |
| Reserved Instances | Committing to capacity in advance for a discount (roughly 30-70% savings) |
Developer Expectations
| Aspect | What They Expect | What Breaks Trust |
|---|---|---|
| Cloud Strategy | Clear cloud direction with investment and executive support | Unclear multi-cloud strategy, constant second-guessing of cloud decisions |
| Infrastructure as Code | Everything in Terraform/CDK, GitOps workflows, no manual console changes | "Just do it in the console for now" or ClickOps culture |
| Cost Visibility | Access to billing data, authority and time to optimize | No visibility into costs, optimization treated as distraction from "real work" |
| Architecture Ownership | Authority to make cloud architecture decisions for their domain | Mandated services without engineering input, decisions made by non-technical leadership |
| Modern Tooling | Freedom to use managed services, serverless, and cloud-native patterns | "We only use EC2 and self-managed databases" regardless of use case |