We're seeking an experienced Lead DevOps Engineer to spearhead our critical infrastructure transformation as PAVE.ai scales to enterprise level. This role will lead the strategic migration from Google Cloud Platform to AWS while building and managing a high-performing DevOps team.

As Lead DevOps Engineer at PAVE.ai, you'll architect enterprise-grade infrastructure, establish site reliability engineering practices, and ensure 99.9%+ uptime for our vehicle inspection platform serving global automotive enterprises. This is a pivotal role that will define our infrastructure strategy and operational excellence as we process millions of vehicle inspections for dealerships, fleet operators, insurers, and vehicle marketplaces worldwide.

Cloud Migration Leadership
- Lead and execute the complete migration strategy from GCP to AWS, ensuring zero downtime
- Design and implement AWS enterprise architecture following Well-Architected Framework principles
- Create detailed migration roadmaps with clear milestones, risk assessments, and rollback plans
- Architect hybrid cloud solutions during the transition phase to maintain business continuity
- Optimize costs during and after migration while improving performance and reliability
- Document migration processes and create runbooks for knowledge transfer

Team Leadership & Development
- Build and lead a world-class DevOps team, including hiring, mentoring, and performance management
- Define team structure, roles, and responsibilities for 24/7 operational coverage
- Establish DevOps culture and best practices across the engineering organization
- Create career development paths and training programs for team members
- Foster collaboration between DevOps, development, and security teams
- Lead incident response and post-mortem processes to drive continuous improvement

Site Reliability Engineering (SRE)
- Establish and maintain SLIs, SLOs, and SLAs for all critical services
- Design and implement comprehensive monitoring and observability strategies
- Build automated incident detection and response systems
- Ensure 99.9%+ uptime for production systems through proactive reliability engineering
- Implement chaos engineering practices to identify and fix potential failures
- Create capacity planning models to support 10x growth

Infrastructure & Automation
- Design scalable, secure, and cost-effective AWS infrastructure for enterprise workloads
- Implement Infrastructure as Code (IaC) using Terraform/CloudFormation
- Build CI/CD pipelines supporting multiple deployment strategies (blue-green, canary)
- Automate security compliance and governance using AWS native tools
- Implement auto-scaling and self-healing infrastructure
- Design disaster recovery and business continuity strategies
- Develop and enhance logging systems and observability tools (ongoing improvement initiative)

Enterprise Platform Development
- Architect multi-tenant infrastructure supporting enterprise isolation requirements
- Implement enterprise-grade security including VPN, SSO, and zero-trust networking
- Design data residency and compliance solutions for global operations
- Build platform services for logging, monitoring, secrets management, and service mesh
- Create developer self-service platforms to accelerate delivery
- Establish FinOps practices for cloud cost optimization

Strategic Planning
- Develop a long-term infrastructure roadmap aligned with business objectives
- Partner with leadership to define technology strategy and investments
- Evaluate and introduce new technologies to improve operational efficiency
- Create business cases for infrastructure investments with ROI analysis
- Establish vendor relationships and manage AWS enterprise support
- Drive infrastructure standardization and consolidation initiatives

Success Metrics
- Complete the GCP-to-AWS migration within 6 months with zero critical incidents
- Achieve and maintain 99.9% uptime across all production services
- Reduce infrastructure costs by 30% while improving performance
- Build and retain a high-performing DevOps team with <10% attrition
- Increase deployment frequency from weekly to multiple times daily
- Reduce MTTR (Mean Time To Recovery) by 50%