Job Purpose:
The DevOps Team Lead sits at the intersection of technical expertise, operational reliability, and project delivery. This role is responsible for leading a team of Systems/Platform engineers to design, implement, and maintain secure, scalable, and highly available infrastructure across AWS, Azure, Google Cloud, and on‑premise environments. The position owns the end‑to‑end application delivery platform (CI/CD, Kubernetes, GitLab, ArgoCD, Helm), observability stack, and continuous ISO/IEC 27001 compliance within the team, ensuring timely delivery of high‑quality infrastructure services that support business objectives.

Key Responsibilities

Infrastructure & IaC Management
Lead the design, implementation, and maintenance of infrastructure across AWS, Azure, Google Cloud, and on‑premise servers.
Champion Infrastructure as Code (IaC) practices using tools such as Terraform, Terragrunt, CloudFormation, or equivalent to provision, configure, and manage infrastructure in a repeatable and auditable way.
Ensure environments are standardized, secure, cost‑optimized, and aligned with architecture and security guidelines.

Application Delivery & Platform Engineering
Own and evolve the application delivery platform using GitLab CI, ArgoCD, Helm charts, and Kubernetes.
Design and maintain CI/CD pipelines to support reliable, frequent, and automated application deployments across environments.
Establish best practices and guardrails for Kubernetes cluster configuration, namespace management, Helm chart management, and deployment strategies (e.g., blue/green, canary).
Collaborate closely with development teams to ensure smooth, predictable, and observable releases.

Monitoring, Logging & Alerting
Lead the design, implementation, and continuous improvement of the observability stack, including Prometheus, Thanos, Alertmanager, Grafana, Kibana, and Elasticsearch.
Define and maintain monitoring standards, SLOs/SLIs, dashboards, and alerting rules to ensure early detection and rapid resolution of incidents.
Ensure logs, metrics, and traces are consistently collected, stored, and accessible for troubleshooting, performance tuning, and capacity planning.

Compliance & Information Security (ISO/IEC 27001)
Lead the implementation, documentation, and continuous maintenance of the ISO/IEC 27001 Information Security Management System (ISMS) within the team.
Ensure infrastructure, platforms, and operational processes adhere to information security policies, controls, and audit requirements.
Collaborate with Information Security, Risk, and Compliance stakeholders to support audits, risk assessments, and corrective actions.
Promote a culture of security and compliance awareness within the team and across collaborating functions.

Team Leadership & People Management
Lead, mentor, and develop a team of Systems/Platform engineers; provide regular feedback, support career growth, and foster a high‑performance culture.
Plan and prioritize team workload, ensuring timely delivery of projects, BAU tasks, and incident resolution.
Promote knowledge sharing, documentation, and cross‑training to reduce single points of failure.

Collaboration
Work closely with software development, security, network, and service desk teams to ensure infrastructure and platforms meet business and operational requirements.
Translate business needs into technical solutions, set expectations, and communicate clearly on progress, risks, and timelines.
Participate in architecture and design discussions, contributing infrastructure and operations perspectives.

Reliability, Incident & Problem Management
Oversee incident response, including triage, communication, and coordination with relevant teams to minimize downtime and impact.
Drive root cause analysis (RCA) and implement corrective and preventive actions for recurring issues.
Continuously improve operational processes, runbooks, and standard operating procedures.