Job Description
1. Site Reliability Engineering & Infrastructure Management (35%)
• Define SRE roadmap: Build and implement an SRE roadmap to create cross-functional systems that meet the company's scalability requirements
• Multi-cloud deployment: Deploy and manage services across on-premises, Azure, and AWS environments
• Infrastructure as Code: Implement deployment and configuration automation using tools like Terraform, Ansible, or CloudFormation
• System monitoring: Set up and maintain monitoring systems (Prometheus, Grafana) to identify potential issues
• Performance optimization: Optimize the performance of services including Elasticsearch and Logstash, as well as overall system performance
• Ensure system and service continuity and implement disaster recovery strategies

2. DevOps & CI/CD Pipeline (25%)
• Application modernization: Standardize and automate application development and deployment pipelines
• CI/CD implementation: Design and maintain CI/CD pipelines using GitLab CI, Jenkins, or Azure DevOps
• Configuration management: Manage and control deployment flows and standardize configurations for traceability
• Container orchestration: Deploy and manage containerized applications using Docker and Kubernetes
• GitOps practices: Implement GitOps workflows with ArgoCD or FluxCD
• Secret management: Implement secure secret management solutions

3. MLOps & Machine Learning Infrastructure (25%)
• ML pipeline automation: Design and implement automated ML pipelines for SmartCity AI models
• Model deployment & serving: Set up model training, versioning, and serving infrastructure
• Experiment tracking: Implement experiment tracking and model registry systems
• Data pipeline management: Manage data pipelines and data lakes for ML workloads
• Feature stores: Set up and maintain feature stores for ML model consistency
• ML monitoring: Monitor ML model performance and automate drift detection and retraining
• A/B testing infrastructure: Design infrastructure for A/B testing ML models in production

4. SmartCity Specific Infrastructure (15%)
• IoT infrastructure: Manage infrastructure for IoT device integration and data streaming
• Edge computing: Deploy and manage edge computing solutions for real-time processing
• Data streaming: Implement real-time data streaming with Kafka or similar tools for SmartCity applications
• Security compliance: Ensure compliance with security standards for government/municipal systems
• High availability: Design fault-tolerant systems for critical SmartCity services
Job Requirements
Required Qualifications:
• Bachelor's degree in Computer Science, Software Engineering, Information Technology, or equivalent
• 2+ years of DevOps/SRE experience, including Linux and Site Reliability Engineering responsibilities
• Good understanding of Docker and Kubernetes container orchestration
• Experience with monitoring systems: Grafana, Prometheus
• Hands-on experience with CI/CD tools: GitLab CI, Jenkins, or Azure DevOps
• Experience with centralized logging solutions: ELK stack or similar
• Cloud experience: AWS or Azure (deployment, management, optimization)
• Programming skills: Bash and Python for automation and scripting
Employment Type
Full-time
Salary
Negotiable