Job Description
Core Responsibilities
- Own system monitoring, create alerting on failures, and take responsibility for those failures when they happen.
- Provide a clear set of implementation guidelines that result in reliable systems and high change velocity.
- Work to eliminate toil and other repetitive manual work that provides very little value, and meet the applications' agreed service levels.
- Acknowledge that failures happen and establish an incident management process that makes efficient use of the team's time.
- Work with agile development methodologies, adhering to best practices and pursuing continued learning opportunities including the latest changes in technology and new SRE practices.
- Conduct postmortems to identify the root cause of each issue and put procedures or software tools in place to prevent it from recurring.
- Take care of administration tasks but automate them whenever possible.
Functional Responsibilities
- Lead a team of experienced Site Reliability Engineers (Linux) in both Vietnam and the US.
- Design, deploy/install, configure, automate, and maintain systems infrastructure and applications.
- Support a 24×7 critical application that serves millions of customers daily.
- Be the go-to reference for UNIX-like operating systems, Docker, and Kubernetes, and handle internal and external escalations and on-call duty.
- Maintain systems and troubleshoot system issues by working closely with the development and DevOps teams to manage system monitoring and alerting on failures.
- Identify bottlenecks in various Linux applications and implement performance improvements.
- Prioritize, assign, and execute tasks throughout the software development life cycle.
- Develop, configure, and deploy tools for cloud-based systems and services.
- Containerize new and legacy applications to help limit downtime and increase scalability and reliability.
- Support development and operations teams as requested and enhance, modify, or debug developer code as needed.
Employment Type
Salary
Negotiable