We are seeking a highly skilled and experienced AWS Monitoring & Operation Specialist to oversee system operations, monitor infrastructure health, and ensure optimal performance and security. This role involves working within a customer-provided environment to deliver high-quality support and continuous improvement.

Key Responsibilities

System Monitoring & Alert Management
- Continuously monitor system health using tools such as Amazon CloudWatch, Prometheus, Zabbix, etc.
- Respond promptly to alerts and incident notifications.
- Investigate root causes and execute recovery actions.
- Recommend and implement preventive measures to avoid recurrence.

Performance Optimization
- Analyze system performance and resource utilization.
- Propose and implement tuning strategies to enhance performance and optimize costs.

Backup & Restoration
- Perform and verify scheduled system backups.
- Ensure reliable and timely recovery procedures in the event of system failures or incidents.

Account & Security Management
- Manage user accounts, roles, and permissions.
- Conduct access reviews, vulnerability assessments, and security audits.
- Define, implement, and enforce security policies and standards.

System Maintenance
- Apply operating system and middleware updates as required.
- Deploy security patches and perform batch update activities in a timely manner.

Technical Support & Consultation
- Provide technical guidance and troubleshooting support to internal teams.
- Contribute to the development and maintenance of operational documentation and best practices.
Required Skills & Qualifications:

Technical Skills:
- Experience with AWS services (EC2, S3, RDS, IAM, CloudWatch, Lambda, etc.).
- Experience with AWS CLI, SDKs, and Infrastructure as Code (e.g., CloudFormation, Terraform).
- Experience with monitoring tools (CloudWatch, Prometheus, Zabbix, etc.).
- Experience with Linux/Unix/Windows system administration.
- Experience with scripting and automation (e.g., Bash, Python, Ansible).
- Knowledge of AWS security best practices, access control, and vulnerability scanning.
- Experience with backup strategies and disaster recovery planning.

Soft Skills:
- Strong analytical and problem-solving skills.
- Effective communication and documentation abilities in English.
- Japanese communication and documentation skills are a plus.
- Capable of working both independently and as part of a team.
- Proactive and customer-focused mindset with a commitment to delivering quality support.