Tech Insights
Montreal, QC
Closed
Job Details:
Drive cloud operations excellence as a Site Reliability Engineer in a remote environment. Your role focuses on AWS infrastructure design, automation, and ensuring high availability across platforms. We seek a skilled engineer with cloud operations experience to monitor and optimize system performance while collaborating with development teams on service goals. You will also lead incident response and develop CI/CD pipelines, enhancing operational efficiency and reliability in a multi-region setup.

Key Responsibilities:
• Implement high-availability infrastructure across AWS
• Monitor resource utilization using Datadog and CloudWatch
• Develop infrastructure-as-code solutions with Terraform
• Manage containerization platforms using Docker and Kubernetes
• Conduct thorough post-incident reviews for continuous improvement

Requirements:
• 5-7 years of relevant experience in SRE or DevOps
• Deep understanding of AWS services and hybrid environments
• Proficiency in Python, Go, or Java
• Experience in CI/CD pipeline development
• Solid knowledge of Linux/Unix systems

Bring your skills to enhance system reliability and optimize cloud operations in this dynamic remote role.