Our client is seeking a DevOps Engineer who brings hands-on experience with observability tooling, system performance monitoring, and security automation.
This role will play a key part in maintaining system reliability, performance, and security across the client's cloud infrastructure, applying principles from Google's Site Reliability Engineering (SRE) book.
Duration: 3 months, with an option to extend
Location: Remote
Responsibilities
- Develop and maintain custom Logstash modules for log ingestion, parsing, and enrichment tailored to enterprise needs.
- Configure and manage Azure Monitor and related telemetry tools for real-time system and application performance monitoring.
- Implement uptime monitoring, alerting, and incident response pipelines using industry best practices.
- Apply Google SRE principles to improve system availability, reliability, and scalability.
- Collaborate with development teams to integrate DevSecOps practices into CI/CD pipelines, focusing on security as code.
- Automate operational tasks and system diagnostics to reduce manual effort and improve response times.
- Conduct periodic reviews of infrastructure and application security posture and implement proactive remediation steps.
- Support disaster recovery, rollback planning, and real-time issue resolution during critical outages.
Requirements
- 5+ years of experience in DevOps or DevSecOps roles.
- Proficiency with Logstash, the Elastic Stack, and custom plugin/module development.
- Experience with Azure Monitor, Application Insights, and Log Analytics.
- Familiarity with Google SRE practices, including defining and implementing SLIs, SLOs, and error budgets.
- Strong scripting skills (e.g., PowerShell, Bash, Python).
- Solid understanding of CI/CD pipelines, infrastructure-as-code (e.g., Terraform or Bicep), and containerization (Docker, Kubernetes).
- Experience integrating security controls into DevOps workflows.
- Excellent troubleshooting, communication, and collaboration skills.