Primary Responsibilities:
• Provide L3 support for clients' private cloud, including on-call rotation
• Work closely with the internal engineering team and provide input on testing of new component releases and infrastructure upgrades, as well as performance, capacity, and monitoring
• Create and improve processes for support, including training, documentation, customer engagement, automation and scripting, and incident, problem, and change management
• Work together with L2 teams and other L3 team members internationally
Required Skills:
• 5 to 7 years of relevant experience
• 3 to 5 years of Linux experience
• Sound knowledge of server infrastructure, virtualization, and cloud computing
• Proven Kubernetes and Docker experience
• Excellent understanding of internet and networking protocols, including TCP/IP and HTTP/HTTPS
• Strong understanding of security protocols, e.g. SSL/TLS, Kerberos
• Strong organizational skills and the ability to manage multiple tasks and high-pressure situations during outage resolution
• Experience with Agile and DevOps/SRE concepts
• Administrative competence in at least one major scripting language or platform (for example, Python)
• Ability to communicate effectively with various user groups, e.g. developers and engineers, as well as remote team members
Nice to have:
• Knowledge of system monitoring in cloud environments, including cloud-specific products and tools
• Experience in developing monitoring architecture and implementing monitoring agents, dashboards, and alerts
• Experience operating in large enterprise environments
• Experience with maintaining high-availability production systems
• Experience in enterprise-level hosting environments, in particular cloud and container technologies