Position Overview
We are looking for someone armed with a strong toolkit to develop and maintain technical solutions that adhere to engineering and architectural design principles while meeting business requirements. You'll be a technology owner providing technical expertise with a focus on efficiency, reliability, scalability, and security, which includes planning, evaluating, designing, operationalizing, and supporting solutions in compliance with enterprise and industry standards. The ideal candidate is willing and able to research, maintain, configure, administer, and provision infrastructure, applications, and services across our platforms.
- Understand architecture diagrams and provide input into application design
- Perform systems administration: monitor, configure, back up, authenticate, tune, maintain, install, and script applications, services, and systems.
- Script installs and stand up infrastructure in both private and public cloud environments.
- Identify issues, develop and maintain processes that address and resolve them, and alert stakeholders as needed.
- Design, implement, and maintain an automated build and install/deploy process; develop and maintain build scripts for projects and products.
- Perform Release Engineering functions for either cloud or non-cloud services, products and platforms
- Ensure effective change management (using ServiceNow).
- Provide specialized support (such as research, installation, configuration, and L3 support) that meets or exceeds established standards and service levels while minimizing operational risk.
- Design, review, and integrate infrastructure and application requirements (non-functional, security, integration, performance, quality, operations, etc.).
- Build and deploy base infrastructure components (e.g., Azure capabilities including Virtual Machines, ASE, AKS, Blob storage, and geo-replication) and application services for all environments. Help evolve the base infrastructure and operational environment, and deploy new technologies in Azure and other cloud providers.
- Maintain base infrastructure components, and work with vendors (Azure) to report problems and receive fixes.
- Create and document disaster and business recovery plans and procedures.
Requirements
We are looking for an individual with a strong engineering mindset, a sense of ownership, and strong organizational, follow-up, and priority-setting skills who can handle highly complex, multifaceted assignments and work independently.
- Undergraduate Degree or Technical Certificate
- 5-10 years of relevant experience
- Appetite for contributing within a complex and critical environment
- Expert knowledge of a specific domain or a range of engineering frameworks, development practices, technologies, tools, processes, standards, and procedures, as well as organizational issues. Experience as a primary subject matter expert in multiple areas and as a consultant on all aspects of technology and solutions
- Experience deploying, managing, and operating complex applications in a cloud environment (e.g., Azure)
- Understanding of shell scripting, PowerShell, and Python, and the ability to write code for automation
- Understanding of critical concepts in DevOps (CI, CD, CM, IaC, etc.) and Agile principles
- Readiness and motivation (as senior or lead developer and valued subject matter expert) to address and resolve highly complex and multifaceted development-related issues, often independently.
- Excellent troubleshooting skills
- Experience in infrastructure, service, and application monitoring and logging
- Experience configuring and managing big data technologies and databases, along with an understanding of various approaches to data storage and indexing, is an asset
Must-have:
- Linux OS administration experience
- Hadoop / Cassandra administration
- Cloud experience, e.g., Azure services (including IaaS, AKS, ADLS, ADF), AWS, or GCP
- Configuration management tools, e.g., Salt, Terraform, Ansible, or Chef
- Development/engineering experience, e.g., Bash, shell, Python
- Excellent problem-solving skills, engineering mindset (must be able to demonstrate this in interviews)
- Jenkins, GitHub, Bitbucket, Nexus, or similar toolsets
Nice to have:
- Windows experience
- High-Performance Computing clusters (e.g. HPC Pack)
- Splunk, Datadog, ITRS, Sensu, Foglight, ELK
- AutoSys, ServiceNow, Jira, Confluence
- Databricks, Blob storage, Event Hubs, and similar services