Position Name - HPC System Administrator
Type of hiring - Full-time/Subcontract
Location - Ottawa, ON (Onsite)
Bilingualism (English/French)
Reliability Clearance - Yes (or must be able to obtain it)
Job Description:
Years of experience: 5+ years in the subject matter area
Main Responsibilities
- Identify, diagnose, and resolve level-two problems for users of software and hardware, LAN and WAN, VPN, the Internet, mobile devices, and new computer technology; communicate solutions to end users.
- Manage day-to-day operations and support of the HPC environment (Linux).
- Take ownership of capacity, availability and performance of the HPC cluster(s).
- Support end users in the submission and management of jobs based on Slurm and OpenHPC.
- Migrate existing nodes to Linux as required.
- Administer identity management and multifactor authentication, including integration between Active Directory and Linux platforms.
Specialized Skills, Knowledge & Abilities
- In-depth, demonstrated experience installing and operating Linux platforms (Ubuntu/Red Hat) in an enterprise environment.
- Experience using KVM or other hypervisors.
- Experience with HPC tools such as Slurm, LSF, or Grid Engine.
- Demonstrated knowledge of HPC clusters and use cases.
- Bilingualism (English/French) is an asset.