Senior Site Reliability Specialist (SRE)

GHGSat

Montreal, QC

Posted today

Job details:

Full-time

Experienced

Benefits:

Dental insurance

Paid time off

Flexible work arrangements

Stock options

Location: Hybrid (Montreal)

Clearance Requirement: Successful candidates must be eligible for Canadian government security clearance (Controlled Goods Program) and will be expected to participate in a rotational on-call schedule.

GHGSat is on a mission to tackle climate change with hard data. We operate the world's first and only satellite constellation capable of detecting and measuring greenhouse gas emissions — down to the facility level — from space. Our customers are leading industrial operators and governments around the world. We help them see their emissions clearly and act faster.

As a Senior Site Reliability Specialist (SRE) at GHGSat, you'll be at the center of that mission. You'll join our Digital Infrastructure team and help build and operate the foundation that powers our satellite data processing, analytics pipelines, and customer platforms. This is not a "keep the lights on" role: you'll help evolve our infrastructure, increase reliability, and shape how modern SRE is practiced at GHGSat.

Requirements

What You'll Do:

Design and evolve our infrastructure using modern SRE principles — from infrastructure-as-code to self-healing systems and robust observability.
Operate and optimize GHGSat's cloud and on-prem services, including Kubernetes clusters, CI/CD pipelines, artifact registries, and custom workloads.
Build and refine our observability stack, including logs, metrics, traces, and actionable alerts.
Automate relentlessly — from security audits to deployment flows. If it's repetitive, it's a script waiting to happen.
Improve and secure workflows for 80+ engineers and researchers across cloud and hybrid environments.
Own and mature our IAM strategy, including auditing, lifecycle management, and tooling across Azure AD, AWS IAM, and internal systems.
Lead by example in championing best practices around security, ops hygiene, and incident readiness.
Collaborate closely with dev and research teams to understand their systems and ensure SRE supports their velocity rather than slowing it down.

About You:

You have 5–10 years of experience in SRE, DevOps, or Systems Engineering roles in fast-moving tech environments.
You are deeply comfortable with Linux, Kubernetes, and cloud-native infrastructure (we primarily use AWS).
You treat Infrastructure as Code (OpenTofu, Ansible, etc.) as a given, not an extra.
You have practical experience with monitoring, alerting, and incident response — and you've helped teams learn from failure.
You have a solid understanding of cybersecurity best practices — especially in securing distributed systems and developer workflows.
You've supported CI/CD, container builds, and artifact lifecycles in production environments.
You're a great communicator who understands that trust, empathy, and clarity are as important as your shell scripts.
Bonus: you're bilingual (French/English) and/or excited about space, science, and climate impact.
