Observability Engineer - Platform Reliability (Junior to Mid-Level)

Releady

Toronto, ON

Posted today

Job Details:

Full-time

Student

OVERVIEW

We're seeking a Junior to Mid-Level Observability Engineer (3+ years of experience) to support the tools and systems that provide platform visibility across a large-scale, cloud-native environment. This role ensures that our product and infrastructure teams can monitor performance, detect issues, and maintain platform reliability while continuing to ship high-quality features at speed.

If you've worked in Site Reliability Engineering (SRE), Platform Operations, or DevOps and have a strong interest in observability tooling, this role offers the chance to deepen your expertise in monitoring, automation, and infrastructure reliability at scale.

Open to candidates who reside in Alberta or Ontario - Fully Remote (Canada Only)

Not open to sponsorship / worker must be engaged as a T4 employee

Contract role: 6+ months

Rate: $40 - $45 CAD per hour (this is the client's maximum budget; any flexibility is still to be confirmed)

RESPONSIBILITIES

Support production-grade services with a focus on availability and performance.
Help design and operate resilient systems capable of scaling with business growth.
Deploy and manage observability tools across metrics, logs, and traces.
Automate and improve processes related to infrastructure monitoring and reliability.

Potential Projects:

Diagnose and resolve performance bottlenecks or system incidents.
Build or refine automation scripts and operational workflows.
Provide input on load testing, performance tuning, and tooling enhancements.
Collaborate with engineering teams to improve visibility into their systems and services.
Support infrastructure provisioning using infrastructure-as-code practices.

QUALIFICATIONS

3+ years of experience in SRE, DevOps, or platform operations roles.
Solid background with public cloud platforms such as AWS, GCP, or Azure.
Hands-on experience with observability tools such as Datadog (preferred) or similar.
Familiarity with error monitoring tools like Sentry (preferred) or alternatives.
Proficiency in infrastructure-as-code tools like Terraform.
Experience coding in Golang, Node.js, or Python.
Proven ability to monitor and support distributed systems at scale.
Strong understanding of microservices and container orchestration using Kubernetes.
Collaborative mindset with a willingness to contribute ideas and challenge assumptions.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability status, or any other non-merit factor. We are committed to creating a diverse and inclusive environment for all employees.

