Senior RelEng/DevOps Engineer

Frugal - 3 jobs
Ottawa, ON
Posted 2 days ago
Job details:
Remote
Full-time
Executive
Benefits:
Flexible work arrangements
Stock options

About Frugal

Frugal is an AI-powered coding agent purpose-built to tackle one of the most persistent problems in tech: runaway cloud costs. Despite years of optimization efforts, cloud expenses remain high—and with AI workloads on the rise, the problem is about to get much worse. While existing tools help right-size infrastructure, they overlook a major driver of inefficiency: the application code itself.

That's where Frugal comes in. Our agent analyzes source code, cloud billing data, and observability signals to pinpoint inefficient code patterns, recommend improvements, and even automate fixes via pull requests. Unlike traditional FinOps approaches, which struggle to engage developers, Frugal embeds directly into their workflow, making cost optimization a seamless part of the development lifecycle.

We're looking for driven, curious, and talented people to join our team. Help us empower developers to reduce the cost of their applications.

We offer competitive salaries, benefits programs, stock options, and flexible work-from-home options.

At this time, we are focusing on hiring for our Ottawa, Canada office.

About the Role

As a Senior Release Engineer / DevOps Engineer at our seed-stage SaaS startup in Ottawa (hybrid), you own the delivery pipeline from code commit to production across AWS, GCP, and Azure environments.

You design, build, and maintain CI/CD systems, ensuring reliable, secure, and automated deployments that enable rapid feature delivery while maintaining enterprise-grade stability.

You manage infrastructure-as-code, establishing consistent environments, monitoring stack health, and optimizing cloud resource utilization, including LLM execution, across multi-cloud deployments.

You collaborate with other engineering team members to implement deployment strategies, rollback procedures, test automation, and release gates that minimize risk while maximizing development velocity.

You establish observability and monitoring frameworks using Datadog, configuring metrics, dashboards, and alerting systems that provide visibility into system performance, security posture, and compliance requirements (SOC 2).

You drive automation initiatives that eliminate manual processes, reduce deployment friction, and enable self-service capabilities for development teams.

You partner with your team as well as the VP of Engineering and CTO to architect scalable infrastructure patterns, evaluate emerging DevOps tools, and establish operational excellence standards.

You mentor team members on DevOps best practices, foster a culture of reliability and continuous improvement, and stay current with cloud-native technologies and security frameworks.

This role offers significant impact as you build the foundation that enables our platform and engineering organization to scale efficiently during this critical growth phase.

Must Have:

  • 5+ years of DevOps/Site Reliability Engineering experience in production environments
  • Strong experience with at least two major cloud platforms (AWS, GCP, Azure)
  • Proficiency with Infrastructure as Code tools (Terraform, CloudFormation, or Pulumi)
  • Experience with container orchestration (Kubernetes, Docker) and service mesh technologies
  • Solid understanding of networking, security groups, load balancing, and DNS management
  • Proven experience building and maintaining CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, or similar)
  • Strong scripting skills (Python, Bash, PowerShell) for automation and tooling
  • Experience with configuration management tools (Ansible, Chef, Puppet)
  • Knowledge of deployment strategies (blue-green, canary, rolling deployments)
  • Hands-on experience with monitoring platforms (Datadog preferred, or similar tools like New Relic, Prometheus/Grafana)
  • Understanding of logging, metrics, tracing, and alerting best practices
  • Experience setting up dashboards, SLIs/SLOs, and incident response procedures
  • Knowledge of security best practices for cloud infrastructure and CI/CD pipelines
  • Understanding of secrets management, vulnerability scanning, and security automation
  • Experience working in fast-paced, early-stage environments with ambiguous requirements
  • Ability to translate business requirements into technical solutions
  • Comfort with wearing multiple hats and adapting to changing priorities
  • Self-directed work style with strong ownership mentality

Nice to Have:

  • Previous experience in SaaS startups or high-growth technology companies
  • Experience with LLM infrastructure, or AI/ML deployment pipelines
  • Previous experience with FinOps, cloud cost management, or infrastructure optimization
  • Background in performance testing and load testing frameworks
  • Experience with compliance frameworks (SOC 2, FedRAMP)
