MLOps Engineer

Koda Staff
Toronto, ON
Job Details:
Full-time
Experienced

Our client is on a mission to make AI smarter, faster, and everywhere. If you live and breathe machine learning pipelines, cloud platforms, and deployment automation, they want you on their team of tech rebels.

They're not looking for mere mortals: they want an MLOps guru ready to optimize, automate, and scale machine learning to the moon.

What you'll be doing:

  • Design, develop, and optimize ML pipelines like a wizard, ensuring training, validation, and inference run smoothly at scale.
  • Automate deployment of deep learning and generative AI models for real-time applications (because who wants to deploy manually?).
  • Implement model versioning and rollbacks, and ensure seamless updates like a smooth operator.
  • Deploy and manage ML models on cloud platforms (AWS, GCP, Azure) with your containerization magic (hello, Docker and Kubernetes).
  • Optimize real-time inference performance, bringing TensorRT, ONNX, and PyTorch to their full glory.
  • Work with GPU acceleration, distributed computing, and parallel processing to make AI workloads faster than a rocket.
  • Fine-tune models to slice latency and boost scalability, because who likes slow models?
  • Build and maintain CI/CD pipelines for ML models (GitHub Actions, Jenkins, ArgoCD) to make life easier.
  • Automate retraining and deployment to ensure the AI is always learning (and never slacking).
  • Develop monitoring solutions to track model drift, data integrity, and performance, because we don't believe in letting things slide.
  • Stay on top of security, data privacy, and AI ethics standards, because we care about doing things right.

What we need from you:

  • 5+ years of experience in MLOps, DevOps, or AI model deployment (you've been around the block).
  • Mastery of Python and ML frameworks like TensorFlow, PyTorch, and ONNX (you know these like the back of your hand).
  • You've deployed models using Docker, Kubernetes, and serverless architectures-it's second nature to you.
  • Hands-on experience with ML pipeline tools (Argo Workflows, Kubeflow, MLflow, Airflow); you've built some mean pipelines.
  • Expertise in cloud platforms (AWS, GCP, Azure) and hosting AI/ML models like a pro.
  • GPU-based inference acceleration experience (CUDA, TensorRT, NVIDIA DeepStream); you make inference fast!
  • Solid background in CI/CD workflows, automated testing, and deploying ML models without a hitch.
  • Real-time inference optimization? Check. Scalable ML infrastructure? Double check.
  • Excellent technical judgment-you can architect a system that works today and evolves tomorrow.
  • You're all about automation-if you can script it, you will.
  • You've got a deep understanding of distributed systems and computing architectures.
  • Self-driven with the ability to work independently and own it.
  • Experience with Kubernetes, Docker, or microservices in general; this is your bread and butter.
  • BS or MS in Computer Science or equivalent, because you've got the education (or experience) to back it up.

Nice to have:

  • Some CUDA programming skills (you've probably dabbled).
  • Experience with LLMs and generative AI models in production.
  • A bit of networking knowledge never hurt anyone.
  • Familiarity with distributed computing frameworks (Ray, Horovod, Spark); you like to go big.
  • Edge AI deployment experience (Triton Inference Server, TFLite, Core ML), because why not push the envelope?

Why Them?

  • Work with top-tier AI experts in a fast-growing startup.
  • Flexibility: work from anywhere, anytime, as long as you get stuff done.
  • Competitive salary plus benefits (obviously).
  • Learning culture: they provide opportunities to grow and expand your skills.
  • A work hard, play hard culture, because who says you can't have fun while crushing it?
