Senior ML Ops Engineer

  • September 16, 2025
  • Programming & Tech
  • Full time
  • Remote
  • Entry level
  • 1600 to 2000 USD
FastAPI
Large Language Model (LLM)
Machine Learning
MLOps
ONNX
English advanced
Spanish intermediate

Own model serving for multiple LLM/speech models on Modal. Build and maintain the APIs around those models. Create the feedback/eval loop to improve quality while meeting strict latency/cost SLOs.

Responsibilities

  • Host and scale real-time & batch inference on Modal (autoscaling, images/volumes/secrets).
  • Operate a multi-model fleet (versioning, routing, canaries/blue-green, traffic shaping).
  • Ship endpoints; auth, RBAC, quotas, rate limits, telemetry.
  • Implement feedback pipelines, online A/B evals, and guardrails with actionable alerts.
  • Drive performance: profiling, batching, quantization, KV-cache, runtime tuning.
  • Establish observability and reliability (OTel, metrics/logs, SLOs, runbooks, on-call).
  • CI/CD and IaC for reproducible builds and one-click rollbacks.

Must-haves

  • 5+ years in ML Ops/Platform/SRE with production LLM/ML serving.
  • Strong Python; high-throughput async APIs (FastAPI/Starlette) and GitHub-based CI/CD.
  • Deep experience with vLLM, TensorRT-LLM, Triton, or ONNX Runtime.
  • Hands-on with Modal or equivalent GPU/k8s platform.
  • Solid observability (OTel) and incident response/postmortems.

Preferred

  • ONNX export expertise (PyTorch→ONNX), quantized/dynamic graphs, custom ops.
  • Safety/guardrails and constrained decoding.
  • Systems perf (CUDA/Triton kernels) or Rust for hot paths; load/chaos testing.
