Location: Remote - LATAM
Schedule: Full-time (8 hrs/day); must have 4 hrs of overlap with PST
About the Role
We're looking for a hands-on Machine Learning Engineering Manager to lead cross-functional teams in designing, training, and deploying large-scale ML and LLM systems.
You'll drive the full lifecycle of AI development — from research and experimentation to distributed training and production deployment — while mentoring top-tier engineers and partnering closely with product, research, and infra leaders.
This role blends deep ML/MLOps expertise with strong leadership and execution, ensuring all AI initiatives translate into measurable business impact.
Key Responsibilities
Lead and mentor ML engineers, data scientists, and MLOps professionals.
Manage the end-to-end ML/LLM project lifecycle: data pipelines, training, evaluation, deployment, and monitoring.
Provide technical direction for distributed training, large-scale model optimization, and system architecture.
Collaborate with Research, Product, and Infrastructure teams to define objectives, milestones, and KPIs.
Implement MLOps best practices: experiment tracking, CI/CD, model governance, and observability.
Manage compute resources and cloud budgets, and enforce Responsible AI and data security standards.
Communicate technical progress, blockers, and results clearly to leadership and stakeholders.
Required Skills & Qualifications
5+ years of experience in Machine Learning, NLP, and Deep Learning (Transformers, LLMs).
2+ years leading teams delivering ML/LLM systems in production.
Strong proficiency in Python and frameworks such as PyTorch, TensorFlow, Hugging Face, and DeepSpeed.
Experience with distributed training, GPU/TPU optimization, and cloud platforms (AWS, GCP, Azure).
Knowledge of MLOps tools (MLflow, Kubeflow, Vertex AI, etc.).
Excellent leadership, communication, and cross-functional collaboration skills.
Bachelor's/Master's in Computer Science, Engineering, or a related field (PhD preferred).
Nice to Have
Experience training or fine-tuning foundation models.
Contributions to open-source ML/LLM frameworks.
Knowledge of Responsible AI practices, bias mitigation, and model interpretability.