Our client, an international AI development company based in New York, is seeking an LLM Scientist (Multimodal Models & Reinforcement Learning) to lead the development of multimodal foundation models and reinforcement learning pipelines.
This role will focus on expanding AI capabilities through vision-language integration, reward modeling, and optimization techniques such as RLHF, RLAIF, and Direct Preference Optimization (DPO). Experience in model distillation is also key to optimizing performance and scalability.
Key Responsibilities
Multimodal Model Development:
Research and build multimodal foundation models (e.g., vision-language, audio-text)
Implement and evaluate architectures such as CLIP, Flamingo, or GPT-4V
Reinforcement Learning & Preference Optimization:
Design and implement RLHF (Reinforcement Learning from Human Feedback), RLAIF (Reinforcement Learning from AI Feedback), and DPO pipelines
Fine-tune reward models and train policies that align with human or enterprise preferences
Model Compression & Evaluation:
Apply model distillation techniques to reduce model size while preserving performance
Develop benchmarks and metrics for both multimodal and RL-based systems
Governance & Alignment:
Contribute to alignment research, interpretability techniques, and bias mitigation strategies
Ensure ethical and responsible development of deployed AI systems
Qualifications & Skills
Strong experience with multimodal models (CLIP, BLIP, Flamingo, etc.)
Hands-on expertise with RLHF, RLAIF, and DPO
Proven track record in model distillation and efficiency optimization
Proficiency in Python, PyTorch, Hugging Face, and distributed training environments
Passion for advancing the frontier of AI safety and alignment
Strong communication skills, especially for cross-functional collaboration
Self-starter with a research-driven, detail-oriented mindset
Very strong English communication skills, both written and verbal (essential for global collaboration)