Member of Technical Staff - Multi-Modal, Vision
Liquid AI · Research & Engineering · Posted 24 February 2026
Job Description
About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

The VLM team builds vision-language models that run on-device, under tight latency and memory constraints, without sacrificing quality. We have released four best-in-class models, and we're just getting started.

This team owns the full VLM pipeline end to end: from researching new architectures and training algorithms through data curation, evaluation, and deployment. You'll join a focused, hands-on group that works directly on models and collaborates closely with our pretraining, post-training, and infrastructure teams. Success here is measured by the capability of the models we ship.

Minimum qualifications:
- Hands-on experience training or evaluating VLMs, with demonstrated experimental rigor.
- Ability to turn research ideas into scalable implementations and to refine and iterate on hypotheses.
- Proficiency in Python and at least one deep learning framework.
- M.S. or Ph.D. in Computer Science, Mathematics, or a related field; or equivalent industry experience.

This role is for you if you have experience in some of the following:
- Building or optimizing multimodal training or data pipelines.
- Distributed training (DeepSpeed, FSDP, Megatron-LM, etc.).
- Multimodal post-training (SFT, preference optimization, RL-style methods).
- Dataset design and data quality (quality and diversity assessment, long-tail mining).
- Prior open-source contributions (code, data, models) on GitHub or Hugging Face.
- Published research at top AI conferences (NeurIPS, ICML, CVPR, ECCV, ICLR, ACL, etc.).
- Computer vision or visual representation learning.

What working here might look like:
- Lead a new model capability end-to-end, from task spec through data curation, training recipe, ablations, and evaluation, into the final shipped model.
- Improve visual reasoning through reinforcement learning and preference optimization methods.
- Push the quality-efficiency frontier on token efficiency via encoder/connector design. Exemplary outcome: a connector that cuts vision tokens without quality loss.

What Success Looks Like (Year One):
- The VLM models we ship are state-of-the-art.
- You own a major workstream (for instance, video understanding, preference data quality, or encoder architecture) end-to-end.
- At least one model has shipped to production with your direct contribution.

What We Offer:
- Full ownership: you own your work from architecture to deployment.
- Compensation: competitive base salary with equity in a unicorn-stage company.
- Health: we pay 100% of medical, dental, and vision premiums for employees and dependents.
- Financial: 401(k) matching up to 4% of base pay.
- Time Off: unlimited PTO plus company-wide Refill Days throughout the year.
Apply Now