ML/AI Research Engineer — Agentic AI Lab (Founding Team)
Fabrion · San Francisco Bay Area · Posted 1 April 2026
Job Description
Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)
Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.
About the Role
We’re designing the future of enterprise AI infrastructure — grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance.
We’re looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning — building the intelligence layer that sits on top of our enterprise data fabric.
This isn’t a prompt engineer role. It’s full-cycle ML: from data curation and fine-tuning to evaluation, interpretability, and deployment — with cost-awareness, alignment, and agent coordination all in scope.
Core Responsibilities
- Fine-tune and evaluate open-source LLMs (e.g. Llama 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data
- Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph
- Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
- Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
- Create reinforcement learning and preference-optimization pipelines to optimize agent behaviors (e.g. RLHF, PPO, DPO)
- Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
- Contribute to model observability, drift detection, error classification, and alignment
- Optimize inference latency and GPU resource utilization across cloud and on-prem environments
Desired Experience
Model Training:
- Deep experience fine-tuning open-source LLMs using HuggingFace Transformers, DeepSpeed, vLLM, FSDP, LoRA/QLoRA
- Worked with both base and instruction-tuned models; familiar with SFT, RLHF, DPO pipelines
- Comfortable building and maintaining custom training datasets, filters, and eval splits
- Understand tradeoffs in batch size, token window, optimizer, precision (FP16, bfloat16), and quantization
RAG + Knowledge Graphs:
- Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
- Familiar with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
- Experience grounding models with structured data (SQL, graph, metadata) + unstructured sources
- Bonus: Worked with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems
Agent Intelligence:
- Experience training or customizing agent frameworks with multi-step reasoning and memory
- Understand common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tools
- Familiar with self-correction, multi-agent communication, and agent ops logging
Optimization:
- Strong background in token cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval latency tuning
- Experience running models under quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)
Preferred Tech Stack
- LLM Training & Inference: HuggingFace Transformers, DeepSpeed, vLLM, FlashAttention, FSDP, LoRA
- Agent Orchestration: LangChain, LangGraph, ReAct, OpenAgents, LlamaIndex
- Vector DBs: Weaviate, Qdrant, FAISS, Pinecone, Chroma
- Graph Knowledge Systems: Neo4j, PuppyGraph, RDF, Gremlin, JSON-LD
- Storage & Access: Iceberg, DuckDB, Postgres, Parquet, Delta Lake
- Evaluation: OpenLLM Evals, TruLens, Ragas, LangSmith, Weights & Biases
- Compute: Ray, Kubernetes, TGI, SageMaker, Lambda Labs, Modal
- Languages: Python (core), optionally Rus ... (truncated, view full listing at source)