Senior AI Scientist
YouCom · San Francisco (Remote) · Posted 11 February 2026
Job Description
<div class="content-intro"><p><span class="AnswerParser_TextContainer__z_Iiv" data-testid="youchat-text"><strong>you.com is an AI-powered search and productivity platform designed to empower users with personalized, efficient, and trustworthy search experiences.</strong> As a cutting-edge technology company, we combine advanced AI models with user-first principles to deliver tools that enhance discovery, creativity, and productivity. At you.com, we are on a mission to create the most helpful search engine in the world—one that prioritizes transparency, privacy, and user control. </span></p>
<p><span class="AnswerParser_TextContainer__z_Iiv" data-testid="youchat-text">We’re building a team of innovators, problem-solvers, and visionaries who are passionate about shaping the future of AI and technology. </span><span class="AnswerParser_TextContainer__z_Iiv" data-testid="youchat-text"><strong>At you.com, you’ll have the opportunity to work on impactful projects, collaborate with some of the brightest minds in the industry, and grow your career in an environment that values creativity, diversity, and curiosity.</strong>&nbsp;If you’re ready to make a difference and help us revolutionize the way people search and work, we’d love to have you join us!</span></p>
</div><h4 class="AnswerParser_AnswerParserH3__Mpe4s"><span style="text-decoration: underline;">About the Role</span></h4>
<p>We're hiring a Senior AI Scientist to lead the development of novel evals methodologies and customer-facing evaluation research. You'll own the full loop: from identifying gaps in how we evaluate AI quality, to inventing new evals approaches, to deploying them in customer engagements and competitive analyses. This role sits at the center of how we understand and improve our AI systems. You'll work directly with customers to understand their unique quality requirements, design evals that capture what matters, and create reusable evaluation frameworks that scale across our customer base. You'll also contribute to our evals research agenda, publishing work on evaluation methodologies for agents, RAG systems, and search-augmented AI. The ideal candidate brings both a researcher's rigor and a practitioner's pragmatism: someone as comfortable writing papers on evals methodology as explaining evaluation trade-offs to enterprise customers on sales calls.</p>
<h4 class="AnswerParser_AnswerParserH3__Mpe4s"><span style="text-decoration: underline;">Responsibilities</span></h4>
<ul class="AnswerParser_AnswerParserUnorderedList__P_1FW">
<li class="AnswerParser_ListItem__XqLOV"><em><strong>Define and own what “good” means</strong> for search-augmented and agentic AI systems by designing evaluation frameworks that measure real-world quality, reliability, and user-relevant behavior beyond standard benchmarks.</em></li>
<li class="AnswerParser_ListItem__XqLOV"><em><strong>Invent and validate novel evaluation methodologies</strong> for non-deterministic systems (LLMs, agents, RAG), including behavioral evals, long-tail and adversarial test sets, and task-specific metrics.</em></li>
<li class="AnswerParser_ListItem__XqLOV"><em><strong>Develop rigorous statistical frameworks</strong> for model comparison, regression detection, and uncertainty estimation, ensuring evaluation results are defensible and decision-ready.</em></li>
<li class="AnswerParser_ListItem__XqLOV"><em><strong>Build and maintain scalable evaluation s ...</strong></em> (truncated, view full listing at source)</li>
</ul>