Machine Learning Engineer, Assessments
Speak · San Francisco · Posted 8 April 2026
Job Description
ABOUT US
Our mission is to reinvent the way people learn, starting with language.
Learning a language can change a life by opening doors to new cultures, careers, and communities. Two billion people around the world are actively trying to learn a language, but the best way to learn (one-on-one tutoring) is hard to access at scale and hasn’t been meaningfully improved in decades. Speak is building a human-level, AI-powered tutor in your pocket: a conversation-first experience that lets learners actually speak, get instant feedback, and progress through carefully designed lessons. The result is a complete path from beginner to confident speaker across multiple languages.
Speak first launched in South Korea in 2019, where it has become the number one language learning app, and we now serve learners across many markets and 15+ languages. Speak is one of the world’s leading AI companies, having raised over $150M in venture investment from OpenAI, Accel, Founders Fund, Khosla Ventures, and more, with a distributed team across San Francisco, Seoul, Tokyo, Taipei, and Ljubljana.
ABOUT THIS ROLE
We’re hiring an ML Engineer, Assessments to help build best-in-class assessment systems across multiple products (Speak for Business, B2C, and new surfaces). You will work in a tight loop with our Assessment Design Lead (Content/Learning Design) and with Machine Learning, Product, and Engineering to turn assessment constructs and rubrics into reliable, scalable scoring and feedback systems.
This role owns the implementation, deployment, and ongoing quality of our assessment algorithms and ML systems. While there is immediate need to improve and expand production assessments, this work is also building a platform capability that can be reused across the app.
WHAT YOU’LL BE DOING
- Ship and own assessment ML systems end-to-end
  - Build, deploy, and maintain scoring models/pipelines (feature extraction → model training → inference → feedback generation)
  - Own monitoring, regression tests, and ongoing iteration to maintain accuracy targets
- Define and operationalize evaluation
  - Implement validation/evaluation frameworks for assessments, including metrics, test sets, and offline/online analysis
  - Translate assessment requirements into measurable acceptance criteria and guardrails
- Partner deeply with the Assessment Design Lead
  - Co-develop the strategy, together with the Content team, to grow assessments into a core platform at Speak
  - Work in a tight weekly loop to deliver incremental improvement
- Drive near-term delivery across products
  - Stand up or improve summative assessments (spoken language ability) and bring them reliably to production
  - Prototype and validate formative assessment approaches to measure improvement over weeks/months
- Support data and labeling strategy
  - Help define data needs for training/evaluation (including psychometric measurement needs)
  - Build or improve pipelines that support label collection and analysis (especially for efficacy studies)
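To give a concrete flavor of the evaluation work described above: automated scoring systems are commonly validated offline by measuring agreement between model scores and human rater scores on an ordinal scale, for example with quadratic weighted kappa (QWK). This is a generic illustrative sketch, not Speak's actual codebase; all names are hypothetical.

```python
# Illustrative sketch of one offline evaluation metric for an automated
# scoring system: quadratic weighted kappa (QWK), a standard measure of
# agreement between model-predicted scores and human rater scores on an
# ordinal scale (e.g., proficiency levels 0..3). Hypothetical example only.

from collections import Counter

def quadratic_weighted_kappa(rater_a, rater_b, num_levels):
    """Agreement between two ordinal ratings in {0, ..., num_levels - 1}.

    1.0 = perfect agreement, 0.0 = chance-level, negative = worse than chance.
    """
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)

    # Observed confusion matrix between the two raters.
    observed = [[0.0] * num_levels for _ in range(num_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Expected matrix under independence, from the marginal histograms.
    hist_a = Counter(rater_a)
    hist_b = Counter(rater_b)

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(num_levels):
        for j in range(num_levels):
            weight = ((i - j) ** 2) / ((num_levels - 1) ** 2)
            expected = hist_a[i] * hist_b[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected

    return 1.0 - numerator / denominator

# Perfect agreement between model and human scores yields kappa = 1.0.
print(quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))  # prints 1.0
```

In practice a metric like this would run inside a regression-test harness over a held-out, human-labeled test set, with acceptance thresholds acting as the "guardrails" the role description mentions.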
WHAT WE’RE LOOKING FOR
- Domain expertise in spoken language proficiency assessment (linguistics, applied linguistics, pedagogy, or equivalent experience)
- Strong experience designing and running evaluation + validation for assessment/scoring systems, and tailoring approaches to a specific product use case
- 4+ years building automatic proficiency assessment systems (or equivalent depth in closely related scoring/evaluation domains)
- PhD is helpful but not required
- Proven ability to ship ML models to production (not only research), including reliability, monitoring, and iteration
- Strong generalist ML/analysis skills (statistics, Python, PyTorch/model training)
- Ability to operate cross-functionally and communicate clearly with non-technical partners (Content/LD, PM, leadership)
NICE TO HAVE
- Experience with speech/audio ML
- Experience with psychometrics concepts (reliability/validity, calibration)
HOW W ... (truncated, view full listing at source)