Senior Machine Learning Engineer (Large Systems)
Graphcore
Bristol, UK
Posted 21 March 2026
Job Description
About Graphcore
At Graphcore, we’re building the future of AI compute.
We’re a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale.
As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem.
To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world.
We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.
Job Summary
As a Senior Machine Learning Engineer in the Applied AI team at Graphcore, you will contribute to advancing AI technology by developing and optimising AI models tailored to our specialised hardware. You will work on large-scale systems where performance is critical to the success of our projects. Working closely with the Software Development and Research teams, you will play a key role in identifying opportunities to innovate and differentiate Graphcore’s technology. We seek engineers with strong technical skills and an understanding of AI model implementation at scale, eager to make a tangible impact in this rapidly evolving field.
The Team
The Applied AI team’s role is to act as a proxy for our customers. We need to understand the latest AI models, applications and software to ensure that Graphcore’s technology works seamlessly with the AI ecosystem and at scale. We build reference applications, contribute to key software libraries (for example, by optimising kernels for efficiency on our hardware), and collaborate with the Research team to develop and publish novel ideas in domains such as efficient compute, model scaling, and distributed training and inference of AI models for multiple modalities and applications. If you're excited about advancing the next generation of AI models on cutting-edge hardware, we’d love to hear from you!
Responsibilities and Duties
Implement the latest machine learning models and optimise them for performance and accuracy, scaling to thousands of accelerators.
Test and evaluate new internal software releases, provide feedback to software engineering teams, make necessary code fixes, and conduct code reviews.
Benchmark models and key ML techniques to identify performance bottlenecks and improve model efficiency.
Design and conduct experiments on novel AI methods, implement them and evaluate results.
Collaborate with Research, Software, and Product teams to define, build, and test Graphcore’s next generation of AI hardware.
Engage with the AI community and keep up with the latest developments in AI.
Candidate Profile
Essential:
Bachelor's, Master's or PhD, or equivalent experience, in Machine Learning, Computer Science, Maths, Data Science, or a related field.
Proficiency in deep learning frameworks such as PyTorch or JAX.
Strong Python or C++ software development skills.
Expertise in deep learning from model training to optimisation and evaluation.
Experience in distributed training or inference of ML models across 64+ accelerators.
Capable of designing, executing and reporting on ML experiments.
A deep understanding of performance bottlenecks and how to overcome them.
Ability to move quickly in a dynamic environment.
Enjoy cross-functional work and collaborating with other teams.
Strong communicator - able to explain complex technical concepts to different audiences.
Desirable:
Experience in one or more of:
MLOps for Kubernetes-based clusters.
Building production systems with large language models.
Efficient computing based on low-precision arithmetic.
Experience writing C++/Triton/CUDA kernels for performance optimisation of ML models.
Familiarity with HPC systems and networking technologies, including InfiniBand, NVLink and RoCE.
Have contributed to open-source projects or publishe ... (truncated, view full listing at source)