Performance Reliability Engineer
Cerebras Systems · Sunnyvale, CA; Toronto, Ontario, Canada · Posted 1 March 2026
Job Description
<div class="content-intro"><p><span data-contrast="none">Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.</span></p>
<p>Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. <a href="https://openai.com/index/cerebras-partnership/">OpenAI recently announced a multi-year partnership with Cerebras</a> to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.</p>
<p>Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.</p></div><div class="elementToProof"><strong>About The Role</strong></div>
<div class="elementToProof">Join Cerebras as a Performance Reliability Engineer within our innovative Co-Design and Next Generation Team. Our groundbreaking CS-3 system has set new benchmarks in high-performance ML training and inference solutions. It leverages a dinner-plate-sized chip with 44GB of on-chip memory to surpass traditional hardware capabilities. This role focuses on characterizing and optimizing the performance and reliability of state-of-the-art AI models running on Cerebras' breakthrough hardware.</div>
<div class="elementToProof"> </div>
<div class="elementToProof"><strong>Responsibilities</strong></div>
<ul>
<li>Characterize and enhance the performance and reliability of advanced ML hardware/software systems, with emphasis on reducing power and thermal fluctuations.</li>
<li>Analyze ML workloads, software kernels, and hardware architecture for power and performance impacts, and synthesize high-level insights across these layers.</li>
<li>Develop creative software solutions to improve reliability and performance, collaborating cross-functionally to deploy these solutions in production.</li>
<li>Influence the design of Cerebras' next-generation AI architecture and software stack through rigorous workload analysis and computational efficiency optimization.</li>
<li>Partner with ML engineers, researchers, and reliability specialists to understand model behavior and drive system-level improvements from a software perspective.</li>
<li>Collaborate with teams in architecture, silicon, and research to advance our computational platforms and influence future system designs.</li>
</ul>
<div class="elementToProof"><strong>Skills and Qualifications</strong></div>
<ul>
<li>BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field.</li>
<li>3+ years of relevant experience in performance engineering, reliability, computer architecture, and/or software design.</li>
<li>Proficiency in Python or other scripting languages.</li>
<li>Experience with C/C++ and assembly programming.</li>
<li>Demonstrated expertise with system-level performance and reliability optimization.</li>
<li>Strong verbal and written communication skills.</li>
<li>Nice to have: Hands-on experience with ML models, ML frameworks, and collective communication.</li>
<li>Nice to have: Understanding of thermal management principles and power delivery for advanced semiconductors.</li>
</ul><div class="content-conclusion"><h4><strong>Why Join Cerebras</strong></h4> ... (truncated, view full listing at source)