Senior Engineer 2: Inference Optimizations

DigitalOcean
Austin · $167k – $209k · Posted 17 March 2026

Job Description

Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you’ll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

DigitalOcean is seeking a Senior Engineer 2 to play a key technical role in our AI Inference Optimization team. DigitalOcean aims to be the Inference Cloud of choice for digitally native companies, and you will help ensure we can offer industry-leading performance for our inference services. You will be responsible for the architectural decisions that maximize throughput and minimize latency for the world’s most advanced large models. As an IC leader, you will act as a force multiplier for the engineering organization, solving the most complex bottlenecks in memory bandwidth and compute utilization while guiding the technical roadmap for our high-performance inference fleet.

What You’ll Do:

Performance Architecture: Lead the technical strategy for benchmarking and performance optimizations at the inference engine and GPU kernel layers, ensuring our infrastructure extracts maximum value from every TFLOP.
Deep-Dive Optimization: Engineer solutions for complex performance issues, including attention layer optimizations, memory and precision management, and advanced parallelization across multi-node GPU clusters.
Technological Innovation: Proactively implement cutting-edge optimization techniques to keep DigitalOcean at the forefront of the Gen AI landscape.

Some examples of projects you may work on:

Improving batch size performance using AMD's AITER library for AMD MI355X - identify and tune AITER's CK (composable kernel) or ASK (assembly) to optimize FP8 / BF16.
Identify kernel fusion opportunities for GLM-5 kernels for different layers of the Transformer block (FlashAttention, RMS Norm).
Tune expert gateway router kernels for MoE models like Qwen3-235B, DeepSeek V3, GLM-5, etc.

Hardware Ecosystem Mastery: Act as the subject matter expert on modern GPU families (NVIDIA/AMD) and their software stacks (CUDA, ROCm, TensorRT, OpenAI Triton), advising on hardware procurement and software integration.
Technical Mentorship: Lead by example through high-quality code and design reviews, elevating the technical bar for the team without the administrative overhead of direct management.
Strategic Collaboration: Partner with Product Management and TPMs to translate "theoretical hardware limits" into "shippable product features," ensuring our platform is both powerful and developer-friendly.
Community Leadership: Maintain a strong presence in the GPU infrastructure and model performance optimization communities, contributing to and integrating the best of open-source AI.

What You’ll Bring to DigitalOcean:

Technical Depth: 5+ years of experience in high-performance computing or AI infrastructure, with a proven track record of solving compute utilization and memory bandwidth bottlenecks.
Gen AI Literacy: Deep familiarity with the Gen AI (LLM, VLM, LMM) landscape, including the specific quirks and architectural requirements of major model families.
Optimization Expert: Hands-on experience with attention-layer optimizations and parallelization strategies across distributed GPU environments.
Hardware Fluency: Comprehensive understanding of NVIDIA and AMD GPU architectures and their respective software ecosystems (CUDA, ROCm, etc.).
Open Source Mastery: Extensive experience integrating, building with, and contributing to open-source software projects.
Systems Design: Excellent system design skills, particularly related to low-level GPU programming - optimization, memory access patterns, and parallel execution.
Leadership through Inf ... (truncated, view full listing at source)