Distinguished Engineer - Inference Serving Network and Storage

Graphcore
Austin, Texas, United States
Posted 3 April 2026

Job Description

About us

Graphcore is a globally recognized leader in Artificial Intelligence computing systems. The company designs advanced semiconductors and data center hardware that provide the specialized processing power needed to drive AI innovation, while delivering the efficiency required to support its broader adoption. As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies.

Job Summary

We are seeking a Distinguished Engineer to lead the networking and storage architecture for a new inference serving initiative. This is a chief technologist role for the serving fabric and data path, responsible for defining and driving the end-to-end strategy for networking, storage, observability, provisioning, and automation in support of large-scale AI inference services. You will shape core technical decisions that directly influence product capability, service differentiation, and competitive advantage.

On the networking side, you will lead the design of the serving fabric, inter-partition latency path, management network, QoS and transport tuning, segmentation, observability, and automation. On the storage side, you will define the architecture for model artifact storage, checkpoint distribution, KV and session tiering and restore, telemetry and log storage, and backup and disaster recovery. Storage is expected to be a critical component of inference serving at scale, particularly for KV cache management, state movement, and service resiliency. You will therefore set technical direction across both networking and storage domains as first-class pillars of the platform.

This is a Grade 7 role for a recognized expert and thought leader who can convert strategic thinking into tangible group-level impact, lead a small team, and exert influence across functions and external partners.
The Team

You will be part of the System Engineering group and work across organizational boundaries with ML software, applied AI, hardware and systems, inference service teams, and other platform and infrastructure groups. You will also engage closely with external partners responsible for key elements of the inference service stack, as well as business counterparts who depend on differentiated service capabilities, reliability, and scale.

This role requires strong technical leadership without relying solely on formal authority. You will be expected to align stakeholders, make architectural trade-offs clear, and drive execution across multiple teams while raising the technical bar for the broader organization.

Responsibilities and Duties

- Define and coordinate the networking architecture for inference serving, including serving fabric build, inter-partition latency path optimization, and management network architecture.
- Lead the strategy for QoS, transport tuning, traffic isolation, segmentation, and service differentiation to support multiple inference SLAs and workload classes.
- Drive the build of monitoring, resource prioritization, and automated management frameworks for network and storage systems at production scale.
- Define the storage architecture for model artifact repositories, checkpoint distribution, session state, telemetry and log storage, backup, and disaster recovery.
- Lead the design of KV cache storage, tiering, restore, and movement mechanisms as a core platform capability for large-scale inference serving.
- Optimize network and storage subsystems for demanding AI and HPC workloads, balancing throughput, latency, resiliency, cost, and operational simplicity.
- Work with ML software and inference service teams to develop infrastructure that supports current methods for deploying large language models, including disaggregated prefill/decode paths, continuous batching, and large-model scaling techniques.
- Guide architecture for scaling models that use tensor, pipeline, expert, and other parallelism strategies, ensuring the serving infrastructure ... (truncated, view full listing at source)