Infrastructure Software Engineer

Etched
San Jose
Posted 27 March 2026

Job Description

About Etched

Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.

Job Summary

Building cutting-edge model-specific ASICs requires crafting custom infrastructure and toolchains to support ultra-fast, reliable, and scalable development across the stack, from simulation to silicon. We build this infrastructure as software, and we engineer it with the same best practices we apply to our products: the same rigor, design discipline, quality standards, and testing that go into our ASIC, software, and platform.

You will lead the development and adoption of next-generation infrastructure tooling, enabling Etched ASIC, Software, and Platform engineers to iterate faster, build more reliably, and push the boundaries of AI performance. This includes building and scaling our hybrid high-performance compute (HPC) cluster, optimized for massively parallel CI, EDA workflows, emulation, and hardware-aware job execution. You’ll also architect and implement a state-of-the-art observability stack with LLM integration and a strong emphasis on streaming health and performance telemetry, log aggregation, distributed tracing, insight generation, synthetic testing, and smart alerting across CI pipelines, simulation clusters, and service endpoints.

This role demands a strong software engineering mindset, quality instincts, and a deep understanding of systems. It’s not just about writing scripts; it’s about writing code that builds and manages infrastructure with precision, repeatability, and intent.

Key responsibilities

- Design and build the orchestration layers that drive our hybrid high-performance clusters, enabling simulation, synthesis, and continuous integration of AI ASICs at unprecedented scale.
- Develop and maintain a fully programmable infrastructure control plane to ensure reproducibility, auditability, and rapid iteration across the entire stack.
- Create tools and abstractions that empower engineers to harness massive parallelism without worrying about the underlying complexity.
- Prototype and execute workload orchestration and migration strategies between on-premise and cloud environments, balancing performance, storage availability and replication, uptime, and cost across heterogeneous hardware and compute backends.
- Implement real-time telemetry and tracing systems that surface insights from millions of metrics, enabling proactive debugging and system optimization.
- Build a full observability stack that includes dashboards, alerting, automated responses, and a synthetic testing framework that exercises infrastructure performance and reliability across application and data flows, so that we stay ahead of issues affecting development and productivity workflows.

Representative projects

- Design and deploy a fully automated, scalable hybrid HPC cluster, combining bare-metal servers and switches with cloud instances, provisioned through MaaS and orchestrated via SLURM and Kubernetes, optimized for mixed EDA workloads and parallel CI pipelines.
- Develop a real-time observability system for ASIC toolchain jobs and distributed builds, integrating Prometheus, Grafana, and VictoriaMetrics with streaming telemetry, tracing, and alerting to detect performance regressions before they hit silicon.
- Architect and implement a programmable infrastructure-as-code control plane, using Terraform, Ansible, and Puppet, to version, audit, and redeploy every layer of Etched’s development stac ... (truncated, view full listing at source)