Lead Inference Platform Support Engineer - AI I
Thomson Reuters · Remote · Posted 5 May 2026
Job Description
New Position: This position is open to fill an existing vacancy and support our evolving business needs.
Thomson Reuters is seeking a Lead Inference Platform Engineer. This role is for someone with specialized experience in machine learning/deep learning domains such as model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, or SW/HW co-design.
About the Role
As a Lead Inference Platform Engineer, you will:
Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak traffic hours
Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
Integrate models into production-grade APIs supporting TR products and enterprise workflows
Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
Build and optimize containerized inference pipelines using Kubernetes for large-scale distributed workloads
Ensure compliance with TR’s AI standards for deployment, monitoring, governance, and drift detection
Profile inference performance, identify GPU/CPU bottlenecks, and optimize compute utilization across heterogeneous hardware
Implement observability and health monitoring for inference pipelines, ensuring reliability of enterprise AI services
Collaborate with platform teams to enhance capacity forecasting for AI workloads
Work with Product, Data Science, Architecture, and Enterprise AI teams to onboard new research models into production
Collaborate closely with AI engineers to invent new quantization techniques, improve numerical precision, and explore non-standard architectures
Partner with Cloud Engineers (Azure, AWS, GCP) to develop guardrails and automation that support inference workloads
Support the scale-out of AI infrastructure during critical releases and global product rollouts
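To give a concrete sense of the quantization work named above, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. This is a generic illustration of the technique, not TR's actual pipeline; production stacks layer per-channel scales, calibration data, and hardware-specific kernels on top of this basic idea.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: scale floats into [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]

# Toy weight vector (illustrative values only)
weights = [0.82, -1.27, 0.03, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Round-to-nearest bounds the error by half a quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
assert max_error <= scale / 2 + 1e-9
```

The same scale-and-round idea underlies the int8 paths in TensorRT and ONNX Runtime; the engineering effort in this role lies in choosing scales that preserve model accuracy while hitting hardware throughput targets.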
About You
You are a potential fit for the Lead Inference Platform Engineer role if your background includes:
Required Skills & Qualifications
Strong understanding of ML/LLM fundamentals and inference optimization techniques
Hands-on experience with GPU programming (CUDA preferred), inference runtimes (TensorRT, ONNX Runtime), and deep learning frameworks (PyTorch/TensorFlow)
Proficiency in Python and at least one systems language (C strongly preferred for performance-critical inference paths)
Experience deploying AI workloads to AWS/GCP/Azure and Kubernetes
Familiarity with vector search systems (OpenSearch vectors) and retrieval-augmented generation pipelines
Knowledge of distributed systems, microservices, CI/CD, and cloud-native architecture
Experience with neural network architectures, such as CNNs, transformers, and diffusion models, and their performance characteristics
Understanding of GPUs, multithreading, and/or other accelerators with vectorized instructions
Specialized experience in one or more of the following machine learning/deep learning domains: model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, and SW/HW co-design
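The vector search and RAG familiarity listed above boils down to ranking document embeddings by similarity to a query embedding. The sketch below is a toy stand-in using exact cosine similarity; the document names and vectors are invented for illustration, and real systems such as OpenSearch k-NN use approximate indexes over much higher-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, documents, k=2):
    """Return the k document ids most similar to the query embedding."""
    scored = sorted(documents.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Invented 3-dimensional embeddings for illustration
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]
assert top_k(query, embeddings) == ["doc_a", "doc_b"]
```

In a RAG pipeline, the returned documents would be stuffed into the LLM prompt as context before the inference call, which is where the retrieval step meets the serving work described in this role.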
Preferred Qualifications
3 years of production experience deploying ML/LLM models at scale
Experience managing GPU fleets or inference clusters across public cloud and container platforms
Experience supporting enterprise-grade AI workloads in regulated or compliance-heavy environments