Lead Inference Platform Support Engineer - AI I
Thomson Reuters · Remote · Posted 5 May 2026
Job Description
New Position: This position is open to fill an existing vacancy and support our evolving business needs.
Thomson Reuters is seeking a Lead Inference Platform Engineer. This role is for someone with specialized experience in machine learning/deep learning domains such as model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, or SW/HW co-design.
About the Role
As a Lead Inference Platform Engineer, you will:
Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak traffic hours
Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
Integrate models into production-grade APIs supporting TR products and enterprise workflows
Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
Build and optimize containerized inference pipelines using Kubernetes for large-scale distributed workloads
Ensure compliance with TR’s AI standards for deployment, monitoring, governance, and drift detection
Profile inference performance, identify GPU/CPU bottlenecks, and optimize compute utilization across heterogeneous hardware
Implement observability and health monitoring for inference pipelines, ensuring reliability of enterprise AI services
Collaborate with platform teams to enhance capacity forecasting for AI workloads
Work with Product, Data Science, Architecture, and Enterprise AI teams to onboard new research models into production
Collaborate closely with AI engineers to invent new quantization techniques, improve numerical precision, and explore non-standard architectures
Partner with Cloud Engineers (Azure, AWS, GCP) to develop guardrails and automation that support inference workloads
Support the scale-out of AI infrastructure during critical releases and global product rollouts
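To give a concrete sense of the quantization work named above, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. This is a generic illustration of the technique, not TR's actual pipeline; production stacks layer per-channel scales, calibration data, and hardware-specific kernels on top of this basic idea.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: scale floats into [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]

# Toy weight vector (illustrative values only)
weights = [0.82, -1.27, 0.03, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Round-to-nearest bounds the error by half a quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
assert max_error <= scale / 2 + 1e-9
```

The same scale-and-round idea underlies the int8 paths in TensorRT and ONNX Runtime; the engineering effort in this role lies in choosing scales that preserve model accuracy while hitting hardware throughput targets.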
About You
You are a potential fit for the Lead Inference Platform Engineer role if your background includes:
Required Skills & Qualifications
Strong understanding of ML/LLM fundamentals and inference optimization techniques
Hands-on experience with GPU programming (CUDA preferred), inference runtimes (TensorRT, ONNX Runtime), and deep learning frameworks (PyTorch/TensorFlow)
Proficiency in Python and at least one systems language (C strongly preferred for performance-critical inference paths)
Experience deploying AI workloads to AWS/GCP/Azure and Kubernetes
Familiarity with vector search systems (OpenSearch vectors) and retrieval-augmented generation pipelines
Knowledge of distributed systems, microservices, CI/CD, and cloud-native architecture
Experience with neural network architectures, such as CNNs, transformers, and diffusion models, and their performance characteristics
Understanding of GPUs, multithreading, and/or other accelerators with vectorized instructions
Specialized experience in one or more of the following machine learning/deep learning domains: model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, and SW/HW co-design
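The vector search and RAG familiarity listed above boils down to ranking document embeddings by similarity to a query embedding. The sketch below is a toy stand-in using exact cosine similarity; the document names and vectors are invented for illustration, and real systems such as OpenSearch k-NN use approximate indexes over much higher-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, documents, k=2):
    """Return the k document ids most similar to the query embedding."""
    scored = sorted(documents.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Invented 3-dimensional embeddings for illustration
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]
assert top_k(query, embeddings) == ["doc_a", "doc_b"]
```

In a RAG pipeline, the returned documents would be stuffed into the LLM prompt as context before the inference call, which is where the retrieval step meets the serving work described in this role.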
Preferred Qualifications
3 years of production experience deploying ML/LLM models at scale
Experience managing GPU fleets or inference clusters across public cloud and container platforms
Experience supporting enterprise-grade AI workloads in regulated or compliance-heavy environments