Senior Backend Engineer - Catalog
DataHub · Palo Alto, California, United States · $225k – $300k · Posted 9 March 2026
Job Description
DataHub is an AI Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Innovated jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context of AI and data assets with best-in-class scalability and extensibility.
The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI and data to work together and bringing order to data chaos.
The Challenge
As AI and data products become business-critical, enterprises face a metadata crisis:
No unified way to track the complex data supply chain feeding AI systems
Engineering teams struggling with data discovery, lineage, and governance
Organizations needing machine-scale metadata management, not just human-browsable catalogs
Why This Matters
This is where infrastructure meets impact. The metadata layer you'll build will directly power the next generation of AI systems at massive scale. Your code will determine how safely and effectively thousands of organizations deploy AI, affecting millions of users worldwide.
The Role
We're looking for an exceptional backend engineer to lead development of DataHub's Platform framework, the core that connects diverse data systems and powers our metadata collection capabilities.
You'll Build
Scalable, fault-tolerant ingestion systems for enterprise-scale metadata
Clean, intuitive APIs for our connector ecosystem
Event-driven architectures for real-time metadata processing
Schema mapping between diverse systems and DataHub's unified model
Versioning systems for AI assets (training data, model weights, embeddings)
You Have
4+ years building production-grade distributed systems
Advanced Python and API design expertise
Experience with high-scale data processing or integration frameworks
Strong systems knowledge and distributed architecture experience
Proven track record solving complex technical challenges
Built and maintained online applications serving live traffic at scale (100+ QPS)
Set up monitoring and alerting for services
Designed indexing, storage, and data architectures to make large-scale data accessible to online services
Designed and scaled distributed systems
Hands-on experience developing in a tight loop with LLMs and applying best practices for scalable LLM development
Languages
One of Java/Scala/Kotlin/C#/Go - very strong nice-to-have / borderline must-have
Python/TypeScript/Node.js - nice-to-have
Technical Skills
AWS
Kubernetes/Docker
CI/CD deployment pipelines
Microservice Architecture
Bonus Points
Experience with DataHub or similar metadata/ETL frameworks (Airflow, Airbyte, dbt)
Open-source contributions
Experience building and maintaining services that make calls to LLMs in order to serve live traffic
Experience fine-tuning LLM-powered applications exposed to end users
Early-stage startup experience
Location and Compensation
Bay Area (hybrid, 3 days in Palo Alto office)
Salary Range: $225,000 to $300,000
Benefits and Perks
We invest in people so they can do their best work and enjoy doing it. Our benefits reflect the way we build: practical, thoughtful, and designed to support long-term growth.
Competitive compensation
We offer salaries that reflect your skills, experience, and the impact you make. You bring value, and we make sure you're recognized for it.
Equity for everyone
Every team member receives an ownership stake in the company. When we grow, you grow with us.
Remote Work
All roles are remote unless otherwise specified in the job description. Review the job description to confirm if the role you are interested in is remote or hybrid.
Location flexibility
Home office, coworking space, or something in between? We s ...