Lead Site Reliability Engineer, Factory Software

Tesla Motors, Inc.
200,000 - 300,000 USD
paid holidays, flex time, 401(k)
United States, California, Fremont
Jun 03, 2026
What to Expect
The Factory Software team at Tesla is building critical applications to enable manufacturing and warehouse management with a strong emphasis on reliability, availability, scalability, speed, and security. We are a diverse, cross-functional team of Controls Engineers, Software Engineers, SREs, and other disciplines working on automated manufacturing and warehouse processes.
This is a technical leadership role. As the Lead Site Reliability Engineer, you will be the primary technical owner and leader for the Factory Software team's reliability, observability, and infrastructure strategy. You will combine deep hands-on engineering with leadership to set technical direction, raise the bar on engineering practices, and ensure the full stack - from Kubernetes clusters and databases to factory-facing applications - is highly reliable, observable, and performant.

What You'll Do
  • Provide technical leadership and set the vision for observability, reliability, and platform standardization across the Factory Software team
  • Design and implement end-to-end observability and telemetry solutions (OTEL, Prometheus, Grafana, Tempo, etc.) while mentoring the team on best practices
  • Own the reliability of the full stack: Kubernetes infrastructure, virtual machines, databases, and the middleware applications connecting PLCs, MES systems, and other factory services
  • Define and drive SLIs, SLOs, error budgets, and golden signals across services
  • Lead major initiatives to eliminate speed bottlenecks, database contention, and infrastructure issues through proactive monitoring and automation
  • Write production-grade code and build tools to reduce toil and improve deployment, monitoring, and operational workflows
  • Participate hands-on in on-call rotations, live troubleshooting during outages (NOC bridges), and blameless post-mortems
  • Collaborate closely with Platform Engineering, Infrastructure, Controls Engineering, and Software Engineering teams to embed reliability and observability into architecture and development practices
  • Mentor and coach engineers on technical excellence, observability, Kubernetes, Linux, networking, and reliable system design
  • Drive continuous improvement in incident response, system performance, and engineering standards across the team

What You'll Bring
  • 7+ years of experience in Site Reliability Engineering, Platform Engineering, or related systems roles, with significant hands-on experience at scale
  • Strong technical expertise in Kubernetes, Docker, Linux administration, and networking (routing, VLANs, firewalls, load balancers)
  • Deep experience with observability tools and concepts (Prometheus, Grafana, Tempo, OTEL, Splunk, etc.)
  • Proven track record of designing and implementing reliable, observable distributed systems
  • Proficiency in at least one high-level language (Go, Python, or Java) with experience writing production-grade code
  • Demonstrated ability to lead technical initiatives and raise the engineering bar without formal people management authority
  • Experience with on-call rotations, incident command, and driving reliability improvements through blameless post-mortems
  • Strong bias for action, excellent communication skills, and a desire to mentor and uplift other engineers
  • Experience in manufacturing, industrial automation, or complex operational environments is a strong plus

Compensation and Benefits
Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits at day 1 of hire:

  • Medical plans, including options with a $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans, both with options that have a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the High-Deductible medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid Basic Life and AD&D insurance
  • Short-term and long-term disability insurance (90-day waiting period)
  • Employee Assistance Program
  • Sick and vacation time (flex time for salaried positions, accrued hours for hourly positions), and paid holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program

Expected Compensation
$200,000 - $300,000/annual salary + cash and stock awards + benefits

Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.
