Site Reliability Engineer maintaining Taco Bell's Smarthub technology platform. Troubleshooting store issues and enhancing customer experience through innovative solutions.
Responsibilities
Troubleshoot and analyze store-level issues.
Conduct production validation tests for deployments.
Document processes, tools, and known solutions.
Participate in troubleshooting bridges for problem records.
Communicate findings clearly during issue investigation.
Analyze ingested metrics to identify store- or platform-level issues.
Implement monitoring and alerting.
Participate in sprint planning, design, operations and deployment meetings.
Serve as SRE liaison for Platform, Service Desk and Proactive teams.
Support vendor NextGen projects and platform upgrades.
Maintain vendor build servers for Smarthub in the Taco Bell lab.
Validate and coordinate resolutions across teams.
Support existing tools.
Apply technical knowledge and learning to improve the tooling.
Initiate and work on projects that provide value to Engineering, SRE, or SD teams.
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field.
1–3 years of experience in IT, systems engineering, DevOps, or technical support.
Experience with containerized platforms, APIs and microservices, and the software development life cycle.
Practical knowledge working with Linux systems.
Familiarity with observability platforms such as Datadog.
Experience with automation and basic scripting using Bash or Python.
Solid understanding of system monitoring principles.
Strong analytical and problem-solving abilities.
Demonstrated ability to learn rapidly and adapt within fast-paced environments.
Strong attention to detail.
Demonstrates curiosity and initiative in learning.
Communicates effectively with peers and cross-functional teams.
Shows ownership and follow-through on assigned tasks.
Benefits
Hybrid work schedule and year-round flex day Friday
Onsite childcare through Bright Horizons
Onsite dining center and game room (yes, there is a Taco Bell inside the building)
Onsite dry cleaning, laundry services, carwash
Onsite gym with fitness classes and personal trainer sessions
Up to 4 weeks of vacation per year plus holidays and time off for volunteering
Tuition reimbursement and education benefits
Generous parental leave for all new parents and adoption assistance program
401(k) with a 6% matching contribution from Yum! Brands with immediate vesting
Comprehensive medical & dental including prescription drug benefits and 100% preventive care
Discounts, free food, swag and… honestly, too many good benefits to name