SRE leading reliability and operational excellence at a mortgage tech platform. Designing systems, tooling, and processes for managing Pylon's production systems in Palo Alto.
Responsibilities
Own reliability and operational excellence for Pylon's production systems.
Design and implement monitoring, alerting, and incident response processes that scale as we grow.
Build tooling that makes the entire engineering team more effective.
Establish on-call rotations and runbooks.
Ensure our platform can handle the demands of a regulated, high-stakes financial product.
Spend 50%+ of your time writing code: building infrastructure tooling, automating operational burden, improving reliability, and creating productivity tools.
Requirements
4+ years of experience in SRE, infrastructure, or platform engineering roles
Experience working on a team of SREs at a company with mature SRE practices (not solo SRE roles)
Real on-call experience at scale in a large production environment (you've carried the pager and lived through incidents)
Deep AWS expertise (ECS, RDS, networking, security)
Strong experience with declarative infrastructure (Terraform, CDK, or similar)
Nix experience (we use it and want to expand its adoption)
Track record of building reliability tooling and automation
Ability to design and implement monitoring, alerting, and observability systems from first principles
Comfortable working in a regulated environment where "breaking things" is not an option
DevOps/IT Apprentice supporting cloud infrastructure and CI/CD pipelines at a tech startup. Involves learning, taking ownership, and growing within the engineering team.
DevOps Engineer at Cloud++ collaborating on infrastructure and CI/CD pipelines across multi-cloud environments. Engaging with development teams to ensure reliable and secure releases.
Fullstack Developer at Zenika engaging in impactful tech projects like B2B platforms and architecture modernization. Collaborating with senior consultants in a quality-focused environment.
Production Engineer in a hybrid role ensuring operational performance of applications for a strategic international project. Focusing on automation and optimization within a technical environment at EOLEN.
Senior/Expert DevOps Engineer for AI project in pharmaceutical sector at GECI International. Involves designing, deploying, and operating autonomous AI agents.
DevOps Engineer Intern at Emeria Technologies focusing on Cloud infrastructure design and support. Involves maintaining CI/CD platforms and collaborating with DevOps teams for optimization.
Intern supporting software development infrastructure including CI/CD and cloud integration at Intel. Collaborating with teams to optimize development and release processes.
DevOps Engineer at NetBrain responsible for AWS cloud infrastructure and automating processes. Collaborate with development and security teams to deliver secure solutions efficiently.
Site Reliability Engineer at BlaBlaCar improving CI/CD and tooling for developer efficiency and autonomy. Collaborating with engineering teams to enhance service reliability and facilitate software development.
Deployment Engineer at WRITER architecting AI solutions for enterprise customers. Collaborating with cross-functional teams to deliver impactful technologies and drive business outcomes.