Senior Reliability Engineer to analyze, design, program, and modify software for database systems at Disney. Building, deploying, and ensuring high availability of database infrastructure.
Responsibilities
Build and deploy all DEEP&T database infrastructure and ensure it is available 24/7/365
Leverage software development and automation to design, modernize, and deliver database infrastructure
Participate in setting the architectural direction for database platforms and projects
Manage multiple competing priorities in a fast-paced, deadline-oriented environment
Analyze, design, and deploy fault-tolerant, distributed, and highly available database infrastructure
Proactively plan and implement infrastructure changes through capacity forecasting, software release cycles, and right sizing
Provide database expertise through performance tuning, troubleshooting, and administration
Develop, enhance, and adhere to engineering and administration standards
Develop automation and tooling to increase operational efficiency while ensuring system reliability and security
Build infrastructure and systems for scalability, resiliency, availability, and recovery through infrastructure as code and configuration management
Provide relevant insights into data store infrastructure through metrics, monitoring, and alerting
Maintain thorough and well-written documentation
Participate in live event support and on-call rotation
May provide oversight and direction to junior team members
Build relationships with engineering teams and leads
Requirements
Bachelor's degree, preferably in Computer Science, Engineering, or a related field (or equivalent experience)
5+ years of related work experience with Microsoft SQL Server, Amazon RDS for SQL Server, Azure SQL, and Azure SQL MI
Fundamental understanding of Microsoft SQL Server database internals
Experience working in Agile software development
Experience with source control management tools (Git, GitLab, GitHub)
Intermediate to advanced level of expertise in one or more programming languages such as Python, Java, or Go
General understanding of and experience with the Windows operating system, networking, and containers
Excellent verbal and written communication skills
Experience designing and deploying fault-tolerant, distributed, and highly available database infrastructure
Experience in database availability monitoring and status reporting using native monitoring tools
Well-versed in SQL Server backup, restore, and recovery strategies
Experience keeping a large environment compliant by deploying SQL Server patches and upgrades
Experience with disaster recovery planning and implementation
Comfortable collaborating with cross-functional teams and providing guidance on SQL Server best practices
Benefits
A bonus and/or long-term incentive units may be provided as part of the compensation package
The full range of medical benefits is offered
Job title
Senior Site Reliability Engineer – Database Engineering