Site Reliability Engineer at McKesson focusing on reliability, scalability, and performance of healthcare technology systems. Engaging in automation and monitoring to deliver exceptional user experiences.
Responsibilities
Design, implement, and maintain robust and scalable infrastructure and applications
Develop and implement automation scripts, tools, and processes to streamline operational tasks
Establish and maintain comprehensive monitoring, alerting, and logging systems
Participate in on-call rotations and respond to and resolve critical incidents
Collaborate with development teams to analyze system capacity and optimize resource utilization
Work closely with software engineers, product managers, and other SREs to promote a culture of reliability
Create and maintain clear and concise documentation for systems, processes, and incident runbooks
Contribute to the implementation and enforcement of security best practices
Requirements
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
2+ years of experience in a Site Reliability Engineering, DevOps, or highly related software engineering role
Strong proficiency in at least one scripting language (e.g., Python, Go, Ruby, Bash)
Hands-on experience with cloud computing platforms (e.g., AWS, Azure, GCP)
Experience with container technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes)
Familiarity with Continuous Integration and Continuous Delivery (CI/CD) pipelines and tools
Experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, Splunk)
Strong understanding of Linux/Unix operating systems
Fundamental understanding of networking concepts (TCP/IP, DNS, HTTP, Load Balancing)
Excellent analytical and problem-solving skills with a proactive approach
Deployment Engineer at WRITER architecting AI solutions for enterprise customers. Collaborating with cross-functional teams to deliver impactful technologies and drive business outcomes.
DevSecOps Engineer utilizing open-source frameworks and collaboration to address client challenges at Booz Allen. Delivering user-oriented solutions consistently while mastering new tools and techniques.
DevOps Engineer designing and implementing CI/CD pipelines and supporting cloud-based solutions at eInfochips. Collaborating with QA and Engineering teams for release readiness.
DevOps Engineer III providing L3 support for Operations across Edge/on-prem and cloud environments. Building automations and handling incidents for customer deployments.
SRE leading reliability and operational excellence at a mortgage tech platform. Designing systems, tooling, and processes for managing Pylon's production systems in Palo Alto.
Senior Build & Release Engineer at GXO Logistics responsible for CI/CD solutions and build automation across various environments. Collaborating with teams for smooth software deployments and mentoring staff.
Senior Site Reliability Engineer improving the reliability of Acuity’s cloud services. Collaborating across teams to define observability standards and incident response in the Cork Digital Centre of Excellence.
Azure Senior DevOps Engineer supporting critical cloud systems in the Azure Government Cloud environment. Leading CI/CD pipeline design and implementation with operational best practices.
Automation Engineer enhancing infrastructure and automating operations for client systems. Working in a complex environment oriented towards automation, security, and performance.
Graduate Reliability Engineer at GKN Aerospace enhancing operational excellence through data analysis and project participation within large structural assemblies.