SRE / Observability Engineer at a leading financial services organization, focusing on observability and reliability. The role involves building scalable digital platforms and maintaining system stability and user experience.
Responsibilities
Contribute to the high-level and low-level design (HLD, LLD) and implementation of observability solutions
Build and operate logging, metrics, and distributed tracing systems
Design and maintain monitoring dashboards and alerting strategies
Support incident analysis and root cause investigations
Drive improvements to system reliability using SRE principles
Define and implement observability standards and best practices
Automate monitoring and operational workflows
Collaborate with infrastructure and application teams to improve system visibility and operability
Requirements
Degree in Computer Science or a related field
2–3+ years of experience with modern observability tools (e.g. Prometheus, Grafana, ELK, Dynatrace, Splunk, OpenTelemetry)
2–3+ years of experience in infrastructure or cloud operations (on-prem and/or cloud)
Hands-on experience with containerized and cloud environments (Kubernetes, AWS, Azure)
Strong understanding of SRE principles and a proactive approach to problem-solving
Ability to analyze complex systems and identify patterns across logs, metrics, and traces
Intermediate level of English (technical communication)
Structured thinking, strong communication skills, and a collaborative mindset
Nice to have
Experience in financial or enterprise environments
Familiarity with Agile methodologies
Knowledge of large-scale integration architectures
Experience applying ML/AI in observability use cases
Benefits
Competitive compensation and comprehensive benefits package
Hybrid working model with home office flexibility
Support for professional development and continuous learning
Access to health and sports programs
Opportunity to shape observability strategy in a large-scale environment