System Monitoring & Observability Engineer at SRT Marine Systems, responsible for implementing user-friendly observability solutions using Prometheus and Grafana across global systems.
Responsibilities
Design, configure, and maintain Prometheus-based monitoring solutions
Develop and manage metric exporters for application and system-level data
Optimise Prometheus scraping configurations and retention policies
Define and maintain alert rules based on SLIs/SLOs and performance baselines
Ensure alerts are actionable, with minimal false positives
Participate in on-call rotations and incident postmortems
Design and maintain Grafana dashboards for real-time operational insights
Collaborate with engineering and product teams to create tailored visualisations
Provide self-service dashboard capabilities for end users
Monitor infrastructure for uptime, latency, and throughput
Identify bottlenecks and recommend improvements
Requirements
Proven experience with Prometheus (including PromQL) and Grafana in production environments
Strong knowledge of Linux-based systems
Experience writing and optimising PromQL queries for alerts and dashboards
Familiarity with exporters (node_exporter, blackbox_exporter, custom exporters)
Understanding of Alertmanager configuration and routing
Proficiency with Grafana dashboard creation and templating
Strong troubleshooting skills for infrastructure and application issues
Familiarity with containers (Docker)
Scripting skills (Bash, Python, or Go) for automation
Benefits
Highly competitive salary
Matched company pension contributions up to 5%
25 days annual leave rising to 28 days with service
Career development opportunities
Company “Get to know you” days
Job title
System Monitoring & Observability Engineer, Prometheus, Grafana
Reverse Engineer at Teller building APIs for connecting apps to users' financial accounts. Help crack mobile banking applications for seamless bank integrations.
Project Engineer supporting construction project teams at Fessler & Bowman. Assisting with project planning, scheduling, and management across multiple construction sites.
Lead Engineer developing AI-powered features for FIS’s cloud-based financial platform, collaborating with teams and mentoring junior engineers for architectural excellence.
Controls Engineer designing and maintaining control systems for manufacturing equipment. Involved in troubleshooting and onsite servicing for optimal operations.
Tier III VTC Engineer providing technical expertise for AT&T at a customer site in Virginia. Responsible for video teleconferencing troubleshooting, installation, and design at various locations.
Lead Knowledge Engineer at S&P Global driving data transformation initiatives. Collaborating with technology teams to implement next-generation data architecture and knowledge management solutions.
Part 21 Electrical / Avionics Engineer at Boeing responsible for compliance with regulatory requirements. Supporting certification of modifications for global airline partners and collaborating with engineering teams.
Engineer designing, developing, and testing nuclear equipment and systems for Navy ships at Newport News Shipbuilding. Collaborating on safety, efficiency, and performance improvements while conducting relevant research and analysis.
Senior Forward Deployed Engineer embedding in strategic aviation operations to drive measurable impact. Working with airlines and MROs while ensuring successful adoption of AI-driven solutions and product enhancements.
Senior Geotechnical Engineer providing technical leadership and developing engineering solutions for mining projects. Collaborating with teams to ensure compliance and excellence in geotechnical engineering.