Observability Platform Engineer at Amex GBT designing observability platforms using tools like ELK Stack and New Relic. Collaborating with teams to enhance system reliability and performance metrics.
Responsibilities
Design and deploy observability platforms using industry-leading tools such as ELK Stack, New Relic, Datadog, and AlertSite
Develop and maintain monitoring strategies, dashboards, and alerting rules to ensure system reliability and performance
Collaborate with engineering teams to instrument applications and infrastructure for comprehensive observability
Troubleshoot complex system issues using observability data and provide actionable insights
Establish best practices for logging, metrics collection, and distributed tracing
Optimize observability infrastructure for cost-efficiency and performance
Conduct training and knowledge-sharing sessions with development and operations teams
Participate in on-call rotations and incident response activities
Continuously evaluate and recommend new observability tools and technologies
Requirements
5+ years of experience in platform engineering, DevOps, or systems engineering roles
Hands-on expertise with at least two of the following platforms: ELK Stack, New Relic, Datadog, or AlertSite
Strong understanding of monitoring, logging, metrics, and alerting concepts
Proven experience creating and maintaining monitoring dashboards and visualizations
Hands-on experience implementing synthetic monitoring, end-to-end transaction monitoring, Application Performance Monitoring (APM), and Real User Monitoring (RUM), including digital, browser, and mobile app observability
Knowledge of SLI/SLO definition and measurement methodologies
Familiarity with MTTA, MTTR, MTTD, and other incident metrics
Proficiency in scripting languages (Python, Bash, or similar)
Experience with cloud platforms (AWS, Azure, or GCP)
Knowledge of containerization and orchestration technologies (Docker, Kubernetes)
Lead Platform Engineer at Capital One driving transformation in technology and solutions with Agile practices and DevOps tools. Collaborating on complex technical problems in a fast-paced environment.
Data Platform Engineer managing daily operations of data platforms for a global cybersecurity company. Collaborating with teams to ensure platform reliability and performance.
Senior Platform Engineer focused on building internal platform capabilities for developer tooling and experience at MONY Group. Collaborating with teams to enhance platform engineering and software delivery.
Databricks Platform Engineer working on AWS ecosystem design, build, and optimization. Responsible for implementing scalable pipeline solutions across data platforms.
Senior Data & Platform Support Engineer supporting Oracle databases at the Federal Reserve Bank. Collaborating with teams to ensure operability of payment systems and enhance business outcomes.
IT Project Manager managing diverse projects at Fidelity, focusing on architecture and data solutions. Leading delivery teams in technology initiatives that enhance existing systems.
Data Platform Engineer transforming operational data into clean, analysis-ready datasets for Versana's platform. Collaborating within a team to implement data engineering practices and ensure data quality standards.
Head of Platform Engineering overseeing SaaS platform development for mobility payments in Denmark. Leading a talented team in a hybrid work environment with a focus on infrastructure and support.