Lead Observability Engineer shaping observability practices within Kobie's new Tech Hub in India, collaborating on global projects while improving system reliability and performance visibility.
Responsibilities
Own and evolve the observability platform (e.g., New Relic) to provide end-to-end visibility across applications and infrastructure
Establish standards for monitoring, alerting, dashboards, and telemetry (logs, metrics, traces)
Leverage AIOps capabilities to improve anomaly detection, reduce noise, and accelerate root cause analysis
Drive automation and self-healing workflows to minimize manual intervention and improve system resilience
Collaborate across teams to ensure systems are observable by design and aligned with reliability goals
Continuously analyze system behavior and incident patterns to improve performance, scalability, and uptime
Requirements
8–10+ years of experience in observability, site reliability engineering (SRE), DevOps, or advanced production operations in large-scale enterprise environments.
Expert-level hands-on experience implementing and optimizing observability platforms such as New Relic, Datadog, Dynatrace, or Splunk.
Strong understanding of monitoring fundamentals including logs, metrics, traces, and alerting strategies.
Experience working with cloud-native architectures (AWS preferred).
Familiarity with containerized environments and orchestration platforms such as Kubernetes.
Experience integrating observability practices into CI/CD pipelines to ensure applications are observable by design.
Strong understanding of incident management, problem management, and change management practices (ITIL concepts).
Demonstrated ability to analyze telemetry data to identify patterns, detect anomalies, and improve operational reliability.
Strong leadership and collaboration skills with the ability to coordinate across engineering, DevOps, and operations teams.
Excellent communication skills and a strong focus on operational excellence and continuous improvement.
Nice to Have
Experience implementing AI/ML capabilities within observability tools for anomaly detection and predictive monitoring.
Familiarity with AIOps platforms and automated remediation workflows.
Experience with event streaming platforms such as Kafka for telemetry ingestion or real-time data processing.
Basic understanding of application architecture and troubleshooting distributed systems.
Experience with automation frameworks or serverless workflows (e.g., AWS Lambda, scripting, or infrastructure automation).