Senior Platform Engineer contributing to observability for Kraken's energy management platform. Seeking a candidate based in Tokyo, or working remotely within Japan, for a transformative role in the energy sector.
Responsibilities
Support and implement monitoring and alerting strategy across Kraken’s customer business
Define and uphold observability best practices across multiple products and platforms
Partner with product teams to implement observability tooling and improve reliability across the organisation
Help product teams build best-in-class dashboards for their requirements or bespoke use cases
Work with product teams to define and implement meaningful Service Level Objectives (SLOs) and Service Level Indicators (SLIs), aligned to contractual Service Level Agreements (SLAs)
Build, tune, and continuously improve alerts and monitors using golden signals (latency, traffic, errors, saturation) as a framework - reducing noise and increasing actionable signal
Help product teams transition to on-call models by improving signals, alert quality, and operational readiness
Improve tooling and self-service capabilities for alerting and monitoring across multiple product teams
Analyse incident metrics to identify trends and improvement opportunities, communicating insights clearly back to product teams
Manage the cost and usage of our observability tooling stack in collaboration with FinOps
Contribute to broader platform reliability infrastructure improvements where needed
Help solve interesting and difficult problems - there’s a significant opportunity for disruption in the global energy market
Requirements
Solid hands-on experience across our core platform stack:
AWS (supporting and improving cloud infrastructure used by product teams)
Terraform (infrastructure as code; comfortable operating with Terraform day-to-day)
Kubernetes (container orchestration and deployment management; comfortable working with Kubernetes day-to-day)
Experience using industry-standard observability tooling - we use Datadog, Grafana, Prometheus and Rootly (experience with other monitoring/alerting platforms is transferable)
Strong collaboration and communication skills - able to work effectively with developers, product managers, and other stakeholders to design and deliver impactful observability "golden paths" and monitoring experiences
Exposure to Python (or a similar language such as TypeScript, Go, or C#) - able to understand how applications behave in production to support observability and reliability improvements
Previous experience working in small, highly autonomous teams
Comfortable with ambiguity and able to create structure in unclear situations
Proactive learning mindset (experiment, iterate, and adapt as the team evolves approaches)
Strong asynchronous written communication (Slack/Notion/docs) and a habit of keeping others in the loop
Autonomy and accountability - making progress independently and owning outcomes