Senior DevOps Engineer managing monitoring systems for B2B e-commerce platforms in Azure Cloud. Collaborating with teams to improve platform products and processes.
Responsibilities
Shape processes and tools for operating complex e‑commerce environments in the Azure cloud
Support customer operations on our global platform
Design, implement and further develop a monitoring system for technical components as well as business processes in a DevOps environment
Support the operation of various web applications and Java applications
Continuously maintain and improve the monitoring environment: analyze recurring alerts, determine optimal thresholds, aggregate data into dashboards
Model dependencies, derive suitable alerting strategies and automate recurring processes in collaboration with customers and our operations team
Improve the Intershop platform product and related toolchains as part of a Platform Engineering team – for and together with other teams
Communicate with internal customers and partners to improve the monitoring strategy
Requirements
Good to very good knowledge of the Linux operating system and Kubernetes
Good knowledge of IaaS infrastructure services and strong analytical skills
Hands-on experience with cloud observability platforms such as New Relic, Datadog or similar services
Experience with CI/CD tools such as Azure DevOps, GitLab, GitHub Actions
Experience with distributed version control systems (Git)
Good written and spoken English skills
Experience operating Java applications and/or experience in configuration management and infrastructure automation (e.g. Terraform) is a plus
Experience with OpenTelemetry and Jaeger tracing as well as scripting languages is a plus
Benefits
Flexible working hours and the option to work from home
30 days of vacation
Subsidy for a public transport ticket, a company bike (Jobrad) or free employee parking
Two major company parties per year plus monthly leisure events
Weekly yoga class and other ways to unwind: company gym, table soccer, table tennis, darts and our sports groups
Free coffee, tea, cocoa, regional apple juice, mineral water from a dispenser and fresh fruit
Personal and professional development through internal and external training & coaching