Site Reliability Engineer at GFT focusing on AWS cloud environments, SRE practices, and system reliability. Collaborating on cloud migrations and observability initiatives.
Responsibilities
Work on Site Reliability Engineering (SRE) initiatives and activities in AWS cloud environments;
Define and track Service Level Agreements (SLAs), Service Level Indicators (SLIs) and performance metrics;
Contribute to the expansion and evolution of SRE practices across the organization;
Assess environment maturity and propose optimization strategies and process improvements;
Monitor technical and business metrics and indicators to ensure availability, resilience and performance of IT services;
Participate in modernization projects and migrations of environments to the cloud;
Design solutions and architectures focused on Observability.
Requirements
Experience with SRE practices and operating environments in AWS cloud;
Knowledge of Observability and APM tools such as Grafana, AppDynamics, Dynatrace, Prometheus, DataDog, ELK or Zabbix;
Experience in log analysis and investigation of connectivity and integration scenarios between applications and partners;
Knowledge of service monitoring and defining operational metrics;
Focus on cost optimization and service performance;
Orientation toward ensuring application reliability and security;
Experience in modernization and cloud transformation projects;
Experience designing Observability architectures;
Experience in high-availability environments and critical systems.
Benefits
Multi-benefits card – choose how and where to use it.
Scholarships for undergraduate, graduate, MBA and language courses.
Certification incentive programs.
Flexible working hours.
Competitive salaries.
Annual performance review with a structured career plan.