Site Reliability Engineer working on Linux systems for observability platforms and logging. Design and maintain applications, support network visibility, and collaborate with teams.
Responsibilities
Design, build, and maintain logging platforms.
Design, build, and maintain metrics and tracing platforms.
Design, build, and maintain front-end visualization tools.
Design, build, and maintain supporting applications used by the logging/metrics platforms.
Support and maintain the network capture visibility platforms.
Support and maintain network packet brokers.
Work with other teams in support of the above systems.
Research, create documentation, and provide examples for end users on how to use the above systems.
Requirements
Experience with supporting and troubleshooting server hardware.
Experience supporting Linux.
Knowledge of server hardware.
Knowledge of TCP/IP and the OSI Model for support of the network capture and visibility platforms.
Experience with Kubernetes (k8s) and/or other containerization platforms.
Familiarity with the GitOps method of system and platform management.
Experience using automation and automation tools.
Specific knowledge of network monitoring and packet capture on Linux systems.
Senior Operations Engineer driving efficiency and reliability in NVIDIA's global business operations. Collaborating with IT subsystems and automating operational workflows for organizational impact.
Lead or Senior DevOps Developer joining Boeing Defense, Space and Security for advanced technology missions. Involves CI/CD, cloud systems design, and collaboration with government customers.
Site Reliability Engineer ensuring high availability and performance for digital platforms in retail. Collaborating with engineering teams for automation and observability practices.
Associate Site Reliability Engineer supporting the reliability and performance of global IT infrastructure at Exegy. Engage with senior engineers and learn foundational systems engineering skills.
Site Reliability Engineer driving innovation and growth for Banking Solutions, Payments, and Capital Markets business. Responsible for application reliability and incident response in a hybrid work environment.
DevSecOps role at Tiime ensuring implementation of security practices in products. Collaborate with teams for cloud security and incident management in a hybrid workspace.
Senior Site Reliability Engineer responsible for designing reliable infrastructure supporting Fixify's SaaS platform. Collaborating with product engineering teams and maintaining operational standards for infrastructure performance.
DevOps Engineer working with critical infrastructure systems for Swedish internet services. Focused on building and managing robust systems and contributing to automation and operational improvements.
DevSecOps Consultant integrating security into IT development and operational processes. Advising clients on seamless integration of security requirements into DevOps workflows.
DevOps Engineer designing, developing and supporting programs at Swift, the leading provider of secure financial messaging services. Involves system analysis, program development and team collaboration.