Site Reliability Engineer at Tecsys Inc.

About the role

Site Reliability Engineer maintaining cloud infrastructure reliability for Tecsys solutions. Collaborating across teams to support services and implement automation, observability, and frameworks.

Responsibilities

Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
Innovate relentlessly: Identify pain points, propose creative solutions, and drive initiatives that simplify, scale, and strengthen the platform.
Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
Own observability: Enhance and expand monitoring and alerting using Datadog; define SLOs/SLIs and create actionable dashboards that drive reliability outcomes.
Drive automation: Develop and improve internal tooling, IaC frameworks, and pipelines (Terraform, GitLab CI/CD) to reduce manual intervention and enable self-healing systems.
Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
Act as an agent orchestrator using Amazon Kiro: run multiple activities in parallel by leveraging AI agents to accelerate execution, while personally validating results and completing selected tasks manually when needed.
Be on-call.
Practice sustainable incident response and blameless postmortems. Lead post-incident reviews (RCAs) and identify long-term fixes that improve stability, reliability, and developer experience.
Implement monitoring, logging, alerting, and SLA reporting.
Create and maintain technical documentation.
Implement, maintain and mature SRE best practices.
Lead incidents: Act as Incident Commander for incidents; coordinate cross-team response, manage communications, and ensure rapid service restoration.
Provide support for our planning and deployment teams to enable stability, predictability, and scale in our continued growth.
Collaborate with members of the Platform Engineering team to implement and support far-reaching strategic efforts, provide constructive feedback, and foster a collaborative environment.
Work cross-functionally with internal teams and vendors to manage our growth around the globe, with a strong focus on maintaining the high level of performance, availability, and reliability for our users.

Requirements

5+ years in Site Reliability, Cloud, or DevOps Engineering, ideally in SaaS or large-scale production environments.
Experience designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructure.
Proven experience managing cloud infrastructure in AWS (multi-account, VPC, EC2, EKS) and Kubernetes at scale.
Strong hands-on experience with IaC and automation (Terraform, Ansible, or similar).
Familiarity with CI/CD pipelines and release automation (GitLab preferred, Jenkins acceptable).
Deep understanding of monitoring and observability using Datadog (or equivalent), including metric design, log pipelines, alerting, and dashboards.
Experience with incident management, on-call participation, escalation, and structured postmortems.
Scripting skills in Python, Bash, Java or equivalent for automation and diagnostics.
Curiosity, ownership, and a bias for action; you see a problem, you solve it, and you share the lessons learned.
Experience with FedRAMP (the Federal Risk and Authorization Management Program) compliance is a strong asset.
Basic knowledge of Java- or .NET-based development required.
Strong English communication skills, both written and spoken, are essential for effective correspondence with customers, business partners and colleagues beyond the province of Quebec.
Participation in the escalation on-call rotation.
Occasional travel (quarterly offsites, conferences – less than 10%)
