Senior Engineer for the Serverless Cell Platform at CrowdStrike, a leader in cybersecurity. Monitoring large-scale distributed systems to ensure high performance and reliability.
Responsibilities
Monitor and maintain the health, performance, and reliability of our hyperscale cell infrastructure processing trillions of events daily
Lead incident response and problem management through established on-call rotations and structured feedback loops
Implement comprehensive monitoring with Service Level Indicators to enable proactive alerting and automated self-healing
Conduct capacity planning and forecasting based on ingest rates and query patterns to optimize resource utilization
Ensure data integrity and compliance across >100 PB of stored data through automated consistency checks and recovery testing
Manage access controls, certificate rotation, and vulnerability management across cell infrastructure according to defined SLAs
Provision and scale cell infrastructure (vertical/horizontal) based on demand and performance requirements
Develop microservices and automation tools for cell components, including ingest writers and management systems
Orchestrate version upgrades, patch management, and configuration changes with minimal customer impact
Perform load testing and performance benchmarking to validate scaling thresholds and optimize costs
Coordinate with fleet operations, product teams, and infrastructure teams on global changes and capacity planning
Create technical documentation and operational playbooks, and partner with teams to address customer-impacting issues
Work in a team of friendly, trustworthy, and knowledgeable colleagues
Build and maintain CI/CD pipelines for testing and releasing configuration and software
Troubleshoot complex issues across multiple large-scale distributed systems, including LogScale, Kafka, object storage systems, and related infrastructure
Work closely with Engineering and Customer Support to troubleshoot time-sensitive production issues, regardless of when they happen
Apply SRE best practices, including SLOs, error budgets, chaos engineering, and blameless post-mortems
Use AI coding assistants (e.g., Anthropic Claude) effectively to accelerate development and problem-solving
Requirements
Proven experience designing and implementing distributed systems with high scalability, availability, and performance optimization at enterprise scale
Experience contributing to technical leadership across products or services
A can-do attitude; you thrive collaborating in a team and are not afraid of taking on responsibilities
Several years' experience with large-scale, business-critical Linux-based environments
Solid grounding in the technology of at least one cloud environment (AWS, Azure, GCP)
Experience working with CI/CD, Jenkins, Git, Artifactory, Bitbucket
Go (golang) programming experience in production environments
Some familiarity with Python programming
Experience with configuration management systems such as Chef or Ansible
Availability for on-call on a rotational basis
Bonus Points: Experience with Kafka
Bachelor's degree in an applicable field, such as Computer Science or Engineering
Benefits
Market leader in compensation and equity awards
Comprehensive physical and mental wellness programs
Competitive vacation and holidays to recharge
Paid parental and adoption leaves
Professional development opportunities for all employees regardless of level or role
Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections