AI Platform & Systems Engineer providing operational support for GPU-based compute infrastructure at BNY. Deploying and troubleshooting containerized AI workloads and automating processes with modern tooling.
Responsibilities
Provide hands-on operational support and incident management for GPU-based compute infrastructure across hybrid and on-prem environments.
Deploy, monitor, and troubleshoot containerized AI workloads using Kubernetes, Docker, and GPU orchestration tools such as Run:AI, Volcano, or Kubeflow.
Automate infrastructure processes and workload provisioning using Python, Bash, and configuration management tools.
Maintain and scale training/inference workloads using GitOps tools such as Helm and ArgoCD, and integrate them with CI/CD pipelines (GitLab, Jenkins).
Requirements
Bachelor's degree in computer science or a related discipline, or equivalent work experience, required; advanced degree preferred.
8-10 years of related experience required; experience in the securities or financial services industry is a plus.
Experience with Linux administration (RHEL/Ubuntu), shell scripting, and system-level debugging.
Proven experience running distributed systems in Kubernetes and containerized environments using Docker.
Familiarity with GPU resource management, including NVIDIA GPU Operator and device plugin lifecycle.
Experience with CI/CD workflows and infrastructure automation tools such as GitLab CI, Jenkins, Terraform, Helm, or Ansible.
Knowledge of networking fundamentals and persistent storage systems.
Exposure to cloud platforms (AWS, GCP, Azure) and hybrid GPU environments.
Ability to read and support Python code focused on ML/AI pipeline integration.
Strong analytical and troubleshooting skills with a collaborative mindset.
Effective communication skills and proactive ownership of platform reliability and performance.
Benefits
BNY offers highly competitive compensation, benefits, and wellbeing programs rooted in a strong culture of excellence and our pay-for-performance philosophy.
We provide access to flexible global resources and tools for your life’s journey.
As a valued member of our team, you can focus on your health, foster your personal resilience, and reach your financial goals. Generous paid leave, including paid volunteer time, can support you and your family through moments that matter.