Intro
The Site Reliability Engineer Lead (SRE Lead) at Screening Eagle will lead a team of SREs to ensure the stability, resilience, and scalability of our services through automation, testing, and engineering.
This role involves leveraging expertise from product systems/operations, cloud infrastructure (AWS), build and release engineering, software development, and stress/load testing to guarantee our services are available, cost-efficient, and fit for purpose from the early stages of development.
The role requires 5+ years of experience developing AWS cloud infrastructure and 7+ years of experience leading teams.

What will you do

Cloud Infrastructure Management and Networking
- Design, develop, and implement cloud infrastructure using Terraform.
- Optimize resources for cost-efficiency and performance.
- Ensure infrastructure security and implement service control policies (e.g., Control Tower).
- Configure AWS VPC flow logs, load balancer logging, Direct Connect, AWS VPN, TGX, etc.

Monitoring, Support, and Prototyping
- Implement robust monitoring and alerting systems.
- Set up and monitor CI/CD pipelines both on-premises and in the cloud.
- Enhance monitoring, logging, and alerting practices.
- Use tagging and cost categorization for cost analysis.
- Create prototypes and lead development teams in implementing solutions.

Team Leadership, Collaboration, and Documentation
- Lead the SRE team, ensuring technical quality and best practices.
- Guide the team through the software development lifecycle.
- Collaborate with developers and operations to integrate infrastructure changes.
- Document DevOps changes, technical partnerships, design, integration, testing, and deployment.

Innovation, Quality Assurance, and Process Improvement
- Evaluate risks, customize applications, and lead quality practices.
- Focus on agile methodologies, test automation, and continuous integration.
- Simplify and automate complex processes to ensure quality and operational excellence.
- Improve the DevOps toolchain and streamline software delivery processes.
- Stop projects/products if solutions are not technically acceptable.

What do we expect
- Extensive experience in implementing and evolving DevOps practices across multi-disciplinary teams and business frameworks.
- Strong background in leading technology change programs and managing projects.
- In-depth knowledge and experience with AWS services (EC2, S3, VPC, IAM, etc.).
- Expert-level proficiency in Terraform, including writing reusable modules and leveraging best practices.
- Highly skilled with Kubernetes, Terraform, serverless, and AWS in general.
- Proficient in non-functional testing, including performance, security, and cost optimization.
- Experience working with advanced architectures such as ARM and AWS Graviton, optimizing for performance, cost-efficiency, and scalability.
- Knowledge of Kubernetes operator programming, including operators related to GPU-based architectures.
- Competent in using build tools and practices for different architectures.
- Expertise in Git and the GitOps philosophy.
- Expert in logging and monitoring tools like ELK, Prometheus, and Grafana.
- Demonstrable MLOps experience.
- Ability to quickly gain domain knowledge.
- Operational experience in maintaining applications.

About us
We are on a mission to protect the built world with software, sensors and data.
We hire talented problem-solvers with bold ambition who share our passion for inspection technology to sustain mission-critical assets and infrastructure for future generations.
Our culture is creative, innovative and inclusive.
We are a fast-paced, product-driven growth company headquartered in Switzerland, with technology hubs in Singapore and Malaga and a global mindset, looking to lead a digital revolution in inspection.
Want to join the #EagleTeam?