Hiring near our Barcelona, Spain Center of Excellence. Gartner offers a hybrid, flexible environment, with remote work that allows associates great flexibility to work from home, and opportunities to connect with colleagues on-site for moments that matter. Candidates who apply should be located within reasonable proximity of one of Gartner's Center of Excellence office locations.
About Gartner IT: Join a world-class team of skilled engineers who build creative digital solutions to support our colleagues and clients. We make a broad organizational impact by delivering cutting-edge technology solutions that power Gartner. Gartner IT values its culture of nonstop innovation, an outcome-driven approach to success, and the notion that great ideas can come from anyone on the team.
About this role: We are seeking a Principal Network Engineer (Network Performance) who will play a crucial role in supporting the production and operations of our conference IT platforms and our enterprise network (both cloud and on-prem). During live conferences, the candidate will work closely with network operations team members to keep the network infrastructure running smoothly by optimizing and maintaining its performance and reliability. During non-conference periods, the candidate will focus on ensuring the operational readiness of the network infrastructure. This includes observability, performance, and resiliency work utilizing Chaos Engineering techniques.
What you'll do:
- As part of the SRE scrum team, troubleshoot and resolve complex network performance and reliability issues, working closely with network operations and engineering teams.
- Function as an SME in utilizing NPM tools to drive forensics on network performance and reliability issues that impact optimal end user experience and/or application health.
- Work closely with the Conference Network team and the Enterprise Network team to maintain and enhance a comprehensive knowledge of the systems and infrastructure.
- Work closely with the Observability team to maintain and improve dashboard and alerting posture.
- Monitor operational dashboards during conferences and respond to alerts.
- Collaborate to develop and design chaos test cases that effectively simulate real-world scenarios and identify potential vulnerabilities and areas for improvement.
- Execute chaos tests and analyze the results using NPM, APM, and other monitoring tools to identify performance and stability issues.
- Utilize breadth of knowledge and experience to accurately connect the dots between application and network performance issues.
- Utilize strong network forensics knowledge to cross-train other IT engineers.
- Use data-driven analysis to drive continuous improvement in network observability, performance, reliability, and resilience.
- Perform analytics on previous incidents to understand root causes, and use automation to detect problems faster and to reduce the probability and/or impact of problem recurrence where possible.