The Streaming Platform team owns one of the largest data-in-motion platforms in the world, providing real-time data analytics. Our clusters process 9.5 billion events per minute, supporting New Relic's customer experience by making it easy for development teams to prioritize reliability. We focus on services built to stream customer data, primarily using Kafka, and we plan to expand to other solutions in the future.
You will collaborate with this globally distributed team to develop expertise and best practices for New Relic development teams to operate highly reliable streaming services.
What you'll do:
- Own and improve your team processes.
- Use available innovation time to bring your creative ideas to life.
- Design systems using infrastructure-as-code practices.
- Participate in an on-call rotation and bake stability into everything, continually seeking automation opportunities for built-in reliability.
- Own, operate, and build our fully automated Kafka platform infrastructure, running on Kubernetes with over 1,800 brokers distributed across hundreds of clusters.

This role requires:
- Background working with AWS products, including Managed Streaming for Apache Kafka (MSK) and Elastic Kubernetes Service (EKS). Azure and GCP are also relevant.
- Experience operating software in production Kubernetes environments at scale.
- Experience with configuration management tools such as Terraform or Ansible.
- Experience writing software in Java, Go, or similar languages.
- 3+ years of proven experience managing services that use Kafka or other streaming platforms. Large scale is a plus.

Bonus points if you have:
- Excitement, not intimidation, about high-throughput data systems.
- Experience using New Relic or similar solutions for service and infrastructure observability.
- Experience with other large data infrastructures such as databases and data lakes.
- Experience with SRE and/or Operations principles.