Senior Site Reliability Engineer

Job details

Roche fosters diversity, equity and inclusion, representing the communities we serve.
When dealing with healthcare on a global scale, diversity is an essential ingredient to success.
We believe that inclusion is key to understanding people's varied healthcare needs.
Together, we embrace individuality and share a passion for exceptional care.
Join Roche, where every voice matters.
The Position

The role requires the candidate to be available for on-call duty, responding promptly to urgent issues and emergencies outside regular working hours so that critical situations are addressed quickly and effectively.

Your Mission

Design and maintain cutting-edge tools, scripts, and frameworks that automate repetitive tasks, streamline software deployment, and manage expansive systems with unparalleled efficiency.
Partner closely with forward-thinking development teams to architect and implement high-performance solutions that elevate system efficiency, optimize resource utilization, and enhance deployment processes for superior uptime and user satisfaction.

Your Core Responsibilities

Reliability Mastery: Proactively monitor and maintain system reliability using advanced tools like DataDog, VictorOps, ELK, Grafana, and Prometheus. Become a key player in ensuring system stability and performance.
Uptime Guardian: Ensure optimal uptime and performance by swiftly identifying issues and responding to alerts with precision.
Technical Troubleshooter: Use a basic understanding of architecture and design to dive deep into complex technical issues and to troubleshoot, investigate, and resolve them. Collaborate seamlessly with engineering teams to enable timely and effective resolutions.
Service Excellence: Consistently meet or exceed the defined SLAs, SLIs, and SLOs.
Automation Innovator: Develop and deploy automation scripts (using Python or other scripting languages) to streamline operations, enhance system efficiencies, and reduce manual tasks.
Cloud Steward: Manage and maintain robust infrastructure across AWS and Azure environments, implementing best practices to ensure peak performance and reliability of cloud-based applications.
Cross-functional Collaborator: Work closely with engineering, DevOps, security, and operations teams to drive continuous improvement and foster a culture of reliability and inclusion.
Incident Responder: Handle requests and incidents through JIRA and ServiceNow, documenting troubleshooting procedures, solutions, and lessons learned to fuel ongoing improvements.
Flexible Scheduling: Work on-call outside normal working hours and on weekends, as scheduled, to ensure continuous support.
Team Builder: Actively contribute to the growth and development of the SRE team's capabilities, nurturing a stronger, more inclusive, and resilient team.


Nominal salary: Negotiable

Source: Jobtome_Ppc

Requirements

