Role: Prometheus & Grafana Expert with Observability Experience (m/f/d)
For one of our clients in Spain, we are looking for a Prometheus & Grafana Expert with Observability Experience (m/f/d).
The primary objective of this role is to design, implement, and maintain a robust observability infrastructure using Prometheus and Grafana, with complementary capabilities provided by Elastic. The expert will work closely with the DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of the systems.
Project Conditions:
Start: ASAP
Duration: 6 months (+)
Workload: 100%
Place: remote
Your responsibilities:
Assess the current monitoring and observability setup and identify gaps.
Design and implement Prometheus-based monitoring solutions in an on-premises setup designed for multi-tenancy and several support teams.
Configure and maintain Grafana dashboards for real-time visualization, designed for multi-tenancy and several support teams.
Integrate Prometheus with other systems and tools (e.g., Loki, Mimir, Tempo, Thanos).
Design and implement logging solutions using Elastic (ELK Stack) for on-premises setups.
Develop and document monitoring and logging strategies and best practices.
Set up alerts and notification mechanisms to preemptively address system issues.
Train internal staff on the use and maintenance of Prometheus, Grafana, and Elastic.
We are looking for:
Proven experience in deploying and managing Prometheus and Grafana in an on-premises setup designed for multi-tenancy and several support teams.
Strong understanding of observability concepts and best practices.
Experience with related technologies (e.g., Kubernetes, Docker, Mimir, Loki, Tempo, Thanos, on-premises infrastructure).
Proficiency in scripting and automation (e.g., Bash, Python).
Familiarity with infrastructure-as-code tools (e.g., Ansible, Terraform).
Experience with log management and tracing solutions (e.g., Loki, ELK Stack, Jaeger).
Excellent problem-solving and analytical skills.
Ability to work independently and as part of a team.
Strong English communication skills for effective collaboration and training.