Observability Analyst
RTM, the integration hub of the financial market, supports its clients and partners in architecting multi-cloud solutions across the different technologies that underpin the Brazilian financial system.
We are looking for someone to join the Cloud Solutions team and work alongside the business area in client engagements, helping clients find the best components and architectures for their services.
We expect you to be curious and self-taught: someone who seeks out information and knowledge and shares it with the team.
Responsibilities:
- Implement and manage observability solutions in hybrid cloud environments, spanning public cloud (AWS, Azure, GCP) and private cloud built on technologies such as OpenStack, VMware, and OpenShift.
- Monitor and optimize the performance of applications, network infrastructure, and cloud resources using monitoring tools such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), and Nagios.
- Define and configure dashboards and alerts to ensure the health and availability of public and private cloud platforms.
- Track and report on the performance of virtualized infrastructure environments (VMware, OpenStack), as well as container and orchestration platforms (OpenShift).
- Work collaboratively with infrastructure, DevOps, and security teams to resolve performance, latency, and system failure issues.
- Automate metrics and log collection processes to ensure a unified view of the IT environment.
- Conduct incident analysis and root cause analysis (RCA), and support the definition of action plans for problem mitigation and continuous improvement.
- Document processes, monitoring configurations, and changes made, while ensuring the consistency and scalability of observability tools.
Essential requirements:
- Solid experience monitoring hybrid environments, including public cloud (AWS, Azure, GCP) and private cloud with OpenStack, VMware, and OpenShift.
- Advanced knowledge of observability and monitoring tools such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Nagios, Zabbix, or similar.
- Experience monitoring containers and container orchestration, primarily with OpenShift and Kubernetes.
- Ability to analyze logs and metrics to identify issues and optimize system and application performance.
- Experience with automation and infrastructure-as-code (IaC) tools for integrating observability solutions.
- Good understanding of cloud-native architecture and microservices.
- Ability to solve complex problems and to work as part of a team to reach quick, efficient solutions.
- Technical English for reading and interpreting documentation and reports.
Nice to have:
- Certifications in AWS, Azure, OpenStack, VMware, Kubernetes, or OpenShift.
- Experience with monitoring automation using tools such as Ansible, Terraform, or ArgoCD.
- Knowledge of observability for serverless environments and technologies such as AWS Lambda or Azure Functions.
- Knowledge of advanced alerting systems such as Alertmanager, PagerDuty, or Opsgenie.
RTM offers:
- CLT employment contract (Brazilian labor law).
- Hybrid work model (3 days in the office and 2 days remote).
- Transportation voucher.
- Direct bonuses for projects you participate in.
- Reimbursement for courses and certifications in the field.
- Bradesco Health and Dental Plan.
- Corporate Life Insurance.
- Total Pass (wellness program).
- Fully company-sponsored access to Alura.
- Annual bonus.
- No dress code.
- Private Pension Plan.
- Partner discounts app.
- Food/Meal voucher: R$ 1,856.00 per month.
- Birthday off.
- Home Office allowance.
- Orienteme (app for consultations with psychologists, nutritionists, and physical educators).
Work location: Downtown São Paulo.