Your role and responsibilities
Ensure service reliability, availability, and performance through SLO/SLI-driven engineering.
Build and maintain monitoring, alerting, and observability systems.
Manage and improve CI/CD pipelines and deployment automation.
Automate infrastructure using IaC tools and reduce operational toil.
Lead incident response, root-cause analysis, and post-mortems.
Optimize system scalability, performance, and capacity planning.
Collaborate with development teams to embed reliability into design and operations.
Required education
Bachelor's Degree
Required technical and professional expertise
At least 2 years of working experience applying Site Reliability Engineering (SRE) principles
Hands-on experience with CI/CD concepts and tools such as Jenkins, Docker, and Kubernetes
Strong knowledge of monitoring and observability frameworks including Prometheus and Grafana
Experience with Infrastructure as Code (IaC) using Terraform
Proficiency in basic Python programming and Shell scripting
Understanding of networking fundamentals and observability concepts