Principal Site Reliability Engineer (NM+)
Lilly | 2 days ago | Hyderabad

What You’ll Be Doing 

  • Lead the SRE team responsible for the reliability and performance of applications deployed on a cloud-native internal platform. 

  • Design, implement, and maintain automation frameworks, self-service tooling, and auto-healing systems to eliminate manual toil. 

  • Build and enhance end-to-end observability, monitoring, logging, and alerting systems for proactive issue detection and resolution. 

  • Ensure Uptime: Take ultimate ownership of our production environment's stability. Lead end-to-end incident management, from escalation to Root Cause Analysis (RCA). Manage patching, upgrades, and disaster recovery processes. 

  • Champion Infrastructure as Code (IaC) and CI/CD best practices to ensure consistent, repeatable, and secure deployments. 

  • Collaborate with development and product teams to embed reliability and scalability into application design and architecture. 

  • Continuously evaluate and introduce emerging tools and technologies to keep the SRE stack modern and efficient. 

  • Mentor and guide SRE engineers, fostering a culture of ownership, innovation, and continuous improvement. 

  • Implement AIOps frameworks to improve operational tasks and enhance system self-healing capabilities. 

  • Participate in and optimise the on-call rotation, striving to minimise human intervention through automation. 

  • Drive capacity planning, disaster recovery, and business continuity initiatives. 

  • Support onboarding, documentation, and knowledge sharing for platform services and operational best practices. 

 

