Lead Site Reliability Engineer (NM+)
Cisco | 13 hours ago | Bangalore

Responsibilities:

  • Design, build, and optimize cloud and data infrastructure to ensure the high availability, reliability, and scalability of big-data and ML/AI systems to meet customer needs, while implementing SRE principles such as monitoring, alerting, error budgets, and fault analysis.

  • Collaborate closely with cross-functional teams, including customers, development, product management, and security teams, to create secure, scalable solutions that support ML/AI workloads and enhance operational efficiency through automation.

  • Troubleshoot complex technical problems in production environments, perform root cause analyses, and contribute to continuous improvement efforts through postmortem reviews and proactive performance optimization.

  • Lead the architectural vision and shape the team’s technical strategy and roadmap, balancing immediate needs with long-term goals, driving innovation, and influencing the technical direction.

  • Serve as a mentor and technical leader, guiding teams and fostering a culture of engineering and operational excellence by sharing your deep knowledge and experience.

  • Engage with customers and stakeholders to understand use cases and feedback, translating them into actionable insights and effectively influencing stakeholders at all levels.

  • Utilize your strong programming skills to integrate software and systems engineering, building core data platform capabilities and automation to meet enterprise customer needs and roadmap objectives.

  • Develop strategic roadmaps, processes, plans, and infrastructure to efficiently deploy new software components at enterprise scale while enforcing engineering best practices.


Minimum Qualifications

  • Ability to design and implement scalable, well-tested solutions with a focus on operational efficiency.

  • Strong hands-on cloud experience, preferably AWS.

  • Infrastructure as Code expertise, especially Terraform and Kubernetes/EKS.

  • Experience building and managing cloud, big data, and ML/AI infrastructure, including hands-on expertise with Hadoop ecosystem components and related technologies such as EMR, Airflow, Spark, PySpark, AWS SageMaker, AWS Bedrock, Gobblin, Kafka, Iceberg, ORC, MapReduce, YARN, HDFS, Hive, and Hudi.

  • Ability to write high-quality code in Python, Go, or equivalent programming languages.


Preferred Qualifications

  • Solid understanding of Unix/Linux systems, the kernel, system libraries, file systems, and client-server protocols.

  • Experience architecting software and infrastructure at scale, with a sense of ownership and accountability.

  • Experience with observability tools including Prometheus (Alertmanager), Grafana, Thanos, CloudWatch, OpenTelemetry, and the ELK stack.

  • Certifications: CKA (Certified Kubernetes Administrator), CKAD (Certified Kubernetes Application Developer), AWS Certified DevOps Engineer, or equivalent certifications in cloud and security domains.

