Site Reliability Engineer (3+)
Okta | 25 days ago | Bengaluru

What you’ll be doing 

  • Designing, building, running, and monitoring Okta's production infrastructure
  • Evangelizing security best practices and leading initiatives and projects to strengthen our security posture for critical infrastructure
  • Responding to production incidents and determining how we can prevent them in the future
  • Triaging and troubleshooting complex production issues to ensure reliability and performance
  • Identifying and automating manual processes
  • Continuously evolving our monitoring tools and platform
  • Promoting and applying best practices for building scalable and reliable services across engineering
  • Developing and maintaining technical documentation, runbooks, and procedures
  • Supporting a 24x7 online environment as part of an on-call rotation
  • Serving as a technical SME for a team that designs and builds Okta's production infrastructure, focusing on security at scale in the cloud

What you’ll bring to the role

  • Are always willing to go the extra mile: see a problem, fix the problem.
  • Are passionate about encouraging the development of engineering peers and leading by example.
  • Have experience automating, securing, and running large-scale production Java/Tomcat and containerized services in AWS (EC2, ECS/EKS, KMS, Kinesis, RDS) or other cloud providers.
  • Have experience deploying and managing Kubernetes (K8s) clusters (EKS preferred), with monitoring and alerting in the Kubernetes ecosystem, and with deploying microservices.
  • Have deep knowledge of CI/CD principles, Linux fundamentals, OS hardening, networking concepts, and IP protocols.
  • Have a deep understanding of, and familiarity with, configuration management tools like Chef, Terraform, and Ansible.
  • Have expert-level abilities in operational tooling languages such as Ruby, Python, Go, and shell, and in the use of source control.
  • Are familiar with industry-standard security tools like Nessus and OSQuery.
  • Are familiar with data stores such as RDS, S3, Redis, Cassandra, and Elasticsearch.

Experience in the following

  • 3+ years of experience architecting and running complex AWS or other cloud networking infrastructure resources
  • 3+ years of experience with Infrastructure as Code tools such as Terraform, Chef, or Ansible
  • 2+ years of experience with Kubernetes/K8s
  • Strong Linux understanding and experience
  • Strong security background and knowledge
  • BS in computer science (or equivalent experience)