You take end-to-end ownership of infrastructure: you design, scale, and operate it. This goes beyond execution. Here's what that looks like day to day:
Own the design, architecture, and reliability of Locus's cloud infrastructure across AWS, Azure, GCP, and Aliyun, supporting multi-region, global deployments.
Lead the evolution of our CI/CD ecosystem by optimizing and refactoring our Jenkins-as-Code setup for scalability, performance, and developer efficiency.
Drive the Infrastructure as Code (IaC) journey end to end, migrating existing cloud resources, alarms, and configurations fully into code with strong versioning, review, and rollback practices.
Partner with engineering teams to identify and resolve performance, scalability, and reliability bottlenecks, including deep dives into memory, CPU, networking, and storage constraints.
Define and implement monitoring, alerting, and incident response best practices to improve MTTR, system observability, and operational readiness.
Lead initiatives around cost optimization, security hardening, and capacity planning to keep infrastructure efficient and compliant as the platform scales.
Act as a technical mentor for junior DevOps engineers and raise the overall DevOps maturity across teams.