Responsibilities:
Integrate, deploy, and maintain key Data Engineering and ML workflows using Databricks and AWS to ensure seamless data movement from raw source to final consumption. Manage the end-to-end lifecycle across DevOps, DataOps, and ModelOps, and troubleshoot and resolve integration and data-related issues.
- Set up Databricks Repos to integrate with Git and keep notebooks and source code in sync with Databricks workspaces, using Databricks' built-in Git integration and version-control features.
- Create and manage CI/CD pipelines for smooth deployment of code and workflows from development to staging to production environments (a minimal promotion step is sketched after this list).
- Create and manage clusters, and set up access policies as needed by data engineers and data scientists.
- Use the Databricks APIs to automate reporting on Feature Store and MLflow usage and behaviour (sketched below).
- Create and manage access controls for raw and feature tables stored as Delta tables (see the grants sketch below).
- Enable integration between Databricks and AWS S3 buckets by setting up the right IAM policies. Set up optimised S3 lifecycle rules for data retention and storage (sketched below).
- Enable monitoring of Data Engineering job workflows and automate job-failure notifications and fallback solutions (see the polling sketch below). Optimise cluster and data-storage usage and suggest best practices.
- Enable ingestion of streaming data into Delta Live Tables to support real-time anomaly detection models (sketched below).
- Use Databricks MLflow to track model development, registration, and deployment, and save model artifacts such as code snapshots, model parameters, metrics, and other metadata (see the tracking sketch below).
- Use Unity Catalog to manage data and model versioning, governance, and deployment. Build model-drift pipelines, monitor model performance over time, and enable workflows to retrain drifted models (a drift-check sketch follows this list).
- Build processes to automate model promotion from development to staging to production and enable A/B testing of different model versions (see the promotion sketch below).
- Apply experience with containerization technologies and orchestration frameworks, and an understanding of microservices architecture and deployment patterns.
- Create Kubernetes clusters and deploy applications onto them via package-manager tools such as Helm (see the Helm sketch below).
- Implement security best practices and ensure compliance with industry standards.
- Collaborate with development teams to optimize application performance and scalability.
- Stay updated with emerging technologies and industry trends, and evaluate their potential impact on our infrastructure and development processes.
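The sketches below illustrate the kind of automation these responsibilities involve; they are minimal examples rather than production code, and all hosts, tokens, paths, and IDs are placeholders. First, one possible CI/CD promotion step: pointing a staging workspace's Databricks Repo at the latest main branch via the Repos API.

```python
# Minimal CI/CD promotion sketch using the Databricks Repos API.
# Host, token, and repo ID are placeholders; keep tokens in a secret store.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"
STAGING_REPO_ID = 123  # hypothetical ID of the staging workspace's repo

resp = requests.patch(
    f"{DATABRICKS_HOST}/api/2.0/repos/{STAGING_REPO_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"branch": "main"},  # sync the staging checkout to main
    timeout=30,
)
resp.raise_for_status()
print("Staging repo now at commit:", resp.json().get("head_commit_id"))
```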
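Reporting on MLflow usage can be automated against the Databricks REST API; a hedged sketch that lists registered models and their latest versions:

```python
# Sketch: pull registered-model metadata from the MLflow REST API for a
# usage report. Host and token are placeholders.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

resp = requests.get(
    f"{DATABRICKS_HOST}/api/2.0/mlflow/registered-models/search",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"max_results": 100},
    timeout=30,
)
resp.raise_for_status()
for model in resp.json().get("registered_models", []):
    versions = [v["version"] for v in model.get("latest_versions", [])]
    print(model["name"], "latest versions:", versions)
```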
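Table access controls are typically managed in SQL; a short sketch from a Databricks notebook (where `spark` is predefined), assuming Unity Catalog is enabled and the table and group names are hypothetical:

```python
# Grant read access on a feature table, revoke write access on a raw table,
# and inspect the resulting grants.
spark.sql("GRANT SELECT ON TABLE feature_db.customer_features TO `analysts`")
spark.sql("REVOKE MODIFY ON TABLE raw_db.events FROM `analysts`")
spark.sql("SHOW GRANTS ON TABLE feature_db.customer_features").show()
```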
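An S3 lifecycle rule for retention can be set with boto3; the bucket name, prefix, and periods below are illustrative:

```python
# Sketch: transition raw data to infrequent access after 30 days and
# expire it after a year.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-datalake-raw",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [{
            "ID": "raw-data-retention",
            "Status": "Enabled",
            "Filter": {"Prefix": "raw/"},
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            "Expiration": {"Days": 365},
        }]
    },
)
```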
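Job-failure notification can lean on Databricks' built-in alerts, or on a small poller against the Jobs API, as in this sketch (the alerting hook is left hypothetical):

```python
# Sketch: list recent completed job runs and flag failures.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

resp = requests.get(
    f"{DATABRICKS_HOST}/api/2.1/jobs/runs/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"completed_only": "true", "limit": 25},
    timeout=30,
)
resp.raise_for_status()
for run in resp.json().get("runs", []):
    if run.get("state", {}).get("result_state") == "FAILED":
        # Replace print with a call into your paging/alerting system.
        print(f"ALERT: run {run['run_id']} of job {run.get('job_id')} failed")
```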
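Streaming ingestion into Delta Live Tables might look like the following sketch, assuming Auto Loader reads JSON events from a hypothetical S3 path:

```python
# Sketch of a DLT pipeline: a raw streaming table fed by Auto Loader and a
# cleaned table for downstream anomaly-detection models.
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw events streamed in from S3")
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://my-datalake-raw/events/")  # hypothetical path
    )

@dlt.table(comment="Cleaned events feeding real-time anomaly models")
def clean_events():
    return dlt.read_stream("raw_events").where(col("value").isNotNull())
```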
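MLflow tracking in its simplest form; the model, parameters, and metric here are illustrative:

```python
# Sketch: log parameters, a metric, and the model itself to MLflow, and
# register the model under a hypothetical name.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=42)
with mlflow.start_run(run_name="anomaly-baseline"):
    model = LogisticRegression(max_iter=500).fit(X, y)
    mlflow.log_param("max_iter", 500)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="anomaly_detector")
```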
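A drift pipeline can be as simple as comparing a recent evaluation metric against a baseline and kicking off a retraining job; the thresholds, job ID, and metric values are assumptions:

```python
# Sketch: trigger a retraining job via the Jobs API when AUC degrades.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"
RETRAIN_JOB_ID = 456                        # hypothetical retraining job
BASELINE_AUC, DRIFT_TOLERANCE = 0.92, 0.05  # illustrative thresholds

def check_and_retrain(current_auc: float) -> None:
    if BASELINE_AUC - current_auc > DRIFT_TOLERANCE:
        resp = requests.post(
            f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={"job_id": RETRAIN_JOB_ID},
            timeout=30,
        )
        resp.raise_for_status()
        print("Drift detected; retraining run:", resp.json()["run_id"])

check_and_retrain(current_auc=0.84)  # example value that triggers retraining
```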
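Model promotion between environments can use the classic stage-based registry API (Unity Catalog setups may use model aliases instead); the model name and version are hypothetical:

```python
# Sketch: promote version 3 of a registered model to Production and
# archive whatever was in Production before.
from mlflow.tracking import MlflowClient

client = MlflowClient()
client.transition_model_version_stage(
    name="anomaly_detector",
    version="3",
    stage="Production",
    archive_existing_versions=True,
)
```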
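Finally, a Helm-based deployment step driven from Python; the release, chart, and namespace names are hypothetical:

```python
# Sketch: install or upgrade a model-serving release on Kubernetes via Helm.
import subprocess

subprocess.run(
    ["helm", "upgrade", "--install", "model-api", "./charts/model-api",
     "--namespace", "ml-serving", "--create-namespace",
     "--values", "values-prod.yaml"],
    check=True,
)
```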
Qualifications and Desired Experience:
- SRE experience building CI/CD pipelines, managing and owning code movement between environments, and creating access controls on objects for users based on environments and roles.
- 3+ years of relevant MLOps experience in Databricks, involving:
  - Git integration,
  - setting up access controls in Databricks and AWS S3,
  - setting up CI/CD pipelines for code/model deployment between environments,
  - managing, maintaining, and monitoring data pipelines and ML model performance.
- Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field.
- Experience with big data tools: Spark, Kafka, Spark & Kafka Streaming, Python, and Snowflake.
- Working knowledge of the Databricks APIs to automate processes.
- Experience with Databricks Unity Catalog.
- Experience with Databricks Feature Store and Delta Live Tables.
- Experience with Databricks MLflow, the model registry, and creating inference endpoints for real-time, batch, and streaming applications.
- Experience with AWS S3 for data storage, including creating S3 lifecycle rules for storage and retention.