Job Responsibilities -
• Architect and implement a scalable, offline Data Lake for structured, semi-structured, and unstructured data in an on-premises, air-gapped environment.
• Collaborate with Data Engineers, Factory IT, and Edge Device teams to enable seamless data ingestion and retrieval across the platform.
• Integrate with upstream systems such as MES, SCADA, and process tools to capture high-frequency manufacturing data efficiently.
• Monitor and maintain system health, including compute resources, storage arrays, disk I/O, memory usage, and network throughput.
• Optimize Data Lake performance via partitioning, deduplication, compression (Parquet/ORC), and effective indexing strategies (see the first sketch after this list).
• Select, integrate, and maintain tools such as Apache Hadoop, Spark, Hive, HBase, and custom ETL pipelines suitable for offline deployment.
• Build custom ETL workflows for bulk and incremental data ingestion using Python, Spark, and shell scripting (see the incremental-ingestion sketch after this list).
• Implement data governance policies covering access control, retention periods, and archival procedures with security and compliance in mind (a retention sketch follows this list).
• Establish and test backup, failover, and disaster recovery protocols specifically designed for offline environments.
• Document architecture designs, optimization routines, job schedules, and standard operating procedures (SOPs) for platform maintenance.
• Conduct root cause analysis for hardware failures, system outages, or data integrity issues.
• Drive system scalability planning for future multi-fab or multi-site expansions.
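
The optimization bullet above is easiest to picture in code. Below is a minimal PySpark sketch of a deduplicated, partitioned, compressed Parquet write; the table, column, and path names are illustrative assumptions, not part of this posting.

    # Minimal sketch only; source/target paths and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("optimize-layout").getOrCreate()

    raw = spark.read.json("hdfs:///datalake/raw/tool_metrics/")  # hypothetical raw landing zone

    (raw.dropDuplicates(["tool_id", "event_ts"])     # deduplicate on a natural key
        .repartition("event_date")                   # co-locate rows by partition key
        .write
        .mode("overwrite")
        .partitionBy("event_date", "tool_id")        # directory-level partition pruning
        .option("compression", "snappy")             # columnar compression inside Parquet
        .parquet("hdfs:///datalake/curated/tool_metrics/"))

Partitioning by date and tool lets queries prune scans to the relevant directories, while Snappy-compressed Parquet trades a little CPU for a much smaller on-disk footprint.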
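Likewise, a minimal sketch of the incremental-ingestion side of the ETL bullet, assuming a hypothetical file-based watermark checkpoint; every path and column named here is an assumption.

    # Watermark-based incremental load: pick up only rows newer than the last run.
    import json
    from pathlib import Path
    from pyspark.sql import SparkSession, functions as F

    CHECKPOINT = Path("/var/lib/etl/tool_metrics.ckpt")  # hypothetical checkpoint file

    spark = SparkSession.builder.appName("incremental-ingest").getOrCreate()

    last_ts = (json.loads(CHECKPOINT.read_text())["max_ts"]
               if CHECKPOINT.exists() else "1970-01-01 00:00:00")

    src = spark.read.parquet("hdfs:///staging/tool_metrics/")   # hypothetical staging area
    delta = src.where(F.col("event_ts") > F.lit(last_ts))       # rows newer than the watermark

    delta.write.mode("append").partitionBy("event_date").parquet(
        "hdfs:///datalake/curated/tool_metrics/")

    new_max = delta.agg(F.max("event_ts")).first()[0]           # advance the watermark
    if new_max is not None:
        CHECKPOINT.write_text(json.dumps({"max_ts": str(new_max)}))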
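And for the governance bullet, a small sketch of retention enforcement, assuming partitions are laid out as event_date=YYYY-MM-DD directories and managed through the standard hdfs dfs CLI; the paths and the 365-day window are assumptions.

    # Move expired date partitions to an archive area rather than deleting outright.
    import subprocess
    from datetime import date, timedelta

    RETENTION_DAYS = 365  # assumed retention window
    cutoff = date.today() - timedelta(days=RETENTION_DAYS)

    # List partition directories with the HDFS CLI (-C prints paths only).
    out = subprocess.run(
        ["hdfs", "dfs", "-ls", "-C", "/datalake/curated/tool_metrics/"],
        capture_output=True, text=True, check=True).stdout

    for path in out.splitlines():
        parts = path.rsplit("event_date=", 1)
        if len(parts) == 2 and date.fromisoformat(parts[1]) < cutoff:
            subprocess.run(
                ["hdfs", "dfs", "-mv", path, "/datalake/archive/tool_metrics/"],
                check=True)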
Essential Attributes (Tech Stack) -
• Hands-on experience designing and maintaining offline or air-gapped Data Lake environments.
• Deep understanding of Hadoop ecosystem tools: HDFS, Hive, MapReduce, HBase, YARN, ZooKeeper, and Spark.
• Expertise in custom ETL design and in large-scale batch and streaming data ingestion.
• Strong scripting and automation capabilities using Bash and Python.
• Familiarity with columnar file formats and their compression options (ORC, Parquet) and with ingestion frameworks (e.g., Apache Flume).
• Working knowledge of message queues such as Kafka or RabbitMQ, with a focus on integration logic (see the consumer sketch after this list).
• Proven experience in system performance tuning, storage efficiency, and resource optimization.
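
As a concrete illustration of the message-queue bullet above, here is a minimal consumer sketch using the kafka-python package (the package choice, topic, broker, and group names are all assumptions; the posting only names Kafka/RabbitMQ).

    # Consume records and acknowledge them only after they are persisted.
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "tool-metrics",                          # hypothetical topic
        bootstrap_servers=["broker1:9092"],      # on-prem brokers, reachable offline
        group_id="datalake-ingest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        enable_auto_commit=False,                # commit offsets manually
    )

    for msg in consumer:
        record = msg.value
        # ... append `record` to a staging/landing zone for the batch ETL above ...
        consumer.commit()                        # acknowledge once the record is persisted

Disabling auto-commit and committing after the write is what keeps the ingestion at-least-once: a crash before the commit replays the record instead of losing it.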
Qualifications -
• BE/ME in Computer Science, Machine Learning, Electronics Engineering, Applied Mathematics, or Statistics.
Desired Experience Level -
• 4 years of relevant experience after a Bachelor's degree
• 2 years of relevant experience after a Master's degree
• Experience in the semiconductor industry is a plus