Responsibilities:
Model Architecture Optimization: Optimize the latest LLM and GenAI model architectures for NPUs, which involves reimplementing the basic building blocks of these models for NPU execution.
Model Training and Fine-tuning: Fine-tune pre-trained models on specific tasks or datasets to improve performance. Implement state-of-the-art LLM training techniques such as Reinforcement Learning from Human Feedback (RLHF), ZeRO (Zero Redundancy Optimizer), Speculative Sampling, and related techniques.
Data Management: Handle large datasets effectively, ensuring data quality and integrity. Implement data cleaning and preprocessing pipelines. Hands-on experience with exploratory data analysis (EDA) is a plus.
Model Evaluation: Evaluate model performance using appropriate metrics. Understand the trade-offs between different evaluation metrics.
LLM Metrics: Sound understanding of common LLM evaluation metrics such as MMLU, ROUGE, BLEU, and perplexity.
Quantization: Understanding of quantization techniques such as AWQ (Activation-aware Weight Quantization) is a plus. Knowledge of Quantization-Aware Training (QAT) is also a plus.
Research and Development: Stay up to date with the latest research in NLP and LLMs. Implement state-of-the-art techniques and contribute to research efforts.
Infrastructure Development: Devise new optimization techniques to minimize ONNX model memory footprint and reduce export time.
Collaboration: Work closely with other teams to understand requirements and implement solutions.
Required Skills and Experience:
Deep Learning Frameworks: Hands-on experience with PyTorch at a granular level. Familiarity with tensor operations, automatic differentiation, and GPU acceleration in PyTorch.
NLP and LLMs: Strong understanding of Natural Language Processing (NLP) and experience working with LLMs.
Programming: Proficiency in Python and experience with software development best practices.
Data Handling: Experience working with large datasets. Familiarity with data version control tools is a plus.
Education: A degree in Computer Science, Machine Learning, AI, or a related field. An advanced degree is a plus.
Communication: Excellent written and verbal communication skills.
Work Experience: Open; 4–10 years of relevant experience.
Preferred Skills:
Optimization: Knowledge of optimization techniques for training large models.
Neural Architecture Search (NAS): Experience with NAS techniques for optimizing model architectures is a plus.
Hands-on experience with CUDA, cuDNN, and Triton-lang is a plus.
Minimum Qualifications:
Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 3+ years of Software Engineering or related work experience.
OR
Master's degree in Engineering, Information Systems, Computer Science, or related field and 2+ years of Software Engineering or related work experience.
OR
PhD in Engineering, Information Systems, Computer Science, or related field and 1+ year of Software Engineering or related work experience.