Design, develop, and optimize advanced AI models (e.g., CNN, NLP, VLM, LLM) for on-device deployment on platforms such as Intel, NVIDIA, Ambarella, and Hailo.
Perform model conversion and adaptation using frameworks like TensorFlow/TensorFlow Lite, PyTorch, ONNX, and TensorRT, ensuring compatibility and performance.
Develop and fine-tune custom kernels and operators with low-level programming (C/C++, CUDA) to maximize efficiency.
Collaborate on hardware-software co-design initiatives to integrate AI solutions seamlessly.
Optimize model performance, latency, and power efficiency on device.
Provide technical expertise, detailed documentation, and insights to support cross-functional teams.
Requirements
Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field.
3-5 years of hands-on experience in AI model development, optimization, and deployment.
Expertise in frameworks such as TensorFlow and PyTorch, and in conversion tools (ONNX, TensorRT, OpenVINO).
Strong understanding of AI model architectures (CNN, NLP, VLM, LLM) and proficiency in training and evaluation techniques.
Proven experience with hardware architectures and optimization techniques.
Creative, proactive, and independent in learning and problem-solving.
Outstanding analytical and problem-solving skills.