1. AI Model Optimization & Deployment:
- Analyze trained AI models (e.g., from TensorFlow or PyTorch) to identify performance bottlenecks and resource requirements (memory, compute).
- Convert models into inference-optimized formats such as ONNX, TensorRT engines, and TFLite.
- Apply advanced optimization techniques, including quantization (INT8/FP16) and pruning, to reduce model size and speed up inference while preserving accuracy.
- Profile and benchmark model performance on target hardware (e.g., NVIDIA Jetson, ARM CPUs) to verify that latency and throughput targets are met.
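
For illustration only, the INT8 quantization mentioned above can be sketched in a few lines of plain Python. This is a minimal sketch of symmetric per-tensor quantization, not the toolchain used in the role; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and production work would rely on the quantization tooling of the target framework.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor INT8 quantization, so that w is roughly q * scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical sample weights, chosen only to show the round trip.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The round-trip error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Comparing `max_error` against `scale / 2` is one simple way to check that accuracy loss stays within the expected bound before measuring end-to-end model accuracy.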
2. Application Software Development:
- Build high-performance applications and libraries in C++/Python to load, manage, and execute AI models in both Linux and Windows environments.
- Develop end-to-end data processing pipelines, from pre-processing input data (images, video) to post-processing model outputs.
- Create and maintain unit and integration tests to ensure the stability and accuracy of AI features.
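
As a rough sketch of what an end-to-end pipeline with a testable structure might look like, the example below chains pre-processing, inference, and post-processing. Everything here is an assumption made for illustration: `stub_model` is a hypothetical stand-in for a real inference call, and the labels and pixel values are invented.

```python
import math
from typing import Callable, List, Sequence, Tuple


def preprocess(pixels: Sequence[int]) -> List[float]:
    """Scale 8-bit pixel values to the [0, 1] range."""
    return [p / 255.0 for p in pixels]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax (subtracts the max before exponentiating)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits: Sequence[float], labels: Sequence[str]) -> Tuple[str, float]:
    """Turn raw logits into a (label, confidence) pair."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def run_pipeline(
    pixels: Sequence[int],
    model: Callable[[List[float]], List[float]],
    labels: Sequence[str],
) -> Tuple[str, float]:
    """Pre-process the input, run the model, and post-process its output."""
    return postprocess(model(preprocess(pixels)), labels)


def stub_model(x: List[float]) -> List[float]:
    """Hypothetical stand-in for a real model: scores mean brightness."""
    mean = sum(x) / len(x)
    return [1.0 - mean, mean]


label, confidence = run_pipeline([255, 255, 0, 255], stub_model, ["dark", "bright"])
```

Passing the model in as a callable keeps each stage independently testable: unit tests can cover `preprocess` and `postprocess` in isolation, while an integration test swaps `stub_model` for the real inference backend.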
3. Research & Improvement:
- Stay current with the latest technologies, algorithms, and tools in embedded AI and efficient machine learning.
- Research and experiment with new AI models to evaluate their feasibility and potential for product application.
- Participate in troubleshooting, debugging, and continuously improving deployed AI systems to enhance performance and reliability.
4. Collaboration & Technical Support:
- Collaborate closely with AI/ML scientists to understand model architectures and deployment requirements.
- Work with hardware engineering teams to leverage specialized on-chip acceleration features.
- Support other teams (such as QA and Product) in integrating and testing AI solutions.