TopDev
Data Engineer Internship (Specialized in Big Data Mining)
Hồ Chí Minh
Intern, Full-time · Not required
Application deadline: 03-09-2025

About Us:

We, ABC Studio (AI Bigdata Content Studio), are a Korea–Vietnam AI company specializing in AI optimization, generative AI, and big data engineering — with a focus on market intelligence, visual content, and Edge AI for smart devices.

Our visions are:

  1. To become the world's best market data company, covering global e-commerce, KOL, and SNS big data
  2. To become the leading innovative AI content engineering company for film (VFX), webtoons, and social marketing
  3. To develop a solid AI infrastructure layer for efficient deployment on cloud, mobile, and embedded devices

We are looking for enthusiastic and talented interns who are willing to join us on a long and meaningful journey.

Job Description:

  • Perform web data mining and big data extraction from a variety of online sources.
  • Clean, transform, and validate data for use in analytics and machine learning applications.
  • Design and manage data warehouses, data lakes, and cloud-based storage solutions.
  • Automate data pipelines and workflows using Python, PySpark, and tools like Apache Airflow.
1. Your role & responsibilities
  • Big Data Mining: Extract and mine large-scale datasets from major e-commerce platforms in Vietnam, China, Korea, Southeast Asia,…
  • Data Processing: Clean and transform raw data into structured formats suitable for analytics and machine learning.
  • Data Infrastructure: Build automated pipelines and cloud solutions (e.g., AWS, GCP).
  • Data Integration and Management: Develop data warehouses and data lakes for optimal data storage and retrieval.
  • LLM Data Pipeline: Develop pipelines for Large Language Models (LLMs), including RAG, LangChain, or LangGraph.
  • Visualization: Create visualizations and reports to communicate insights effectively.
2. Your skills & qualifications
  • Education: Final year student or fresh graduate in Computer Science, Data Science, Information Technology, or related fields
  • Technical Skills:
    • Proficient in Python, with experience using Pandas, PySpark, or similar libraries
    • Experience with web scraping tools (e.g., BeautifulSoup, Scrapy, Selenium)
    • Understanding of data architecture: warehouses, lakes, and cloud storage
    • Familiarity with ETL/ELT tools (e.g., Apache Airflow) and SQL
    • Basic knowledge of web structures (HTML/CSS/JS) is a plus
  • Soft Skills: Strong problem-solving skills, attention to detail, and a passion for data engineering.
  • Communication Skills: Good communication in Vietnamese and English.
3. Benefits

What You Will Learn

  • Web data mining and handling large-scale real-world datasets
  • Building automated data pipelines with Python, PySpark, and Airflow