TopDev

CÔNG TY TNHH ABC STUDIO VIỆT NAM

Data Engineer Internship (Specialized in Big Data Mining)

Hồ Chí Minh
Posted 1 week ago and Job expires in 1 hour ago

About Us:

ABC Studio (AI Bigdata Content Studio) is a Korea–Vietnam AI company specializing in AI optimization, generative AI, and big data engineering, with a focus on market intelligence, visual content, and Edge AI for smart devices.

Our vision is:

  1. To become the best global market data company, with big data covering global e-commerce, KOLs, and SNS.
  2. To become the leading innovative AI content engineering company in film (VFX), webtoons, and social marketing.
  3. To develop a solid AI infrastructure layer for efficient deployment on cloud, mobile, and embedded devices.

We are looking for enthusiastic and talented interns who are willing to join us on a long and meaningful journey.

Job Description:

  • Perform web data mining and big data extraction from a variety of online sources.
  • Clean, transform, and validate data for use in analytics and machine learning applications.
  • Design and manage data warehouses, data lakes, and cloud-based storage solutions.
  • Automate data pipelines and workflows using Python, PySpark, and tools like Apache Airflow.

Responsibilities

  • Big Data Mining: Extract and mine large-scale datasets from major e-commerce platforms in Vietnam, China, Korea, Southeast Asia,…
  • Data Processing: Clean and transform raw data into structured formats suitable for analytics and machine learning.
  • Data Infrastructure: Build automated pipelines and cloud solutions (e.g., AWS, GCP,…).
  • Data Integration and Management: Develop data warehouses and data lakes for optimal data storage and retrieval.
  • LLM Data Pipeline: Develop pipelines for Large Language Models (LLMs), including RAG, LangChain, or LangGraph.
  • Visualization: Create visualizations and reports to communicate insights effectively.

Requirements

  • Education: Final-year student or fresh graduate in Computer Science, Data Science, Information Technology, or related fields
  • Technical Skills:
    • Proficient in Python, with experience using Pandas, PySpark, or similar libraries
    • Experience with web scraping tools (e.g., BeautifulSoup, Scrapy, Selenium)
    • Understanding of data architecture: warehouses, lakes, and cloud storage
    • Familiarity with ETL/ELT tools (e.g., Apache Airflow) and SQL
    • Basic knowledge of web structures (HTML/CSS/JS) is a plus
  • Soft Skills: Strong problem-solving skills, attention to detail, and a passion for data engineering.
  • Communication Skills: Good communication skills in Vietnamese and English.

Benefits

What You Will Learn

  • Web data mining and handling large-scale real-world datasets
  • Building automated data pipelines with Python, PySpark, and Airflow

3 job openings

Industry

Information Technology, Entertainment/Game, MarTech, Artificial Intelligence

Company size

10-24 Employees

Nationality

Vietnam
