Develop and maintain efficient, scalable data pipelines that retrieve data from various sources to meet diverse business needs.
Utilize big data technologies such as Databricks, Hadoop, Kafka, Spark, and Airflow to design and implement big data management systems and related services (see the sketches after this list).
Implement best practices in data modeling, ETL processes, data integration, and data governance to ensure data accuracy and reliability.
Collaborate with cross-functional teams, including data scientists, data analysts, and other stakeholders, to understand their data requirements and deliver tailored technical solutions.
Troubleshoot and optimize existing data pipelines to enhance performance and reliability.
Stay informed about industry trends and emerging technologies to continually improve our data infrastructure.
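As an illustration of the kind of pipeline work involved, here is a minimal sketch of a daily batch ETL job in PySpark; the bucket paths, table layout, and column names are hypothetical, not part of any actual system described above.

    # Minimal PySpark batch ETL sketch: read raw events, clean them,
    # and write a partitioned table. Paths and columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily_events_etl").getOrCreate()

    # Extract: load one day of raw JSON events (hypothetical path).
    raw = spark.read.json("s3://example-bucket/raw/events/dt=2024-01-01/")

    # Transform: drop malformed rows, normalize the timestamp, dedupe.
    clean = (
        raw.dropna(subset=["event_id", "event_ts"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .dropDuplicates(["event_id"])
    )

    # Load: write as Parquet, partitioned by event date for efficient queries.
    (
        clean.withColumn("event_date", F.to_date("event_ts"))
             .write.mode("overwrite")
             .partitionBy("event_date")
             .parquet("s3://example-bucket/curated/events/")
    )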
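And since Airflow is named as an orchestration tool, here is a minimal sketch of a DAG that could schedule the batch job above; the DAG id, schedule, and callable are hypothetical, and a real deployment would submit the Spark job via spark-submit or a managed job runner.

    # Minimal Airflow DAG sketch: schedule the daily batch job above.
    # DAG id, schedule, and the callable are hypothetical.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_daily_etl():
        # Stand-in for triggering the Spark job (e.g. spark-submit
        # or a Databricks job run).
        print("triggering daily_events_etl")

    with DAG(
        dag_id="daily_events_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="run_etl", python_callable=run_daily_etl)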
Requirements
Bachelor's degree in Computer Science, Software Engineering, or a related field.
3+ years of experience as a Data Engineer.
Proficient in Python or Java.
Proficient in SQL and familiar with NoSQL databases.
Knowledge of storage systems, distributed data processing, and big data technologies (Databricks, Hadoop, Spark, Kafka, Airflow, etc.).
Understanding of how to build and optimize batch and stream data processing flows (see the streaming sketch after this list).
Possession of an international Data Engineering certification (AWS, Cloudera CCA/CCP, IBM Certified Data Engineer, Google Professional Data Engineer, Databricks Certified Data Engineer Professional) is an advantage.
Familiarity with version control tools such as Git.
Knowledge of finance and securities is also a plus.
Excellent communication and collaboration skills, with the ability to work effectively in a team environment.
Ownership and a team-first mindset, with a strong sense of responsibility.
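For the batch-and-stream point above, here is a minimal sketch of the streaming side using Spark Structured Streaming over Kafka; the broker address, topic name, and sink paths are hypothetical, and the Kafka source requires the spark-sql-kafka connector package.

    # Minimal Spark Structured Streaming sketch: consume a Kafka topic
    # and append parsed records to a Parquet sink. Broker, topic, and
    # paths are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("events_stream").getOrCreate()

    # Source: subscribe to a Kafka topic (needs the spark-sql-kafka package).
    stream = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker:9092")
             .option("subscribe", "events")
             .load()
    )

    # Kafka delivers bytes; cast the value to a string for downstream parsing.
    parsed = stream.select(F.col("value").cast("string").alias("payload"))

    # Sink: append to Parquet with checkpointing so output survives restarts.
    query = (
        parsed.writeStream.format("parquet")
              .option("path", "s3://example-bucket/stream/events/")
              .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
              .outputMode("append")
              .start()
    )
    query.awaitTermination()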