Data Engineer

Job Type: Permanent
Job Location: Makati
Work Setup: Flexible
Experience Level: Senior

Responsibilities

  • Design and implement data processing systems using distributed frameworks and platforms such as Hadoop, Spark, Snowflake, or Airflow.
  • Build data pipelines to ingest data from various sources such as databases, APIs, or streaming platforms.
  • Integrate and transform data to ensure its compatibility with the target data model or format.
  • Design and optimize data storage architectures, including data lakes, data warehouses, or distributed file systems.
  • Implement techniques like partitioning, compression, or indexing to optimize data storage and retrieval.
  • Identify and resolve bottlenecks, tune queries, and implement caching strategies to enhance data retrieval speed and overall system efficiency.
  • Design and implement data models that support efficient data storage, retrieval, and analysis.
  • Collaborate with data scientists and analysts to understand their requirements and provide them with well-structured and optimized data for analysis and modeling purposes.
  • Utilize frameworks like Hadoop or Spark for distributed computing tasks such as parallel data processing and running machine learning algorithms at scale.
  • Implement security measures to protect sensitive data and ensure compliance with data privacy regulations.
  • Establish data governance practices to maintain data integrity, quality, and consistency.
  • Monitor system performance, identify anomalies, and conduct root cause analysis to ensure smooth and uninterrupted data operations.
  • Communicate complex technical concepts to non-technical stakeholders in a clear and concise manner.
  • Stay updated with emerging technologies, tools, and techniques in the field of big data engineering.

Requirements:

  • Strong analytical thinking and problem-solving skills.
  • Strong communication skills, with the ability to translate technical details for business and non-technical stakeholders.
  • Extensive experience designing and building data pipelines (ETL/ELT) for large-scale datasets. Familiarity with tools like Databricks, Apache NiFi, Apache Airflow, or Informatica is advantageous.
  • Proficiency in programming languages such as Python, R, or Scala is essential.
  • In-depth knowledge and experience with distributed systems and technologies, including on-premises platforms and frameworks such as Apache Hadoop, Spark, and Hive. Familiarity with cloud platforms like AWS, Azure, or Google Cloud is highly desirable.
  • Solid understanding of data processing techniques such as batch processing, real-time streaming, and data integration. Experience with stream processing frameworks like Apache Kafka, Apache Flink, or Apache Storm is a plus.
  • Experience with Azure data services, particularly Azure Databricks and Azure Data Factory.
  • Experience with Git repository maintenance and DevOps concepts.
  • Familiarity with build, test, and deployment processes.
  • Additional certifications in big data technologies or cloud platforms are advantageous.

Apply for this position
