HiTakeJob

Big Data Engineer - Placerlabs

  • Company: Placerlabs
  • Location: Ramat Gan
  • Technologies: Apache Spark (mandatory); PySpark or Scala (mandatory); data engineering; cloud platforms (AWS, GCP, or Azure); SQL and data modeling; ETL tools and orchestration (such as Apache Airflow); programming languages (Python, Java, or Scala)

Responsibilities

  • Data Pipeline Architecture and Development: Design, build, and optimize robust and scalable data pipelines to process, transform, and integrate large volumes of data from various sources into our analytics platform.
  • Data Quality Assurance: Implement data validation, cleansing, and enrichment techniques to ensure high-quality and consistent data across the platform.
  • Performance Optimization: Identify performance bottlenecks and optimize data processing and storage mechanisms to enhance overall system performance and reduce latency.
  • Cloud Infrastructure: Work extensively with cloud-based technologies (GCP and AWS) to design and manage scalable data infrastructure.
  • Collaboration: Collaborate with cross-functional teams including Data Analysts, Data Scientists, Product Managers, and Software Engineers to understand requirements and deliver solutions that meet business needs.
  • Data Governance: Implement and enforce data governance practices, ensuring compliance with relevant regulations and best practices related to data privacy and security.
  • Monitoring and Maintenance: Monitor the health and performance of data pipelines, troubleshoot issues, and ensure high availability of data infrastructure.
  • Mentorship: Provide technical guidance and mentorship to junior data engineers, fostering a culture of learning and growth within the team.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 5+ years of professional experience in software development, with at least 3 years as a Data Engineer.
  • Spark expertise (mandatory): Strong proficiency in Apache Spark, including hands-on experience building data processing applications and pipelines using Spark's core libraries.
  • PySpark/Scala (mandatory): Proficiency in either PySpark (the Python API for Spark) or Scala for Spark development.
  • Data Engineering: Proven track record in designing and implementing ETL pipelines, data integration, and data transformation processes.
  • Cloud Platforms: Hands-on experience with cloud platforms such as AWS, GCP, or Azure.
  • SQL and Data Modeling: Solid understanding of SQL, relational databases, and data modeling.
  • Big Data Technologies: Familiarity with big data technologies beyond Spark, such as Hadoop ecosystem components, data storage formats (Parquet, Delta Lake), and distributed computing concepts.
  • Programming Languages: Proficiency in programming languages like Python, Java, or Scala.
  • ETL Tools and Orchestration: Familiarity with ETL tools and frameworks, such as Apache Airflow.
  • Problem-Solving: Strong analytical and problem-solving skills.
  • Collaboration and Communication: Effective communication skills and collaboration within cross-functional teams.
  • Geospatial Domain (preferred): Prior experience in the geospatial or location analytics domain is a plus.