HiTakeJob

Data Engineering Team Lead (Agentic Search) - Nebius

  • Company: Nebius
  • Location: Israel
  • Technologies: Python, SQL, Snowflake, BigQuery, Airflow, Spark, Kafka, GCP, AWS

Job description

Lead and architect Tavily's data platform, from real-time ingestion through data warehouse medallion layers to consumer-facing datasets and dashboards.

Responsibilities

  • Lead, hire, mentor, and grow a team of data engineers; set engineering standards for code quality, testing, documentation, and on-call.
  • Work closely with engineers across the company to make sure batch and streaming pipelines are built correctly.
  • Define and implement observability for the data platform: data quality checks, freshness monitors, lineage, schema evolution, and cost controls.
  • Partner with researchers, engineers, analysts, finance, and product managers to deliver trustworthy datasets for product and GTM analytics.
  • Define the objects, entities, and relationships that model Tavily's search domain (agent inputs, URLs, chunks, agent sessions, crawls, and the connections between them) and turn that mental model into a clean, queryable data model that the rest of the company can reason about.

Requirements

  • Data governance: the ability to ensure the highest standards of data quality, integrity, and security across all environments.
  • 5+ years of data engineering experience, with a focus on designing and implementing scalable, analytics-ready data models and cloud data warehouses (e.g., BigQuery, Snowflake).
  • Hands-on experience with Snowflake (or a comparable cloud data warehouse) and a strong grasp of data warehouse architecture, preferably the medallion schema.
  • Deep knowledge of databases (schema design, query optimization) and familiarity with NoSQL use cases.
  • Expertise in modern data orchestration and transformation frameworks (e.g., Airflow, dbt).
  • Solid understanding of cloud data services (e.g., AWS, GCP) and streaming platforms (e.g., Kafka, Pub/Sub).
  • Hands-on experience with the Spark / MapReduce paradigm and an understanding of when distributed processing is the right tool.
  • Fluency in Python and SQL for production data work.
  • Experience operating data systems in production: debugging them under pressure, recovering from data incidents, and understanding what it means to backfill a corrupted table.

Benefits

  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and work-life balance
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams