Data Engineering Team Lead (Agentic Search) - Nebius
- Company: Nebius
- Location: Israel
- Technologies: Python, SQL, Snowflake, BigQuery, Airflow, Spark, Kafka, GCP, AWS
Job description
Lead and architect Tavily's data platform - from real-time ingestion through data warehouse medallion layers to consumer-facing datasets and dashboards
Lead, hire, mentor, and grow a team of data engineers; set engineering standards for code quality, testing, documentation, and on-call
Work closely with engineers across the company to make sure batch and streaming pipelines are designed and built correctly
Define and implement observability for the data platform: data quality checks, freshness monitors, lineage, schema evolution, and cost controls
Partner with researchers, engineers, analysts, finance, and product managers to deliver trustworthy datasets for product and GTM analytics.
Define the objects, entities, and relationships that model Tavily's search domain - agent inputs, URLs, chunks, agent sessions, crawls, and the connections between them - and turn that mental model into a clean, queryable data model that the rest of the company can reason about.
Data Governance: Ensure the highest standards of data quality, integrity, and security across all environments.
5+ years of Data Engineering experience, with a focus on designing and implementing scalable, analytics-ready data models and cloud data warehouses (e.g., BigQuery, Snowflake).
Have hands-on experience with Snowflake (or a comparable cloud data warehouse) and a strong grasp of data warehouse architecture, preferably the medallion schema.
Deep knowledge of databases (schema design, query optimization) and familiarity with NoSQL use cases.
Expertise in modern data orchestration and transformation frameworks (e.g., Airflow, dbt).
Solid understanding of cloud data services (e.g., AWS, GCP) and streaming platforms (e.g., Kafka, Pub/Sub).
Have hands-on experience with the Spark / MapReduce paradigm and understand when distributed processing is the right tool
Are fluent in Python and SQL for production data work
Have operated data systems in production: debugged them under pressure, recovered from data incidents, and understand what it means to backfill a corrupted table.
Competitive compensation
Career growth and learning opportunities
Flexibility and work-life balance
Collaborative and innovative culture
Opportunity to work on impactful AI projects
International environment and talented teams