HiTakeJob

AI Security - AI Platform Team Lead - Cato Networks

  • Company: Cato Networks
  • Location: Tel Aviv District, Israel
  • Technologies: Go, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT

Responsibilities

  • Build and lead Cato’s AI Platform team: hiring, mentoring, architecture, technical direction, and execution.
  • Own the AI security runtime platform for high-throughput, low-latency inline security decisions across Cato’s global cloud and PoPs.
  • Design the orchestration layer for running GPU models, CPU heuristics, and security logic as one production engine.
  • Own production readiness: observability, SLOs, autoscaling, reliability, rollout, rollback, and operational health.
  • Own the model lifecycle platform: registry, versioning, deployment, monitoring, and safe production rollout.
  • Work closely with research and algorithm teams to productionize AI security models and algorithms at scale.
  • Define the long-term platform strategy for AI runtime and model serving at Cato.

Requirements

  • 3+ years of leadership experience as a team lead, tech lead, or engineering manager.
  • 3+ years of hands-on experience in AI inference, production ML infrastructure, model serving, or AI runtime platforms.
  • Strong experience with production inference technologies such as Triton, vLLM, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT, or similar.
  • 3+ years of experience with Go, or strong experience with a similar high-performance backend language such as C++, Rust, or Java.
  • Experience with performance optimization, scalability, observability, and SLO-driven production ownership.
  • Strong system design skills, especially around distributed systems, performance, reliability, and production infrastructure.

Advantages

  • Experience with GPU optimization, GPU scheduling, GPU resource efficiency, quantization, runtime acceleration, or large-scale model serving.