AI Security - AI Platform Engineer - Cato Networks
- Company: Cato Networks
- Location: Tel Aviv District, Israel
- Technologies: Go, CUDA, Kubernetes, Docker, PyTorch, Triton, vLLM, ONNX, TensorRT
Job Description
Responsibilities
Build Cato’s AI security runtime platform for high-throughput, low-latency production serving.
Develop infrastructure for model serving, multi-model orchestration, and inline decision flows.
Optimize inference performance: batching, caching, streaming, GPU utilization, memory usage, and runtime acceleration.
Build backend orchestration and performance-critical services in Go.
Support the model lifecycle: registry integration, packaging, versioning, deployment, monitoring, and operational health.
Work closely with research and algorithm teams to productionize AI security models and algorithms at scale.
Requirements
3+ years of hands-on experience in AI inference, production ML infrastructure, model serving, or MLOps.
Experience with production inference technologies such as Triton, vLLM, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT, or similar.
Strong understanding of low-latency, high-throughput production systems.
Experience with model lifecycle concepts: model registry, versioning, deployment, rollout, rollback, monitoring, and observability.
3+ years of experience with Go, or strong experience with a similar high-performance backend language such as C++, Rust, or Java.