ABOUT THE ROLE

We’re looking for a Machine Learning Engineer to own and push the limits of model inference performance at scale. You’ll work at the intersection of research and production, turning cutting-edge models into fast, reliable, and cost-efficient systems that serve real users.

This role is ideal for someone who enjoys deep technical work, profiling systems down to the kernel/GPU level, and translating research ideas into production-grade performance gains.

WHAT YOU’LL DO

- Optimize inference latency, throughput, and cost for large-scale ML models in production
- Profile and