Job Description
Zoox is building the world's most advanced self-driving hardware and software solution. The efficiency demands of such a system require expert fine-tuning of both the compute hardware architecture and the algorithms and middleware that run on it to achieve maximum throughput at optimal power levels.
The Software Performance team’s mission is to analyze and optimize performance, and to provide guidance to the software and hardware teams so that the system meets its required specifications.
As a GPU performance software engineer within the Software Performance team, you will instrument, monitor, analyze and optimize GPU-based algorithms that are performance-critical for our solution. The scope for GPU usage ranges from traditional computer vision and deep learning architectures to complex geometric reasoning and multi-agent decision making. Your work will strongly influence design decisions of future compute platforms and resource allocation.
In this role, you will:
- Build real-time instrumentation for performance monitoring (CPU, GPU, latency, memory) and develop offline benchmarking frameworks, tools, and scripts to evaluate & analyze performance at scale in CI/vehicle, and establish budgets for next-gen architectures.
- Analyze performance metrics to identify GPU hotspots and root causes, and propose and co-implement actionable solutions with component teams.
- Support teams in bringing serial algorithms to the GPU to maximize compute utilization and improve overall latency.
- Work as part of the Core team to design a middleware framework that promotes efficient, performant code development by default by maximizing CPU and GPU utilization.
Qualifications
- BS in computer science or related field and 3+ years of experience.
- Strong knowledge of CUDA as applied to recent GPU microarchitectures (e.g., Ampere, Blackwell) and experience debugging/optimizing GPU kernels using tools like Nsight.
- Strong knowledge of C++ and experience in large code bases, comfortable in Linux development environments.
- Experience in development, debugging, and profiling of complex multiprocess systems (e.g., robotic systems, game engines).
Bonus Qualifications
- Experience with GPU kernel development in a real-time environment, including PTX-level programming, CPU SIMD instructions (e.g., AVX intrinsics), and custom CUDA layers with frameworks like TensorRT & XLA.
- Hands-on work with ML model optimization (post-training quantization, layer pruning, etc.) or hand-tuning GPU kernels (in OpenGL, CUDA, ROCm or similar).
- Proficiency with SQL, Databricks, Looker, or other business intelligence tools.
There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position. Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
