Madrona v0.2: Introducing the Engine's High Throughput Batch Renderer

Posted on Sunday, October 13th 2024 by Luc Rosenzweig

Example batch renderer outputs from the high-geometry HSSD scenes.

“Pixels to actions” training is an exciting regime in which agents learn directly from rendered images of what they see. Given that Madrona is a high-throughput batch simulator, we were compelled to explore the most efficient architecture for a high-throughput batch renderer capable of rendering thousands of small images (64x64 to 256x256 in resolution) from the agents’ points of view. Such a renderer has different requirements than a typical video game renderer. For one, throughput is much more important than latency - it doesn’t matter if rendering a single 128x128 image takes 15ms so long as rendering 1024 images at once also takes 15ms. Furthermore, realism isn’t a hard requirement like it is in certain video games: learning experiments don’t require full path tracers and can make do with simple Phong lighting as seen here. There are two established algorithms for image synthesis: rasterization and ray tracing. We therefore explored both approaches with what we believe to be well-tuned implementations. Our findings show that the software ray tracer performs better than the rasterizer in many important situations, such as training on an H100, rendering high-geometry scenes, or supporting effects like shadows.
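To make the throughput-first framing concrete, here is a minimal, self-contained C++ sketch of how we think about measuring a batch renderer in this regime. The renderViews function below is a placeholder stub standing in for a real GPU batch renderer (it is not the Madrona API); the point is simply that the relevant metric is the time of one call that produces the entire batch of observations.

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

// Placeholder standing in for a real GPU batch renderer: one call is
// expected to rasterize or ray trace every view in the batch at once.
static std::vector<uint8_t> renderViews(uint32_t numViews,
                                        uint32_t width,
                                        uint32_t height)
{
    // Stub: just allocate the packed RGBA output a real renderer would fill.
    return std::vector<uint8_t>(size_t(numViews) * width * height * 4);
}

int main()
{
    constexpr uint32_t numViews = 1024; // one observation per agent
    constexpr uint32_t res = 128;       // 128x128 observations

    auto start = std::chrono::steady_clock::now();
    auto frames = renderViews(numViews, res, res);
    auto end = std::chrono::steady_clock::now();

    double batchSeconds = std::chrono::duration<double>(end - start).count();

    // Per-image latency is irrelevant; observations delivered per second
    // across the whole batch is the number that matters for training.
    std::printf("rendered %zu bytes, throughput: %.0f frames/sec\n",
                frames.size(), numViews / batchSeconds);
}

In the actual renderers, all views in a batch are rendered together on the GPU, which is what makes the throughput figures quoted in the paper below possible.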

For more information, see the following resources:

Renderer Technical Paper

Please refer to our technical paper for an in-depth description of Madrona's batch renderer.

Abstract

In this paper we study the problem of efficiently rendering images for embodied AI training workloads, where agent training involves rendering millions to billions of independent, low-resolution frames, often with simple lighting and shading, that serve as the agent's observations of the world. To enable high-throughput training from images, we design a flexible, batch-mode rendering interface that allows state-of-the-art GPU-accelerated batch world simulators to efficiently communicate with high-performance rendering backends. Using this interface we architect and compare two high-performance renderers: one based on the GPU hardware-accelerated graphics pipeline and a second based on a GPU software implementation of ray tracing. To evaluate these renderers and encourage further research by the graphics community in this area, we build a rendering benchmark for this under-explored regime. We find that the ray tracing renderer outperforms the rasterization-based solution across the benchmark on a datacenter-class GPU, while also performing competitively in geometrically complex environments on a high-end consumer GPU. When tasked to render large batches of independent 128x128 images, the ray tracer can exceed 100,000 frames per second per GPU for simple scenes, and exceed 10,000 frames per second per GPU on geometrically complex scenes from the HSSD dataset.

Citation

@inproceedings{rosenzweig24madronarenderer,
    title     = {High-Throughput Batch Rendering for Embodied AI},
    author    = {Luc Guy Rosenzweig and Brennan Shacklett and
                 Warren Xia and Kayvon Fatahalian},
    booktitle = {SIGGRAPH Asia 2024 Conference Papers},
    year      = {2024}
}

FAQ

How programmable is the ray tracer?

Currently, the output of the ray tracer is hardcoded to RGB and depth, with optional shadows, as you can see here. In the future, we plan to add a way for users to program their own "ray tracing shaders" within the Madrona framework. For now, users must hardcode their modifications at the location linked above.
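To illustrate what we mean, here is a simplified, purely illustrative C++ sketch of the kind of per-ray output write that is currently hardcoded; the types and names below are ours for illustration and are not taken from the Madrona source.

#include <cstdint>

// Hypothetical per-ray result and output buffers, for illustration only.
struct HitInfo {
    float rgb[3];   // shaded color at the hit point (e.g., simple Phong)
    float depth;    // distance from the camera to the hit point
    bool inShadow;  // result of an optional shadow ray
};

struct OutputBuffers {
    uint8_t *rgbOut; // packed RGB, 3 bytes per pixel
    float *depthOut; // one float per pixel
};

// Changing what the ray tracer outputs today means editing logic like this
// in place; a future "ray tracing shader" hook would let users supply it.
inline void writePixel(const HitInfo &hit, OutputBuffers out,
                       uint32_t pixelIdx, bool shadowsEnabled)
{
    // Darken shadowed pixels (illustrative factor, not Madrona's shading).
    float shade = (shadowsEnabled && hit.inShadow) ? 0.5f : 1.0f;
    for (int c = 0; c < 3; c++) {
        float v = hit.rgb[c] * shade * 255.f;
        v = v < 0.f ? 0.f : (v > 255.f ? 255.f : v);
        out.rgbOut[pixelIdx * 3 + c] = (uint8_t)v;
    }
    out.depthOut[pixelIdx] = hit.depth;
}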

I need a batch renderer for my non-Madrona application! How can I use your renderers for my use case?

See the Madrona Renderer repository in the additional resources linked above the paper. There, you can find an example of using the renderer on its own, which can be plugged into whatever application you like!
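As a rough sketch of the shape standalone usage takes, here is a C++ outline; every type and function name below is a placeholder we made up, so refer to the example in the repository for the actual interface.

#include <cstdint>
#include <vector>

struct Mesh { std::vector<float> vertices; std::vector<uint32_t> indices; };
struct InstanceTransform { float position[3], rotation[4], scale[3]; uint32_t meshIdx; };
struct CameraView { float position[3], rotation[4], fovY; };

struct StandaloneBatchRenderer {
    void loadAssets(const std::vector<Mesh> &) {}                // upload geometry once
    void setInstances(const std::vector<InstanceTransform> &) {} // per-frame object transforms
    void setViews(const std::vector<CameraView> &) {}            // one camera per agent view
    std::vector<uint8_t> render(uint32_t w, uint32_t h, uint32_t numViews)
    {
        // A real renderer would make one batched GPU submission for all views;
        // this stub just returns an appropriately sized RGBA buffer.
        return std::vector<uint8_t>(size_t(numViews) * w * h * 4);
    }
};

int main()
{
    StandaloneBatchRenderer renderer;
    renderer.loadAssets({});        // your application's meshes
    for (int step = 0; step < 100; step++) {
        renderer.setInstances({});  // your application's object transforms this frame
        renderer.setViews({});      // your application's cameras this frame
        auto rgba = renderer.render(128, 128, 1024); // all views rendered at once
        (void)rgba;                 // feed the observations into your own pipeline
    }
}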