The curious case of slow raytracing on a high end GPU

October 28, 2021 | November 14, 2021 | Kostas Anagnostou | 6 Comments

I’ve typically been doing my compute shader based raytracing experiments with my toy engine on my ancient laptop, which features an Intel HD4000 GPU. That GPU is mostly good for proving that the techniques work and for getting some pretty screenshots, but the performance is far from real-time: 1 ray-per-pixel GI for the following scene costs around 129 ms when rendering at 1280×720 (plugged in).


Book review: 3D Graphics Rendering Cookbook

October 17, 2021 | October 28, 2021 | Kostas Anagnostou

I was recently invited to review the new 3D Graphics Rendering Cookbook by Sergey Kosarevsky and Viktor Latypov. The main focus of the book is the implementation of a large variety of graphics techniques using both modern OpenGL and Vulkan, an interesting approach that can show the parallels between the two graphics APIs and act as a stepping stone for less experienced programmers towards a better understanding of Vulkan.


Raytracing tidbits

July 10, 2021 | October 17, 2021 | Kostas Anagnostou | 1 Comment

Over the past few months I did some smaller scale raytracing experiments, which I shared on Twitter but never documented properly. I am collecting them all in this post for ease of access.

On ray divergence

Raytracing has the potential to introduce large divergence in a wave. Imagine a thread whose shadow ray, shooting towards the light, hits a triangle and “stops” traversal, while the thread next to it misses and has to continue traversing the BVH. Even a single long ray/thread has the potential to hold up the rest of the threads (63 on GCN and 31 on NVIDIA/RDNA) and prevent the whole wave from retiring and freeing up resources.
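To put a rough number on this, here is a minimal sketch in Python. It assumes a simplified lockstep model in which a wave stays resident until its slowest thread finishes; the function name and the step counts are made up for illustration and are not measurements from the toy engine.

```python
def wave_utilization(steps_per_thread):
    """Fraction of lane-steps doing useful work in a lockstep wave.

    Assumption: the wave cannot retire until its slowest thread is done,
    so it holds all of its lanes for max(steps_per_thread) iterations.
    """
    if not steps_per_thread:
        raise ValueError("a wave needs at least one thread")
    wave_steps = max(steps_per_thread)
    if wave_steps == 0:
        return 1.0
    useful_steps = sum(steps_per_thread)
    return useful_steps / (wave_steps * len(steps_per_thread))


# Hypothetical 64-wide wave: 63 shadow rays hit something after 4 traversal
# steps, while a single ray misses and needs 40 steps.
steps = [4] * 63 + [40]
print(f"wave held for {max(steps)} steps, "
      f"lane utilization {wave_utilization(steps):.1%}")
```

In this toy model the one long ray keeps the whole wave alive ten times longer than the other 63 threads need, so most lane-steps are wasted.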


Experiments in Hybrid Raytraced Shadows

May 15, 2021 | July 10, 2021 | Kostas Anagnostou | 2 Comments

A few weeks ago I implemented a simple shadowmapping solution in the toy engine to try as a replacement for shadow rays during GI raytracing. Having the two solutions (shadowmapping and RT shadows) side by side, along with some offline discussions I had, made me start thinking about how it would be possible to combine the two into a hybrid raytraced shadow solution, like I did with hybrid raytraced reflections in the past. This blog post documents a few quick experiments I did to explore this idea a bit.


How to read shader assembly

April 18, 2021 | May 15, 2021 | Kostas Anagnostou | 1 Comment

When I started graphics programming, shading languages like HLSL and GLSL were not yet popular in game development and shaders were developed straight in assembly. When HLSL was introduced I remember us trying, for fun, to beat the compiler by producing shorter and more compact assembly code by hand, something that wasn’t that hard. Since then shader compiler technology has progressed immensely and nowadays, in most cases, it is pretty hard to produce better assembly code by hand (also the shaders have become so large and complicated that it is not cost effective any more anyway).


RDNA 2 hardware raytracing

December 27, 2020 | April 9, 2021 | Kostas Anagnostou | 1 Comment

Reading through the recently released RDNA 2 Instruction Set Architecture Reference Guide I came across some interesting information about raytracing support for the new GPU architecture. Disclaimer: the document is a little light on specifics, so some of the following is extrapolation and may not be accurate.

According to the released diagram of the new RDNA 2 Workgroup Processor (WGP), a new hardware unit, the Ray Accelerator, has been added to implement ray/box and ray/triangle intersection in hardware.


To z-prepass or not to z-prepass

December 21, 2020 | December 27, 2020 | Kostas Anagnostou

Inspired by an interesting discussion on Twitter about its use in games, I put together some thoughts on the z-prepass and its use in the rendering pipeline.

To begin with, what is a z-prepass (zed-prepass, as we call it in the UK): in its most basic form it is a rendering pass in which we render large, opaque meshes (a partial z-prepass) or all the opaque meshes (a full z-prepass) in the scene using a vertex shader only, with no pixel shaders or rendertargets bound, to populate the depth buffer (aka z-buffer).


What is shader occupancy and why do we care about it?

November 11, 2020 | September 12, 2021 | Kostas Anagnostou | 1 Comment

I had a good question through Twitter DMs about what occupancy is and why it is important for shader performance; I am expanding my answer into a quick blog post.

First, some context: while running a shader program, GPUs batch together 64 or 32 pixels or vertices (such a batch is called a wavefront on AMD or a warp on NVIDIA) and execute a single instruction on all of them in one go. Typically, instructions that fetch data from memory have a lot of latency (i.e. the time between issuing the instruction and getting the result back is long), due to having to reach out to caches and maybe RAM to fetch data. This latency has the potential to stall the GPU while waiting for the data.
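A toy way to picture why the number of resident waves matters for hiding that latency is sketched below in Python. It rests on a deliberately simplified assumption: every wave alternates a fixed burst of ALU work with a fixed-latency memory fetch, and the GPU can switch to another resident wave for free. The function name and the cycle counts are hypothetical, not figures from any real GPU.

```python
def alu_busy_fraction(resident_waves, alu_cycles, memory_latency):
    """Estimate how busy the ALU stays in a simple latency-hiding model.

    Assumption: each wave does `alu_cycles` of work, then waits
    `memory_latency` cycles for data; while it waits, other resident
    waves can use the ALU at no switching cost.
    """
    if resident_waves < 1 or alu_cycles <= 0 or memory_latency < 0:
        raise ValueError("invalid model parameters")
    return min(1.0, resident_waves * alu_cycles / (alu_cycles + memory_latency))


# Hypothetical numbers: 20 cycles of ALU work per 380-cycle memory fetch.
for waves in (1, 4, 10, 20):
    busy = alu_busy_fraction(waves, alu_cycles=20, memory_latency=380)
    print(f"{waves:2d} resident waves -> ALU busy {busy:.0%}")
```

Under these assumptions a single wave leaves the ALU idle most of the time, and adding resident waves fills the gap until the ALU saturates.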


Adding support for two-level acceleration for raytracing

November 1, 2020 | November 11, 2020 | Kostas Anagnostou | 3 Comments

In my (compute shader) raytracing experiments so far I’ve been using a bounding volume hierarchy (BVH) of the whole scene to accelerate ray/box and ray/tri intersections. This is straightforward and easy to use and also allows for pre-baking of the scene BVH to avoid calculating it at load time.

This approach has at least 3 shortcomings though: first, as the (monolithic) BVH requires knowledge of the whole scene on bake, it makes it hard to update the scene while the camera moves around or to add/remove models to/from the scene for gameplay reasons. Second, since the BVH stores bounding boxes/tris in world space, it makes it hard to raytrace animating models (without rebaking the BVH every frame, something very expensive). Last, the monolithic BVH stores every instance of the same model/mesh repeatedly, without being able to reuse it, potentially wasting large amounts of memory.
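The general idea behind a two-level structure can be sketched in Python as follows. This is an illustrative simplification, not the toy engine's implementation: each instance only carries a translation and a uniform scale (no rotation), the per-mesh bottom level is reduced to a single local-space bounding box, and all names are hypothetical. The point is that the mesh data is stored once in local space and each instance only transforms the ray.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    mesh_id: int        # index of the shared, local-space mesh data
    translation: tuple  # world-space position of this instance
    scale: float = 1.0  # uniform scale; rotation omitted for brevity


def world_ray_to_local(origin, direction, inst):
    """Map a world-space ray into the instance's local space."""
    local_origin = tuple((o - t) / inst.scale
                         for o, t in zip(origin, inst.translation))
    local_direction = tuple(d / inst.scale for d in direction)
    return local_origin, local_direction


def hit_aabb(origin, direction, box_min, box_max):
    """Slab test: does the ray (t >= 0) hit the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return False  # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def trace(origin, direction, instances, mesh_bounds):
    """Return indices of instances whose local-space bounds the ray hits."""
    hits = []
    for index, inst in enumerate(instances):
        local_o, local_d = world_ray_to_local(origin, direction, inst)
        box_min, box_max = mesh_bounds[inst.mesh_id]
        if hit_aabb(local_o, local_d, box_min, box_max):
            hits.append(index)
    return hits


# One shared mesh (a cube in local space), referenced by two instances.
mesh_bounds = {0: ((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))}
instances = [Instance(0, (10.0, 0.0, 0.0)),
             Instance(0, (0.0, 0.0, -20.0), scale=2.0)]
print(trace((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), instances, mesh_bounds))
```

Moving or animating an instance in this scheme only changes its transform, and a real implementation would traverse a per-mesh BVH where this sketch tests a single box.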


Using Embree generated BVH trees for GPU raytracing

July 21, 2020 | November 1, 2020 | Kostas Anagnostou | 1 Comment

Intel released its Embree collection of raytracing kernels, with source, some time ago and I recently had the opportunity to try and compare the included BVH generation library against my own implementation in terms of BVH tree quality. The quality of a scene’s BVH is critical for quick traversal during raytracing and typically a number of techniques, such as the Surface Area Heuristic I am currently using, are applied during the tree generation to improve it.
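For readers unfamiliar with the Surface Area Heuristic, the Python sketch below shows the commonly used form of its cost function for evaluating one candidate split. It is not taken from Embree or from my own builder; the cost constants, function names and example boxes are assumptions chosen for illustration.

```python
def surface_area(box):
    """Surface area of an axis-aligned box given as (min_xyz, max_xyz)."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(parent, left, n_left, right, n_right,
             traversal_cost=1.0, intersect_cost=1.0):
    """Estimated cost of splitting `parent` into `left` and `right`.

    Each child's primitive count is weighted by the chance that a ray
    hitting the parent also hits that child, approximated by the ratio
    of their surface areas. The two cost constants are assumed values.
    """
    parent_area = surface_area(parent)
    p_left = surface_area(left) / parent_area
    p_right = surface_area(right) / parent_area
    return traversal_cost + intersect_cost * (p_left * n_left + p_right * n_right)


# Hypothetical node with 8 triangles, 6 of them clustered near x = 0.
parent = ((0.0, 0.0, 0.0), (4.0, 2.0, 2.0))
tight = sah_cost(parent,
                 ((0.0, 0.0, 0.0), (1.0, 2.0, 2.0)), 6,
                 ((1.0, 0.0, 0.0), (4.0, 2.0, 2.0)), 2)
middle = sah_cost(parent,
                  ((0.0, 0.0, 0.0), (2.0, 2.0, 2.0)), 7,
                  ((2.0, 0.0, 0.0), (4.0, 2.0, 2.0)), 1)
print(f"split at x=1: {tight:.2f}, split at x=2: {middle:.2f}")
```

A builder would evaluate this cost for several candidate splits per node and keep the cheapest one; here the split that isolates the dense cluster in a small box scores lower than the spatial midpoint.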
