GPU Programming

High-performance rendering and simulation using CUDA and modern GPU APIs.

glRemix

glRemix Dec 2025

  • Created a DirectX 12 platform that remasters OpenGL 1.0 apps with real-time path tracing without modifying their source code.
  • Intercepted OpenGL calls with a shim layer and forwarded them to the DX12 renderer over interprocess communication.
  • Implemented features such as asset replacement during runtime, texture and environment mapping, and shadow maps.
CUDA Path Tracer

CUDA GPU Path Tracer Oct 2025

  • Built a GPU-accelerated path tracer using CUDA and C++, supporting multiple material shaders and antialiasing.
  • Integrated a bounding volume hierarchy to accelerate rendering of meshes loaded from .OBJ files, yielding a 9x speedup for large meshes.
  • Added texture mapping for more complex renders, and stream compaction of terminated paths for faster ones.
WebGPU Gaussian Splats

WebGPU Gaussian Splats Nov 2025

  • Built a real-time Gaussian Splat renderer with the WebGPU API for realistic, fast, interactive 3D scenes.
  • Transformed point cloud data into blended ellipsoids with calculated position, color, opacity, and size attributes.
  • Set up pipelines for GPU parallelism, including a compute shader for covariance computation, frustum culling, and depth sorting.
WebGPU rendering demo

WebGPU Forward+ and Clustered Deferred Shaders Oct 2025

  • Implemented Forward+ and Clustered Deferred shading pipelines with the WebGPU API to optimize the processing of several dynamic lights.
  • Integrated geometry-buffer optimizations and clustering for light culling, resulting in an average 25x speedup in lighting performance.
  • View a live demo here!
CUDA Boids simulation

CUDA Boids Flocking Simulator Sep 2025

  • Developed a real-time, optimized flocking simulation on the GPU using CUDA, based on the Reynolds Boids algorithm.
  • Implemented spatial-partitioning algorithms with coherent memory access, reducing the neighbor search from quadratic to linear time.
  • Used CUDA event timers and NVIDIA Nsight tools to profile and evaluate the performance of the flocking simulation.
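
To illustrate the spatial-partitioning idea, here is a minimal CPU-side C++ sketch of one common approach, a uniform grid. It is an assumption that the project used this particular scheme, and `GridParams` and `cellIndex` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical grid description: a cube that starts at minBound on every
// axis and is split into resolution^3 cells of side length cellWidth.
struct GridParams {
    double minBound;
    double cellWidth;
    int resolution;
};

// Maps a position to the flat index of the grid cell that contains it.
int cellIndex(double x, double y, double z, const GridParams& grid) {
    int ix = static_cast<int>(std::floor((x - grid.minBound) / grid.cellWidth));
    int iy = static_cast<int>(std::floor((y - grid.minBound) / grid.cellWidth));
    int iz = static_cast<int>(std::floor((z - grid.minBound) / grid.cellWidth));

    // The position must lie inside the grid.
    assert(ix >= 0 && ix < grid.resolution);
    assert(iy >= 0 && iy < grid.resolution);
    assert(iz >= 0 && iz < grid.resolution);

    // Flatten (ix, iy, iz) with x varying fastest.
    return ix + iy * grid.resolution + iz * grid.resolution * grid.resolution;
}
```

Sorting boids by such a cell index would place boids from the same cell next to each other in memory, so each boid only needs to inspect the cells around it rather than every other boid.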
Stream compaction visualization

GPU Stream Compaction Sep 2025

  • Implemented stream compaction on the GPU using CUDA, based on the NVIDIA GPU Gems 3 book.
  • Designed a work-efficient scan for stream compaction, using up-sweep and down-sweep phases to reduce redundant parallel work and yielding a 2.5x speedup over the naive scan.
  • Used NVIDIA Nsight Systems and Nsight Compute to analyze memory and kernel performance.
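
As a rough illustration of the up-sweep and down-sweep phases, here is a minimal sequential C++ sketch of a work-efficient exclusive scan and a compaction built on it. It is a CPU reference under assumptions, not the project's CUDA code: each inner loop stands in for one parallel pass, and the names `exclusiveScan` and `compactNonZero` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum using the up-sweep / down-sweep structure.
std::vector<int> exclusiveScan(const std::vector<int>& input) {
    // Pad to the next power of two so the implicit tree is balanced.
    std::size_t n = 1;
    while (n < input.size()) n <<= 1;
    std::vector<int> data(n, 0);
    for (std::size_t i = 0; i < input.size(); ++i) data[i] = input[i];

    // Up-sweep (reduce): accumulate partial sums toward the root.
    for (std::size_t stride = 1; stride < n; stride <<= 1) {
        for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            data[i] += data[i - stride];
        }
    }

    // Down-sweep: clear the root, then push prefixes back down the tree.
    data[n - 1] = 0;
    for (std::size_t stride = n / 2; stride >= 1; stride >>= 1) {
        for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            int left = data[i - stride];
            data[i - stride] = data[i];
            data[i] += left;
        }
    }

    data.resize(input.size());
    return data;
}

// Keeps the non-zero elements of input, preserving their order.
std::vector<int> compactNonZero(const std::vector<int>& input) {
    if (input.empty()) return {};

    // 1 marks an element to keep, 0 an element to drop.
    std::vector<int> flags(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) flags[i] = (input[i] != 0) ? 1 : 0;

    // The scan of the flags gives each kept element its output position.
    std::vector<int> indices = exclusiveScan(flags);
    assert(indices.size() == input.size());

    std::size_t count = static_cast<std::size_t>(indices.back() + flags.back());
    std::vector<int> output(count);
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (flags[i]) output[indices[i]] = input[i];  // scatter
    }
    return output;
}
```

For example, `compactNonZero({3, 0, 1, 0, 0, 7})` returns `{3, 1, 7}`.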

Computer Graphics Introduction

CPU-based rendering, simulation, and OpenGL graphics pipelines.

CPU Path Tracer render

CPU Path Tracer Mar 2024

  • Used a bounding volume hierarchy to accelerate ray intersections with triangles and spheres.
  • Rendered images with global illumination (direct and indirect lighting), estimated using Monte Carlo sampling.
Cloth simulation

Cloth Simulator Apr 2024

  • Used a mass-spring system to simulate cloth and its collisions with other objects.
  • Created Blinn-Phong, bump, displacement, texture, and other shaders for the cloth using the OpenGL API.
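
The core of a mass-spring system can be sketched as a Hooke's-law force between two point masses. This is a minimal C++ illustration under assumptions, not the simulator's code: `Vec3` and `springForce` are hypothetical names, and damping and integration are left out.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    double x;
    double y;
    double z;
};

// Force exerted on point a by a spring connecting a and b.
// The force pulls a toward b when the spring is stretched past its rest
// length and pushes a away from b when the spring is compressed.
Vec3 springForce(const Vec3& a, const Vec3& b, double restLength, double stiffness) {
    Vec3 d{b.x - a.x, b.y - a.y, b.z - a.z};
    double length = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    assert(length > 0.0);  // the two masses must not coincide

    // Hooke's law: magnitude is proportional to the extension.
    double magnitude = stiffness * (length - restLength);
    return {magnitude * d.x / length,
            magnitude * d.y / length,
            magnitude * d.z / length};
}
```

Summing this force over every spring attached to a mass, then adding gravity and collision responses, gives the net force that a time-stepping scheme would integrate.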
Mesh editing project

Mesh Edit Mar 2024

  • Built Bezier curves with algorithms such as de Casteljau's.
  • Implemented Loop subdivision for mesh upsampling, adding subdivisions to smooth the mesh.
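
For reference, de Casteljau's algorithm evaluates a Bezier curve by repeatedly interpolating between neighboring control points. Below is a minimal C++ sketch for 2D curves; `Vec2`, `lerpPoint`, and `evaluateBezier` are hypothetical names for illustration and are not taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 {
    double x;
    double y;
};

// Linear interpolation between two points.
Vec2 lerpPoint(const Vec2& a, const Vec2& b, double t) {
    return {(1.0 - t) * a.x + t * b.x, (1.0 - t) * a.y + t * b.y};
}

// Evaluates the Bezier curve defined by the control points at parameter t.
// The control points are taken by value so they can be overwritten in place.
Vec2 evaluateBezier(std::vector<Vec2> points, double t) {
    assert(!points.empty());
    // Each level replaces n points with n - 1 interpolated points,
    // until a single point on the curve remains.
    for (std::size_t level = points.size() - 1; level > 0; --level) {
        for (std::size_t i = 0; i < level; ++i) {
            points[i] = lerpPoint(points[i], points[i + 1], t);
        }
    }
    return points[0];
}
```

With control points (0, 0), (1, 2), and (2, 0), evaluating at t = 0.5 gives the point (1, 1).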
Rasterizer output

Rasterizer Feb 2024

  • Incorporated supersampling to reduce aliasing along jagged pixel edges.
  • Gained an understanding of barycentric coordinates by using them to render a color wheel.
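
Barycentric coordinates express a point as a weighted mix of a triangle's three vertices, and the same weights can blend per-vertex colors. The following C++ sketch computes them from signed areas; `Vec2`, `Bary`, `edgeFunction`, and `barycentric` are hypothetical names for illustration, not the rasterizer's actual code.

```cpp
#include <cassert>

struct Vec2 {
    double x;
    double y;
};

// Weights of vertices a, b, and c respectively; they always sum to 1.
struct Bary {
    double alpha;
    double beta;
    double gamma;
};

// Twice the signed area of the triangle (a, b, p).
double edgeFunction(const Vec2& a, const Vec2& b, const Vec2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Barycentric coordinates of p with respect to triangle (a, b, c).
Bary barycentric(const Vec2& a, const Vec2& b, const Vec2& c, const Vec2& p) {
    double area = edgeFunction(a, b, c);
    assert(area != 0.0);  // the triangle must not be degenerate

    // Each weight is the area of the sub-triangle opposite its vertex,
    // divided by the area of the whole triangle.
    double alpha = edgeFunction(b, c, p) / area;
    double beta = edgeFunction(c, a, p) / area;
    return {alpha, beta, 1.0 - alpha - beta};
}
```

A point lies inside the triangle when all three weights are non-negative, and an interpolated color is then `alpha * colorA + beta * colorB + gamma * colorC`.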

Animation

3D animated short films created using Autodesk Maya.