Wednesday, November 21, 2012

Semi-real-time ray tracing?

(Disclaimer: I know nothing about ray tracing (yet) nor have I any experience with writing 3D renderers)

Regarding my microcosm game idea where each viewpoint is rendered on the fly, in semi real time, here's a brain dump on the subject of ray tracing.  I have been reading a lot on the subject of "interactive ray tracing" during my Thanksgiving vacation, mostly the papers by Ingo Wald, who apparently has done a lot of research into the subject.  His 2004 PhD thesis, Realtime Ray Tracing and Interactive Global Illumination, suggests that multiple frames per second can be achieved by:
  • Tracing multiple "coherent" ray "packets" at once. (similar directions)
  • SIMD instructions (SSE) to process multiple ray intersections at once
  • Awareness of processor cache and predictable memory access (keep related data close together and minimize random memory access)
  • Axis-aligned BSP trees (with SIMD traversal and early termination)
  • Optimal BSP construction using the surface area heuristic (SAH)
  • "Instant Global Illumination" (IGI) using "virtual point lights" (VPLs) placed in a preprocessing step.
  • Optimal placement of the VPLs using quasi-Monte Carlo methods (importance sampling, I think).
  • Shadow rays are traced to these VPLs from the first bounce (screen rays) and averaged.
  • Instead of sampling all VPLs at each hit, sample a small subset but change the subset for every pixel (interleaving)
  • Blur the VPL samples together on regions of the image which are continuous (flat surfaces) to remove artifacts (he calls it discontinuity buffering)
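To make the VPL ideas above concrete, here's a scalar sketch of interleaved sampling (the scene, VPL placement, and all names are hypothetical; a real tracer would trace actual shadow rays, in SIMD packets):

```python
# Hypothetical scene: virtual point lights (VPLs) placed in a preprocessing
# step, each with a position and an RGB intensity.
VPLS = [
    ((0.0, 5.0, 0.0), (1.0, 1.0, 1.0)),
    ((3.0, 4.0, -2.0), (0.8, 0.6, 0.4)),
    ((-2.0, 6.0, 1.0), (0.3, 0.3, 0.9)),
    ((1.0, 3.0, 4.0), (0.5, 0.9, 0.5)),
]

SUBSET_SIZE = 2  # sample only a few VPLs per pixel...

def shade_pixel(hit_point, pixel_index, occluded=lambda a, b: False):
    """Average the contribution of a small, per-pixel-rotated subset of VPLs.

    `occluded` stands in for a real shadow-ray trace; here it never blocks.
    """
    # ...but rotate which subset by pixel index (interleaved sampling),
    # so neighboring pixels sample different VPLs.
    start = (pixel_index * SUBSET_SIZE) % len(VPLS)
    subset = [VPLS[(start + i) % len(VPLS)] for i in range(SUBSET_SIZE)]

    color = [0.0, 0.0, 0.0]
    for light_pos, intensity in subset:
        if occluded(hit_point, light_pos):
            continue  # shadow ray blocked: this VPL contributes nothing
        d2 = sum((l - h) ** 2 for l, h in zip(light_pos, hit_point))
        falloff = 1.0 / (1.0 + d2)  # simple distance falloff
        for c in range(3):
            color[c] += intensity[c] * falloff
    # Average over the sampled subset; the discontinuity buffer would then
    # blur these estimates across flat regions to hide the interleaving.
    return tuple(c / SUBSET_SIZE for c in color)
```

Neighboring pixels get different VPL subsets, which is exactly the structured noise the discontinuity buffering step is meant to smooth away.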
My thoughts:

  • His engine gets maybe 10 FPS, spread across a small cluster of computers (though that was back in 2004).
  • I have only one CPU and perhaps GPU to work with.
  • I need only, say, 4 FPS (250ms from the time the user clicks to the time they see the next viewpoint).
  • For lower-end machines we can turn off features like anti-aliasing and reduce the number of VPLs sampled, etc.
Perhaps faster tracing can be achieved by:
Instead of handling millions of triangles (his tracer does only triangle meshes), support only mathematical geometric primitives: spheres, boxes, cylinders, cones, pyramids, etc.  And of course, support CSG operations between them.  Yes, this would limit the amount of detail that could be modeled, but perhaps that's acceptable.  It would create a sort of visual style to the worlds - designers would focus less on highly detailed modeling and more on artistic simplicity.  To allow more organic, free-form shapes, perhaps subdivision surfaces could be added.  They could be subdivided on the fly as Ingo Wald outlined in Packet-based Ray Tracing of Catmull-Clark Subdivision Surfaces.
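A sphere as an implicit primitive illustrates the appeal: the intersection is an exact closed-form solution of a quadratic, no triangle approximation at all.  A minimal scalar sketch (function name is mine):

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Return the nearest positive hit distance t along the ray, or None.

    The sphere is an exact implicit surface: |o + t*d - c|^2 = r^2
    is a quadratic in t, so the hit is solved directly.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # ray misses the sphere entirely
    sq = math.sqrt(disc)
    for t in ((-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)):
        if t > 1e-6:  # nearest intersection in front of the origin
            return t
    return None
```

CSG would build on this by keeping both roots as an entry/exit interval per primitive and combining intervals (union, intersection, difference) before picking the nearest surviving t.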

The theoretical advantage I see in this approach is:
  1. Less RAM needed to hold the scene, which means fewer trips to RAM and more of the scene fits in the CPU cache.
  2. Smoother curves (eg, a perfect sphere instead of triangles approximating a sphere)
  3. Fewer primitives in the acceleration structure (BSP or BVH) means faster traversal?
  4. More opportunities for bulk processing (keep reading)


A bulk processing algorithm:
Instead of executing the entire rendering equation for each pixel, what if we executed it piece by piece, in stages?  That should increase cache performance and SIMD opportunities, shouldn't it?  An algorithm like this:
  1. First shoot the primary rays from the camera (in packets) into the scene.  Stop traversal as soon as the bounding box for a complex primitive (like a subdivision surface) is hit.  Perhaps even stop on the bounding box for any primitive.
  2. For each struck bounding box (sorted by type of primitive) calculate the exact hit point of all the rays which hit the bounding box.  Repeat steps 1 & 2 to handle rays which miss the actual object.
  3. For each struck primitive (sorted by material type), compute reflection or refraction ray. (Are reflection rays only necessary for certain material types?)
  4. At the same time compute rays to VPLs and other light sources.
  5. Supersampling for anti-aliasing fits in here somewhere
  6. Trace all these rays in bulk.
  7. Compute material colors where rays hit (in bulk, sorted by material)
  8. Multiply material colors with lighting values in bulk and save pixel values
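In outline, the staged pipeline above might look like this (trace and shade_bulk are stand-ins for real kernels, and a real implementation would stream flat arrays, not Python lists):

```python
from collections import defaultdict

def render_in_stages(primary_rays, trace, shade_bulk):
    """Process the rendering equation stage by stage instead of per pixel.

    `trace(ray)` returns (material, hit_info) or None; `shade_bulk` shades a
    whole batch of hits sharing one material.  Bucketing hits by material
    keeps each shading pass in one tight loop instead of branching per pixel.
    """
    # Stage 1: trace all primary rays first (in a real tracer, in packets).
    hits = [(i, trace(ray)) for i, ray in enumerate(primary_rays)]

    # Stage 2: bucket surviving hits by material so each bucket can be
    # shaded in bulk (cache-friendly, SIMD/GPU-friendly).
    buckets = defaultdict(list)
    for pixel, result in hits:
        if result is not None:
            material, hit_info = result
            buckets[material].append((pixel, hit_info))

    # Stage 3: shade each material bucket in bulk, scatter back to pixels.
    framebuffer = [(0.0, 0.0, 0.0)] * len(primary_rays)
    for material, batch in buckets.items():
        for pixel, color in shade_bulk(material, batch):
            framebuffer[pixel] = color
    return framebuffer
```

Lighting rays, reflection rays, and supersampling would each be further stages producing their own bulk batches before the final multiply-and-store step.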
Such an algorithm will have a high memory footprint, but I believe it will largely stream into and out of memory (predictable access) instead of accessing it randomly.  Certain portions would also lend themselves well to bulk processing by the GPU.  Perhaps many parts could be written in OpenCL?

The big unknown to me right now is how occlusion is handled, since the acceleration structure does not store fine triangle meshes but instead entire implicit surfaces.  In other words, how do you find the first object the ray hits when many bounding boxes overlap?
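My current understanding is that overlapping boxes hurt speed, not correctness: you intersect the exact surface of every candidate whose box the ray enters and keep the smallest t.  A sketch (box_hit and surface_hit are made-up names):

```python
def nearest_hit(ray, candidates):
    """Find the first surface a ray hits when bounding boxes overlap.

    Entering a box first does not mean hitting its surface first, so every
    candidate whose box the ray enters must be tested exactly, keeping the
    smallest hit distance t.
    """
    best_t, best_prim = float("inf"), None
    for prim in candidates:  # in a real tracer: candidates from BVH traversal
        if not prim.box_hit(ray):
            continue  # cheap reject on the bounding box
        t = prim.surface_hit(ray)  # exact implicit-surface intersection
        if t is not None and t < best_t:
            best_t, best_prim = t, prim
    return best_prim, best_t
```

A BVH traversal with early termination would improve on this by visiting boxes roughly front to back and stopping once the best confirmed t is closer than the next box's entry distance.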

Secondly, how many reflection bounces are needed?  My reading says that IGI handles color bleeding so it seems that you'd only need reflection bounces on shiny or glossy surfaces.

Changing subject back to primitive types: I really couldn't live without rock.  Perhaps some sort of grid-based rock as described by the Arches terrain framework?

Thursday, October 11, 2012

one overriding idea

An insightful observation from Linus Torvalds:
Btw, it's not just microkernels. Any time you have "one overriding idea", and push your idea as a superior ideology, you're going to be wrong. Microkernels had one such ideology, there have been others. It's all BS. The fact is, reality is complicated, and not amenable to the "one large idea" model of problem solving. The only way that problems get solved in real life is with a lot of hard work on getting the details right. Not by some over-arching ideology that somehow magically makes things work. 

Wednesday, May 30, 2012

Microcosm (game idea)

This week's subject of idle thought is what sort of game I would enjoy making and actually have a chance at finishing given my ever shrinking personal time.

I think I've mentioned before that I'm a huge fan of Riven and Myst III.  It's just so awesome to explore those beautiful strange worlds!  That's what I want to create.  That's something I would enjoy making.

So here's my half baked idea for a game.  Let's call it Microcosm:
You are in a strange land.  It is beautiful.  You never imagined that such a place could exist.  It's slightly odd because certain laws of nature don't seem to apply here.  You feel compelled to explore.  There are, of course, minor obstacles to your exploration.  Hidden passages.  Steep cliffs.  Strange symbols.  But these obstacles are minor and only add to the joy of exploration.

You are also searching for something.  The microcosm.  Perhaps it's a gemstone.  Or maybe it's behind a brick in the wall.  Maybe it's that puddle of water.  There have been clues along the way, and there will be no doubt when you find it.  The microcosm glows with an eerie light.  As you look more closely you see great detail and hear faint sounds.  The moment you touch it you begin to shrink.  The image in the microcosm gets larger and clearer and the land you were in quickly fades away.  You are now in a new place.  How strange and beautiful it is.  You must explore!
Hmm.  That's sounding very similar to the Myst series.  Oh well.  At least it will appeal to Myst fans like myself.  I have a couple of tweaks to the genre which would hopefully set it apart:
  1. Community created: The game has no end.  The community would create more worlds to explore.  Perhaps there would be some sort of hub-world that was littered with microcosms.
  2. Procedural: The worlds will primarily be procedurally generated.  No intricate buildings or contraptions.  This is where I'm supposed to save all sorts of time.
Why procedurally generated worlds?
  • Hand modeling environments is a huge time sink.  Modeling is fun, yes, but only single people without children have time for that.
  • Higher levels of detail and larger worlds.
  • More interesting.  By using various algorithms (Perlin noise, fractals, organic growth, recursion, etc.), we will be creating strange environments that even we, the creators, are eager to explore.
Note to self: NVIDIA has a really interesting article demonstrating procedurally generated landscapes using the "marching cubes" algorithm.
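As a toy example of the noise-based approach, here's a fractal heightfield built from hash-based value noise (all constants and names are my own; Perlin gradient noise would look similar but smoother):

```python
import math

def value_noise(x, y, seed=0):
    """Smooth pseudo-random noise by interpolating hashed lattice values."""
    def hash2(ix, iy):
        h = (ix * 374761393 + iy * 668265263 + seed * 144665) & 0xFFFFFFFF
        h = (h ^ (h >> 13)) * 1274126177 & 0xFFFFFFFF
        return (h & 0xFFFF) / 0xFFFF  # value in [0, 1]

    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    # Smoothstep fade so the surface is continuous across lattice lines.
    sx, sy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)
    top = hash2(ix, iy) * (1 - sx) + hash2(ix + 1, iy) * sx
    bot = hash2(ix, iy + 1) * (1 - sx) + hash2(ix + 1, iy + 1) * sx
    return top * (1 - sy) + bot * sy

def terrain_height(x, y, octaves=4):
    """Fractal sum of noise octaves: large hills plus finer detail."""
    h, amp, freq = 0.0, 1.0, 1.0
    for _ in range(octaves):
        h += amp * value_noise(x * freq, y * freq)
        amp *= 0.5   # each octave contributes half the height...
        freq *= 2.0  # ...at twice the spatial frequency
    return h
```

Sampling terrain_height over a grid gives a deterministic landscape from nothing but a seed, which is the whole appeal: the world definition is a script, not a mesh.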

Yet we still need a way to place objects within the world.  Ladders, puzzles, etc.  There will need to be some sort of world-creator tool that allows you to interactively fly through the world and add stuff.

The World Editor
I envision an OpenGL program which allows you to craft your procedural generation script and see the results of your changes in realtime.  When you tweak the script a new mesh is generated at the current viewpoint.  As you wait longer the surrounding mesh patches are generated outward (with lesser polygon counts).  You can then fly around and place viewpoints and props.

The Game Engine
Such an expansive game that focuses on exploration really deserves a real-time 3D renderer.  At the same time it needs to be rendered in great detail, because beauty is one of the primary motivations of a game like this (for me at least).  Sadly, I don't think current graphics cards are ready for this level of detail.  Also, I want the game to be enjoyable by everyone, not only those with "gaming rigs" which heat an entire house.  The Myst series had very modest system requirements, which surely contributed greatly to its popularity.

Perhaps a compromise can be made between real-time and pre-rendered graphics.  What if you render each viewpoint on demand?  This has the advantage of giving the player freedom to explore and interact with the world in unanticipated ways.  The question is, can we render viewpoints fast enough?  How fast is fast enough?  250ms?  Certainly we'd need to leverage the graphics card to some degree.

Let's call this post-rendering.  (Perhaps there's an official term for it.)  Here's what a high quality post-renderer would need to be capable of (in order of priority):
  1. Very high polygon counts - many millions on screen at a time.
  2. Ambient occlusion - it adds a huge level of realism
  3. Bump mapping - Also very important for realism
  4. Shadows
  5. Reflections
Can we do this at 4 frames per second on low-end graphics hardware?  I'm doubtful but willing to give it a shot.  I'll use my budget laptop to define "low-end".  It has Intel GMA-4500 graphics with a dual-core "Pentium" (not Core2).  I'll start by experimenting with the OpenGL Shading Language (GLSL).

If it works I'll be liberated from a lot of the pains of pre-rendered graphics such as:
  1. If you decide to tweak your models you must re-render every viewpoint they were visible in.
  2. When the player moves an object in the world you need a second rendered image for every viewpoint in which the object is visible.  Did you ever wonder why all the doors in Myst games close automatically behind you?  It's because they didn't want to render two images everywhere the door was visible.
  3. Interactivity with the world must be kept to a minimum because of the above problem.  Consider a control panel with a switch and a lever.  That's 4 combinations of positions.  If the control panel is visible from 3 different viewpoints we need to pre-render 12 images!  Yuck.
  4. Deciding which viewpoints to render is time consuming.  Care must be taken so that moveable objects are occluded from view.
  5. Video overlays are difficult when using panoramic viewpoints.
I'm eager to start experimenting with procedural landscape creation and post-rendered graphics, however, my first order of business is to make an attempt at the standard-GUI concept I outlined previously.

P.S. perhaps a GPU based voxel engine would give the level of detail I want?  Needs research.

Wednesday, February 29, 2012

Joining the Journeyman Project Tribute Team

I was contacted recently by Andy Curry, the project leader for a remake of The Journeyman Project game.  Their needs seem to align with the goals of my Aware Engine so I have agreed to collaborate closely with them.

Andy is working on some cubic panorama renders that we can use to create a first tech demo.  For now I'll be working on:
  • Supporting cubic panoramas in Aware Engine.  This should be easy as my envmap_viewer can already do it.  Perhaps the challenge will be minimizing the appearance of hairline seams where the cube faces meet.  Some video cards seem to be more prone to this anomaly than others.
  • Getting Aware Engine compiled for the big three platforms (Win, Linux, Mac).  This is always a big time sink.  Eventually I'm thinking I'll even bundle JamVM to insulate us from Java installation and misconfiguration woes.  But not yet.

Sabbatical Complete

I have completed my one year break from the computer.  Although technically it won't be a year until March 14th I figure I'm close enough (for reasons you'll see in an upcoming post).

My primary objective was to be fully present with my new son.  To this end the sabbatical was a resounding success!  I truly feel I know my son more deeply, and am closer to him than if I had been in absent-minded-professor mode.

I must confess however that I was not as strict as I originally envisioned.  As the months went by I convinced myself that general purpose internet surfing was okay.  I did, however, maintain the willpower to stay away from "neat" computery topics and instead just do Facebook, email, and read about rock climbing.  It's not too regrettable because I limited this computer usage to when he was asleep or otherwise occupied.

Honestly, parenthood's demands on both time and energy have left little time for anything else in my life.  Parenthood changes your life!  As it well should.  I think these early years of parenthood will continue to be a sort of forced break.  Only now, I'll allow myself to spend my precious little free time on computer projects.