Tuesday, May 31, 2011

Passing input events

Today I finished up the basic windowing code.  The window procedure is contained in a window manager, which also handles registering the class and creating the window.

I also did some research into the definition of RAII to try to work out what I should be calling a particular technique I have been using.  I'm still not entirely sure if the technique I am using is valid RAII, but I did find a rather good explanation of what RAII is.  You can find the article over at The RAII Programming Idiom.

I am currently working out how I am going to pass user input events to the game state.  I think I might revisit the idea of a "systems layer", since I could easily give the input gathering systems a pointer to the current game state pointer.  Without using a systems layer, I would need to pass the pointer to the initialization state, which feels a bit awkward.  I am planning on using a pointer to a pointer because the game state could change, thus changing the pointer.

Friday, May 27, 2011

Researching airplane controls and state switching

Most of today's work was spent in researching airplane controls for the new test project.  I don't have much to say on the matter, but here are a couple resources which may be of interest:
Cockpit of the Cessna 172 Skyhawk
aircraft cockpit controls and instruments

As far as code goes, I finished the basic state switching code.  The current state is switched by returning an input object for the new state from the current state's update function.  This input object is used as the input for the new state, allowing the previous state to pass on information such as the character the player selected, open network connections, Direct3D device, etc.

I also finished the window class wrapper. I am actually wondering why I have the registration code inside of the wrapper's constructor rather than having external code directly alter the wrapped class atom...  I suspect this is RAII, but I do not know for sure.  I plan on researching RAII again tomorrow (I keep forgetting the exact definition of that term).

Thursday, May 26, 2011

COM interface pointer wrapping

Today I created the repository for the new test project and started on the framework.

One of the things I want to try out in this project is a better wrapper around COM interfaces.  Previously, each interface had its own wrapper class which handled creation of the associated object, as well as destruction.  This was a bit difficult to work with though, and caused the code to look like "m_dev->m_dev->Present()".

The new wrapper I am going to try out will be generic by way of a template.  One really nice thing about this new design is that I will be able to pass the wrapper directly to a function which is expecting a normal COM interface pointer.  This will be done by overloading the address-of operator to return the address of the interface pointer being wrapped.  With the old design I had to make such calls inside of the wrapper, which led to some really rigid code.

Another thing I will be looking into is hooking the window to process input messages separate from the windowing code.  I have never liked having the WM_KEYDOWN message processing inside of the window manager.

Wednesday, May 25, 2011

Character jitter and a new test project

The character was jittering back and forth when it walked, which I originally assumed was due to the large number of lights I was testing with.  However, when I removed all of the test lights, the character was still jittering.  After a bit of debugging, it looks like the bug was caused by the camera update being performed before the character's position was integrated.  This caused the camera to point at where the character used to be, not where it currently was.

The artist GreyKnight has been creating models and textures for this test project I am working on.  However, this is taking time away from creating artwork for the SHMUP so I have decided to switch test projects a bit sooner than I was originally expecting.  This new test project will use art assets which the SHMUP will eventually need (please note that the test project will be different from the SHMUP though).

Tuesday, May 24, 2011

Cube shadow maps and a bug in PIX

Originally I was rendering the scene with a given shadow map to a temporary texture.  This texture was then rendered additively (alpha blending ONE, ONE, ADD) to the default render target in order to accumulate the various light and shadow map effects.  The reason I was rendering to a temporary texture rather than the default render target directly was that the additive rendering would cause part of the floor to show through your character (and other similar problems).

Promit in the #gamedev IRC channel pointed out that you can solve this by filling the default depth buffer in a pre-pass (this pass would also include adding any ambient lighting) and then disabling writes to the default depth buffer and setting the z compare function to "equal".  This causes only the pixels on the top to be rendered additively in subsequent passes, so you don't get any bleedthrough where there shouldn't be any.

After making that change, I moved on to implementing omni shadow mapping using a cube texture rather than 6 explicit textures.  For details on implementing this, I recommend these two articles: Cubic Shadow Mapping in Direct3D and GPU Gems - Chapter 12. Omnidirectional Shadow Mapping.

One really annoying "bug" I encountered turned out to be the fault of PIX.  While debugging the cube texture version of the shadow mapping code, I discovered (through PIX) that the depth buffer being generated in the pre-pass was empty.  I eventually narrowed the problem down to a single render state: D3DRS_COLORWRITEENABLE.  When I set this render state to 0, writing to the depth buffer appeared to be turned off as well.  Enabling all color writing also appeared to enable depth buffer writing.  This made no sense at all, and I was starting to suspect that it was an obscure problem with the ATI Radeon X1000 series (again).  Fortunately, I discovered the real problem quite by accident; apparently when you disable color writing in your application, it is also disabled in PIX.  So when PIX goes to display the depth buffer, nothing is actually displayed.  The depth buffer can be seen perfectly fine after enabling color writing, but this is hardly an ideal situation.  Oh well, considering the software is free I can't really complain too much.

After using a cubic shadow map and a depth pre-pass I was able to get 6 omni lights with shadow mapping to be displayed on the screen at about 28 FPS.  Most situations won't call for many omni lights, so I can probably work with this restriction.

Monday, May 23, 2011

Shadow Mapping Works! Devil's details and screenshot included

Shadow mapping works!  It is rather inefficient at the moment, but it does indeed work.

I think it's about time for another issue of "The Devil's Details", this time regarding shadow mapping.

Pixel shader doesn't know its own depth
You do not have access to the depth of the pixel you are working with in the pixel shader.  Because of this, when creating the shadow map you need to calculate the depth in the vertex shader and interpolate that value across the primitive.

No built-in method to go back to the default render target
After changing the current render target to create the shadow map, you want to switch back to the default render target so that you can render the scene from the camera's view.  Unfortunately, you cannot do this simply by passing NULL to SetRenderTarget().  To get around this problem, keep a reference to the default render target by calling GetRenderTarget() during initialization (or at least before the first call to SetRenderTarget()) and pass it to SetRenderTarget() when you want to switch back to it.

D3DCOLOR_ARGB accepts inputs in the range 0-255, not 0-1
This can catch you off guard (especially late at night after working all day trying to get shadow maps to work), because colors in the shaders are 0-1, not 0-255.  Just remember this and you won't be confused trying to figure out why the shadow map isn't being cleared to white.

Transform from projection space to texture space
When calculating the location of the pixel in the shadow map for finding out if it is in shadow, you need to perform an additional transformation after the normal light view-projection transformation.  After the projection transformation (and homogeneous divide) you are in a coordinate space ranging in x and y from -1 to 1.  Texture coordinates have their origin at the top-left and range from 0 to 1, and the y axis points downwards (I am using a left-handed coordinate system where y points up).  Because of this, you need to negate the y coordinate, then translate the positions by (1, 1) and divide by 2.  You can do this before the homogeneous divide, so these transformations can be rolled into the light's view-projection matrix (just make sure you only do this for the lighting stage, not the shadow map creation stage).

Do not perform the homogeneous divide in the vertex shader!
I'm still not sure why this is the case (probably a problem with interpolation), but when you are doing the lighting stage (after the shadow map has been created), performing the homogeneous divide on the pixel's shadow map space coordinates must be done in the pixel shader.  I wasn't doing this originally, and the character's shadow on the ground was extremely thin and angled sharply away from the character.

Render targets and depth buffers are best in pairs
At least when the render targets are of differing size.  One of the most annoying problems I encountered while writing the shadow mapping system was that the entire shadow map was not being rendered to.  The shadow map I was using was 1024x1024 and the window was 800x600, which means that the default depth buffer was also 800x600.  Apparently if a pixel in the render target is outside the bounds of the depth buffer, the z-test automatically fails.  To fix this problem, a 1024x1024 depth buffer needs to be created along with the shadow map, and set when the shadow map is set.  There doesn't appear to be a built-in method to set the default depth buffer after you have set a new one, so in a similar fashion to a previous devil's detail you will need to keep track of a reference to the default depth buffer yourself.

Friday, May 20, 2011

Learning shadow mapping

I read through the section in RTR on shadow mapping and I think I know enough about the basics to get some kind of shadow system into the test project.  Shadow mapping involves rendering the scene's depth information from the light's perspective.  You then render the scene from the camera's perspective, detecting if a pixel is in shadow by transforming the pixel's world coordinates (passed to the pixel shader as an interpolated value from the vertex shader) to the light's clip space, projecting onto the depth map and comparing the depth of the pixel with the depth value in the depth map.  If the pixel you are drawing has a greater depth value (assuming z is directed into the screen) than the value in the depth map, it is in shadow since it is behind whatever caused that value to be placed in the depth map.

Ambient light is probably achieved by making a normal render pass, only including ambient lighting, before you do the light shadowing passes.  Now that I think about it, multiple lights are likely handled this same way; perform a set of 2 passes per light, one for creating the depth map and one for filling the color buffer.  Each lighting pass would add to the color in the color buffer.

I should note that the above is a mixture of things I have read and "filling in the blanks".  I don't have any actual experience with shadow mapping yet, so I may find that it's not quite what I am expecting.  A perfect example of this is with creating the depth map.  I assumed that an actual depth buffer was used, which would be set as a texture for the camera rendering pass.  However, it appears as though Direct3D9 does not let you access the depth buffer like that.  Instead, you need to create a texture and set it as the render target, and output depth values rather than colors from the pixel shader.

These are some resources which I found while researching shadow mapping.
Soft-Edged Shadows
Cubic Shadow Mapping in Direct3D

Thursday, May 19, 2011

Lighting works! Screenshot included

Lighting works!  Per-pixel lighting was easier than I thought; it is literally just moving the lighting calculation from the vertex shader into the pixel shader.

If you want to learn how to do lighting, I recommend Real-Time Rendering, specifically chapter 5.  From what I have seen, the book goes into great detail on the subject of lighting in later chapters as well.

I currently have experience with two types of lights: directional and omni.  A directional light is basically a light which shines everywhere in one direction.  Directional lights are commonly used for the sun, and other far-off light sources.  An omni light is a type of point-light; it shines from a specific point in all directions.  These would be used for things like torches, fireballs, sci-fi weapon projectiles, lamps without a shade, etc.  There is another type of point-light called a "spotlight", which only shines within a specific set of directions (a cone).

Lighting requires two properties from a material: specular and diffuse.  Currently, the specular property is constant over the entire model.  Textures supply the diffuse property on a pixel-by-pixel basis, allowing the artist to finely tune how the model will look.  A texture is pretty much a diffuse map.

Today I learned about a couple things you want to watch out for when setting shader constant registers.  The first is that if you are passing in an array all in one call to Set*ShaderConstant*(), each array element must be packed to align with a 4-float/int boundary.  I tried passing in an array where each element was a float3 and I was initially rather confused as to why it wouldn't work.

The second thing I would like to point out is that there is apparently a separate set of registers for integers.  If you need to pass an integer to your shader, make sure you are using Set*ShaderConstantI() and not Set*ShaderConstantF().  This one took me a while to figure out, which was a little embarrassing.

I am now researching shadows.  Unfortunately, it appears as though creating shadows from omni lights is a bit of a black art...  It involves creating 6 shadow maps and cube mapping with them or something.  I should have more details tomorrow.

Wednesday, May 18, 2011

Multiple animations

Multiple animations are now supported.  Previously, only the animation stored in the model file was playable, so the character had no idle animation (or pose).  Because there was no pose for being "idle", I jury-rigged the animation code to zero the bone position/orientation if the player didn't want the character moving.  This resulted in the character springing back to its "arms-wide and legs straight" modeling "pose", which looked really awkward as you moved around the room.

Each animation is stored in its own model file, so I created another array of models but called them "animations" instead.  I also added an animation ID to each instance.  Changing the current animation is as simple as changing the animation ID and resetting the animation time to zero.

Now that I have animations more defined, I created separate "skinned" and "not skinned" vertex shaders.  I'm not really looking for a speed increase with this change, just trying to separate things more so that I can understand them easier.

I also started researching lighting.  Lighting is somewhat confusing, but one interesting thing I learned was the definition of diffuse and specular colors.  Diffuse color is from light which has been partially absorbed by the material, bounced around a bit, and finally shot back out.  Specular color is from light which has bounced right off the material.  I'm a bit confused as to why each material can have its own specular color, since it should be the color of the light which bounced off, but I'm sure I'll figure that out eventually.

Tuesday, May 17, 2011

Current progress screenshot

Today I added in support for materials, multiple models, and instances.  A picture is supposedly worth a thousand words, and I seem to be at a loss for anything really interesting to say today, so here is a screenshot of the current "demo" I am working on:

Monday, May 16, 2011

Storing animations, converting RH orientations to LH, and world-space cursor

Now that I have a single animation working, it's time to start worrying about how to use multiple animations.  The Milkshape3D file format only supports a single animation, so I did a bit of investigating to find out how games normally store multiple animations.  From what I could tell, a lot of people just roll their own format.  I'm not interested in doing that at the moment, so I started discussing solutions with our artist GreyKnight.  What we eventually decided on was to have a base file which stores the geometry and base pose.  Each animation would be stored as its own "model", containing only keyframe information.  This system is a bit rigid in that adding/removing a bone would be really painful (no pun intended), but it should work well enough until I find a better alternative.

I finally discovered why my method for converting orientations from RH to LH seemed wrong.  Turns out it was!  My method (found through trial-and-error since what should have been the correct method wasn't working) was to invert the z-axis orientations when positions along the z-axis are inverted for coordinate conversion between RH and LH.  If you draw this out, it doesn't make much sense; the z-axis orientations are correct between RH and LH (when the z-axis is inverted to do the conversion), but the x and y axis orientations are in reverse.  In other words, the one orientation which should have been constant between the two coordinate systems was the only one which needed to be changed.  Well, while poking around in the math code I noticed something rather odd: the rotation matrix creation function negates the angle before it is used to create the matrix.  So all x and y axis orientations were being inverted automagically, and the only way to prevent this from happening to z-axis orientations is to invert them before they are sent to the creation function.  I suspect this was a left-over from when I was creating the camera code, since the angles needed to be negated.

Today I managed to make it so that when you click on the screen, the game can detect where you clicked in the world.  This involved a ray intersection test with the world geometry, something I had never done before.  In order to define the ray, you need a point in world space and a normal vector pointing in the direction of the ray.  The easiest of these to calculate is the point, which should be right on the "lens" of the camera where your mouse cursor was when you clicked.  To get the point in 3D world space, you need to transform the screen-space coordinates of the cursor into clip space, which means reversing the viewport mapping and homogeneous divide.  This step can be a bit tricky, since you need to work in the world-space z-coordinate.  Once safely in clip space, you can transform into world space using the inverse of your view and projection matrices.  Calculating the normal was a bit more tricky.  My original idea was to transform the vector (0, 0, 1) from screen space into world space, but that doesn't work because it needs to be attached to a point other than the origin in order to be affected by the pyramid shape of the view frustum, which leads us naturally to the solution I ended up using: transform a second point from screen to world space, one which is at ray_point+(0, 0, 1).  Once in world space, you can get the ray normal by normalizing the vector from the ray point to the second point.

Saturday, May 14, 2011

Optional Weekends

Weekends are rather unpredictable for me.  Sometimes I will be able to get a normal work day's worth of coding done, and sometimes I won't be anywhere near the computer.  Today is a perfect example of the little-to-no work day: I just made the animated model move along the z-axis to see how it would look.

Because of this, I have decided to make weekend blog posts optional.  There is no point in posting a single-sentence entry.  This of course won't affect work-week posts, so you can expect to get at least 5 blog posts from me per week.

Friday, May 13, 2011

Skinning works! The devil's details.

I finally have multiple-bone skinning working!  I think I can summarize the last few days as "Wow, that was a pain".  On the surface, a hierarchical animation system sounds easy: "Just go down the tree and accumulate matrices as you go", but as is usually the case, the devil is indeed in the details.

I would love to make this blog post a full tutorial on skinning, but unfortunately I still don't completely understand it myself.  I slapped in some foreign code for nasty jobs so that I could just focus on getting the bones to animate and stay attached to the skin.  Also, because the code was written by accumulating many tests, it could be improved quite a bit.

Here are some of what I will affectionately call "the devil's details".  Hopefully these will help someone who is learning to write a skinning system.  Keep in mind that some of these might only apply to the Milkshape3D model format.

Not all vertices will be attached to a bone
While this might not be the case in most animations, you should make sure that your system won't explode when this situation is encountered.  One of the animations I tested my code on was a brick hurtling towards a brick wall, bouncing off the wall and falling onto the ground.  The wall was part of the animation, but was not attached to a bone.

Keyframes are defined per-bone, not globally
Rather than having a list of keyframes which each have a list of the bones that they affect, each bone has a list of keyframes.

Calculating keyframe indices; not as easy as you might think
The bones are animated by interpolating between two keyframes, so you need to figure out which two keyframes you are in between.  You also need to gracefully handle the situation where the current animation time is before the first keyframe for the specified joint, or past the last keyframe.  I am currently finding the second keyframe first by looking through the list of keyframes going forward in time and checking for the first keyframe which comes after the current animation time.  I then set the first keyframe to be the one previous to the second which I just found.  Both keyframes are initialized to the last keyframe, which is the value they take on if no other keyframe is found in the loop.  If the second keyframe is found to be index 0, the first keyframe will become negative, in which case the first keyframe is set to be equal to the second.  There are two cases where both keyframes will have the same index; if the time is before the first keyframe, or if the time is after the last keyframe.  If both keyframes have the same index, interpolation is not required.

Keyframe positions and orientations are relative to the bone's original position and orientation
This was one of the big ones for me.  I did not figure this out until well into development, so I had to basically rewrite the skinning system.  This means that in order to transform a point to where it should be for the given keyframe, the point must be in the same coordinate space as the bone.

Convert orientations from RH to LH, not just positions!
This was the big one.  Because of the previous "detail", things can become really out of sorts if you get rotation wrong.  If a parent joint is in a certain orientation, all movement is going to depend on that orientation being correct.  If the orientation is wrong, the vertices attached to the joint will be in the wrong spot.  This problem gets worse and worse the further down the hierarchy you go.  As far as converting the orientations goes, I'm not entirely sure how it works.  What ended up working for me was to invert z-axis positions and z-axis rotations, but I have read elsewhere that only the x and y axis rotations should be inverted when z-axis positions are inverted.  I'll have to do more research into this, but it is working fine for now!

Bone matrix, what is it?
This had me confused for a while.  Basically, the bone matrix needs to transform a vertex in model space (which is attached to the bone) into the coordinate space of the bone (without any animations), apply the keyframe transformation, then transform back into model space.

Wednesday, May 11, 2011

Radeon X1000 actually NOT SM3.0 compliant?!

Today I worked on skinning.  Skinning is the term used for 3D bone-based animation, and comes from the idea of putting "skin" on the bones.

My initial plan for skinning was to store all of the joint matrices for each model instance in one big texture.  The advantage to this is that I can fit a lot more instances and bones into one draw call than I could if I was using constant registers; the constant registers would run out rather quickly.

The skinning system required me to create a texture manually for the first time, and I learned something rather interesting:  You cannot directly access a texture which is in video memory.  In order to put data into a VRAM texture, you need to create a second texture in system memory, fill it with the data, then update the texture in VRAM using IDirect3DDevice9::UpdateTexture().  Another interesting thing to note is that apparently a texture is a collection of surfaces.  I originally thought it was just one big 3D array of data (for the different miplevels), but this does make more sense since each mipmap will be a different size.

Accessing the texture from the vertex shader was a bit more problematic than I was originally expecting.  The first issue I encountered was that you cannot use tex*D() to access the texture (at least not in D3D9, it might work in D3D10+).  The reason for this is that in a pixel shader, the miplevel is calculated based on the texture coordinate partial derivatives.  Basically, if the texture coordinates changed a lot since the previous pixel, a lower miplevel is chosen in order to keep the pixel:texel ratio as close to 1:1 as possible.  The derivatives can't be calculated in the vertex shader though, because we are working with vertices and not pixels.  So to get around this, you need to ask for a specific miplevel by using tex*Dlod().

That should have been the end of the VTF (vertex texture fetch) problems, but then Present() started failing...  I couldn't find anything wrong with the code, so I enabled debugging through the DirectX Control Panel to see if it would tell me anything about the error.  Apparently the driver was failing internally, but no further information was being given.  I asked for help in the #gamedev IRC channel and we eventually figured out the problem.  Apparently the Radeon X1000 series of graphics cards, of which mine is a part, does not support VTF.  The really odd part of this is that one of the selling features of this series was its supposed support for SM3.0, which should have included support for VTF.  ATI have a workaround for the problem, called "Render to Vertex Buffer", but at this point, using constant registers is looking better and better.

In the end, I decided to scrap the instancing code and use constant registers for skinning.  At least the constant registers work like they should, so I can actually get 3D animation up and running.

I currently have single-joint skinning working, but the code for finding the keyframes to interpolate between is a real mess.  I have one or two ideas on how to fix this, so we'll see how that goes tomorrow.  It is kind of cool to see animation working, even if it is only a ball moving up and down.

Tuesday, May 10, 2011

RH to LH, texture coord oddities, and instancing

MS3D model geometry is now loaded correctly.  One problem I did encounter was that the models are stored in a right-handed coordinate system, but the game uses a left-handed coordinate system.  This means that the x coordinate is pointing in the wrong direction, basically.  I initially tried to solve this by inverting the x coordinate of all vertices in the model.  I was a bit surprised to find that the model appeared to have turned itself inside-out!  After a bit of discussion in the #gamedev IRC channel, I figured out what the problem was:  The triangles were wound in the wrong direction.  This was fixed easily enough by reversing the order of the vertices of each triangle, and the model was no longer inside-out.

While adding in texture support, I discovered some rather odd things.  First off, the U and V texture coordinates are named "s" and "t".  After doing some poking around, I think this is an OpenGL thing.  Trying to find the texture coordinates was a bit confusing at first though.

The second odd thing is that texture coordinates for each vertex are stored per triangle, not per vertex.  This means that a single vertex can have multiple texture coordinates.  I suppose this makes sense, since you may want two triangles which share a vertex to have completely unrelated textures.  It did goof up my usual way of doing things though, and I had to scrap my indexed drawing code since each vertex of each triangle needs to be stored.  There may be a way for me to split up the geometry data and the material data using stream frequencies, but further investigation along those lines will have to wait for the moment.

Texturing isn't completely working yet.  Coordinates are stored fine, but the same texture is used for every model.  Also, no other material properties are currently in use.  I will probably get these things working correctly after skinning is in.

The method of instancing which I am using involves separating the model data and per-instance data into different vertex buffers.  The model data is stuff like vertex position, texture coordinates, etc. and per-instance data is the world transformation matrix and anything else which differentiates instances of the same model.  Stream frequencies are then setup so that one "vertex" of instance data is applied to the entire model.

This was my first time working with multiple streams, so I suppose it's not surprising that I made a newbie blunder.  After the instancing code was all finished, the model wouldn't appear.  I spent some time trying to figure out what went wrong, but everything checked out...  I eventually found the culprit though:  When creating the vertex declaration, I didn't start the offset over at 0 when I started specifying stream 1 elements.  This caused all of the "vertices" in stream 1 to have a chunk of padding at their beginning equal in size to the stream 0 vertex size.

Monday, May 9, 2011

Started loading models. Matrices and shaders.

We use Milkshape3D for modeling, so it makes sense to support loading .ms3d model files.  I implemented basic loading code this morning, but it needs a lot of work.  In its current state, it will only load a single triangle out of the model, and materials and animation are not taken into account.  After I get some other parts of the rendering side of things finished, I plan on getting it to correctly load and display a cool looking 3D model that GreyKnight created.

The rest of the day was spent in getting the camera and the projection matrix working.

Something you should look out for when using a vertex shader to transform vertices: When the matrix is stored in the constant registers, by default each register represents a column of the matrix, not a row.  This is called "column-major", and makes sense from an optimization point of view; if you are using row vectors, then the columns of the matrices will need to be accessed to perform transformations, and the rows will rarely be directly accessed (one exception would be in matrix multiplication).  You can account for this by transposing your matrices before you place them in the registers.

To make the code more readable (and easier to write), I have created four specialized derivations of the generic matrix: RotateX, RotateY, RotateZ, and Translate.  The names are pretty self-explanatory; RotateX represents a rotation about the x-axis, etc. and Translate represents a translation.  This allows me to create a rotation or translation matrix by providing only a single argument (an angle or a vector, depending on the matrix).
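Here is a minimal sketch of what these derivations can look like.  The names match the ones above, but the layout and the row-vector convention are my assumptions, not necessarily the actual code:

```cpp
#include <cmath>

// Minimal 4x4 matrix with specialized derivations.
// Row-vector convention assumed: v' = v * M.
struct Matrix4
{
    float m[4][4];

    Matrix4()
    {
        for (int r = 0; r < 4; ++r)
            for (int c = 0; c < 4; ++c)
                m[r][c] = (r == c) ? 1.0f : 0.0f;  // start as identity
    }
};

// Rotation about the x-axis by 'angle' radians.
struct RotateX : Matrix4
{
    explicit RotateX(float angle)
    {
        m[1][1] =  std::cos(angle);  m[1][2] = std::sin(angle);
        m[2][1] = -std::sin(angle);  m[2][2] = std::cos(angle);
    }
};

// Translation by (x, y, z); with row vectors the offset sits in the last row.
struct Translate : Matrix4
{
    Translate(float x, float y, float z)
    {
        m[3][0] = x;  m[3][1] = y;  m[3][2] = z;
    }
};
```

Creating a translation matrix is then just `Translate(0.0f, 5.0f, 0.0f)` -- one argument set, no manual element filling.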

On the subject of matrices, I have decided to use mOut = Transpose(mIn); rather than mOut = mIn.Transpose();  The latter implies that the matrix being transposed is being altered, which isn't the case (a copy of it is created and transposed).  I don't much like mOut = Multiply(mL, mR); though, as this is a bit too messy in my opinion.  For multiplication and similar operations, I have decided to use this format: mOut = mL*mR;
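A free-function Transpose along those lines might look like this (the Matrix4 here is a hypothetical stand-in for the engine's matrix type):

```cpp
// Returns a transposed copy; the input matrix is left untouched, which is
// exactly what the mOut = Transpose(mIn); spelling implies.
struct Matrix4 { float m[4][4]; };

Matrix4 Transpose(const Matrix4 &in)
{
    Matrix4 out;
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out.m[r][c] = in.m[c][r];  // rows become columns
    return out;
}
```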

Sunday, May 8, 2011

Anti-cheat shader works

I finally have the anti-cheat shader working!  These are the results (you may need to increase your monitor's brightness to see the difference, as this is what the shader is designed to combat):

The fixed version is to the left, and the original is to the right.

Ironically, I kind of cheated to get these examples (it is a simple quad with a screenshot slapped on top of it), so I can't see how it looks interactively.  But I plan on using this for future projects which require this kind of protection, so I should have an excuse to try it out eventually (probably during one of these quick test projects I will be creating).

P.S. Adding images to a post on Blogspot is somewhat annoying.  I had to restructure this post several times before it was formatted correctly.

Saturday, May 7, 2011

Trouble trying to create banding, and an IDE for shaders

I spent most of the day trying to get the anti-cheating shader to work.  I still don't have it working quite right, but it's getting closer.  The idea is to create worse and worse banding the darker something is, thus blurring any details.  I am able to create banding just fine, but making it vary depending on the brightness is proving to be a bit troublesome.
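The core idea can be sketched in plain C++ (the real version would live in the pixel shader, and the level counts here are placeholders I picked for illustration): quantize each channel, using fewer quantization levels the darker the pixel is, so dark details collapse into flat bands.

```cpp
#include <cmath>

// value:      a color channel in [0, 1]
// brightness: overall pixel brightness in [0, 1]
// Bright pixels keep many quantization levels; dark pixels get very few,
// producing heavier and heavier banding as things get darker.
float BandChannel(float value, float brightness)
{
    float levels = 4.0f + brightness * 60.0f;  // placeholder range
    return std::floor(value * levels) / levels;
}
```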

If you are looking for an IDE to write shaders in, I recommend RenderMonkey by AMD.  It supports both Direct3D and OpenGL shader languages, and shows you a preview of the shader you are writing.  Intellisense and syntax highlighting are also available.

Friday, May 6, 2011

HLSL and a cool Direct3D debugging tool

I worked on the rendering system today.  Because of the way I handle errors, there is a class for the D3D object, the device, vertex declarations, vertex buffers, vertex shaders, and pixel shaders.  This is a little inconvenient, as the following code shows:

hResult = m_renderer->GetDevice()->DrawPrimitive(D3DPT_TRIANGLELIST, 0, 1);
if (hResult != D3D_OK)
    throw (CException("Could not draw the triangle.\r\n\r\nError Code: 0x%08X", hResult));
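One way I could reduce the call-site noise is a small checked-call helper.  This is only a sketch -- the HResult typedef and CheckD3D name are stand-ins I made up so it compiles without the Windows headers, and std::runtime_error fills in for CException:

```cpp
#include <stdexcept>
#include <string>

typedef long HResult;     // stand-in for HRESULT
const HResult kOk = 0;    // stand-in for D3D_OK

// Throws if a Direct3D-style call did not return OK.
void CheckD3D(HResult hr, const char *what)
{
    if (hr != kOk)
        throw std::runtime_error(std::string("Direct3D call failed: ") + what);
}

// Usage would shrink the call site to one line, e.g.:
// CheckD3D(m_renderer->GetDevice()->DrawPrimitive(D3DPT_TRIANGLELIST, 0, 1),
//          "Could not draw the triangle.");
```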

I'm considering having the classes manage global variables rather than member variables.  That way, I can access the global directly in other parts of the code rather than through an object.  I may try that in the next iteration.

Today marks the first time I have ever written and used a shader.  This is something I have been wanting to do for a while, and it opens up some very interesting possibilities.  For example, I can now make the screen fade to grey.  I can also prevent players from being able to get around the darkness in a game by banding dark colors (I am currently working out how to do this exactly).

One thing which was a bit annoying for me while researching HLSL (a programming language you can use to write shaders) was being unable to find a simple, basic shader example.  So here is a set of bare-bones, pass-through shaders which will hopefully help someone who wants to learn how to write shaders:

// ***** Vertex Shader *****
struct tVSInput
{
    float3 pos        : POSITION;
    float4 diffuse    : COLOR;
};

struct tVSOutput
{
    float4 pos        : POSITION;
    float4 diffuse    : COLOR;
};

tVSOutput main(tVSInput input)
{
    tVSOutput output;
    output.pos.xyz = input.pos.xyz;
    output.pos.w = 1;
    output.diffuse = input.diffuse;
    return (output);
}
// *****

// ***** Pixel Shader *****
struct tVSOutput
{
    float4 pos        : POSITION;
    float4 diffuse    : COLOR;
};

float4 main(tVSOutput input) : COLOR
{
    return (input.diffuse);
}
// *****

I should probably note that this is my first time writing these, so an expert may take issue with something in there.  However, both shaders work perfectly fine in the tests I have run, and they are nice and simple to read.

On the subject of shaders; if you use a shader model 3.0 vertex shader, you apparently need to also use a shader model 3.0 pixel shader.  If you use a 2.0 vertex shader (or lower I am assuming), you can let the fixed function pipeline take care of the pixel shader part.  Before I figured this out, I was a bit confused as to why the triangle I was using as a test wasn't appearing.

If you are using Direct3D, I highly recommend trying out PIX.  PIX is a free utility that comes with the SDK.  You can find it in the utilities folder in the SDK installation directory.  It is pretty much a debugger for Direct3D; it monitors how the program you attach it to uses Direct3D and gives you a report after the program closes.  It gives you a list of all created resources and the arguments used when creating them, you can take snapshots at runtime and later view what calls were made during that snapshot, and quite a bit more from the looks of it.

Thursday, May 5, 2011

Smart pointers and error handling

Today I worked on the systems.  Like I mentioned in yesterday's blog post, I have decided to take a reference counting approach to this problem.  My original thought was to create and destroy the systems from within the application layer, and share these with the game states.  With the new design though, there is a game state called "Initialize" which creates the systems.  These systems are then passed between game states via their input structures.  If a system isn't passed, the reference counter will notice that nothing is referencing it anymore and will remove it.  This makes shutting down simple since the systems will be destroyed when the last game state is left.

While I was at it, I decided to wrap the current state interface and input structure into smart pointers.  These pointers are only used within the main loop, and there is only one copy of them, so I used a variety of smart pointer called a "scoped pointer".  A scoped pointer only allows one reference to the pointer, which means that creating a copy of the pointer, assigning the pointer to something, and passing the pointer as a by-value argument to a function are not allowed.
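A minimal scoped pointer along those lines can be sketched like this (boost::scoped_ptr provides the same behavior; the class below is just an illustration, not the actual code I am using):

```cpp
// Owns a heap object and deletes it when the pointer goes out of scope.
// The copy constructor and assignment operator are declared private and
// left undefined, so any attempt to copy or assign fails to compile.
template <typename T>
class ScopedPtr
{
public:
    explicit ScopedPtr(T *p) : m_p(p) {}
    ~ScopedPtr() { delete m_p; }

    T &operator*() const  { return *m_p; }
    T *operator->() const { return m_p; }
    T *Get() const        { return m_p; }

private:
    ScopedPtr(const ScopedPtr &);            // no copying
    ScopedPtr &operator=(const ScopedPtr &); // no assignment

    T *m_p;
};
```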

My main reason for using smart pointers is that I use exceptions for error handling.  One of the big drawbacks (in my opinion) to using exceptions for error handling is that everything which must do something before the program is shut down (allocated memory must be deallocated, window classes must be unregistered, DirectX COM interfaces must be released...) needs to be placed inside of an object.  When an exception is thrown, the destructors of created objects are called, thus allowing you to clean things up.  Using smart pointers is a generic way of doing this for allocated memory.  Things such as a window class or a DirectX COM interface need specialized classes though.
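As a sketch of such a specialized class, here is a bare-bones wrapper for a COM-style interface (the Fake interface in the test and the ComPtr name are illustrations; the real thing would wrap an actual Direct3D interface):

```cpp
// Calls Release() on the wrapped interface in the destructor, so an
// exception unwinding the stack still releases the interface.
template <typename T>
class ComPtr
{
public:
    explicit ComPtr(T *p) : m_p(p) {}
    ~ComPtr() { if (m_p) m_p->Release(); }

    T *operator->() const { return m_p; }
    T *Get() const        { return m_p; }

private:
    ComPtr(const ComPtr &);            // non-copyable
    ComPtr &operator=(const ComPtr &);

    T *m_p;
};
```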

Wednesday, May 4, 2011

Cellular automaton, raw image data, and game states

Cellular Automaton

I learned a new term this morning: "Cellular Automaton".  It's basically a grid where something can be in each grid cell and the state of that something depends on certain rules about the surrounding grid cells (its neighborhood).

An example of a cellular automaton in use is one of those games where you have to dig tunnels through the dirt, and there are various objects in the dirt which can fall down if you dig out a hole under them (these objects can even crush you depending on the object and the game in question).  Usually your objective in these games is to gather gems and avoid being crushed by boulders.  The motion of the boulders as they fall can be controlled as a cellular automaton.

Controlling the falling motion of boulders is fairly simple and straightforward (or downward in this case).  A cellular automaton requires a grid.  These games are usually tile based, so we can check that off the list.  The second requirement is that each cell needs a finite number of states.  In our case, the states will be "boulder", "dirt" and "empty".  The third requirement is that a set of rules needs to exist which controls the state of a certain cell based on its neighborhood (usually the immediately surrounding cells).  We want the boulder to fall down until it hits part of the map which hasn't been dug out yet, so the following rule should suffice: If there is a boulder in the north cell and the current cell is empty, remove the boulder from the north cell and place it in the current cell.

To step a cellular automaton forward in "time", iterate over all of the cells and apply the rules to each cell as it is visited.  Depending on the rules you are using, you may want to iterate in a certain order.  For example, it would be a good idea to iterate from the bottom up so that two boulders in a stack fall down at the same time (there would be a gap in between them if you went from the top down).
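The boulder rule and the bottom-up iteration can be sketched in a few lines of C++ (the cell characters are my own choice for illustration):

```cpp
#include <string>
#include <vector>

// Cell values: '.' = empty, '#' = dirt, 'O' = boulder.
// grid[row][col], row 0 at the top of the map.
typedef std::vector<std::string> Grid;

void StepBoulders(Grid &grid)
{
    // Iterate from the bottom row up (skipping the top row, which has no
    // north neighbor), so stacked boulders all fall in the same step.
    for (int row = (int)grid.size() - 1; row > 0; --row)
        for (int col = 0; col < (int)grid[row].size(); ++col)
            if (grid[row][col] == '.' && grid[row - 1][col] == 'O')
            {
                grid[row - 1][col] = '.';  // remove boulder from north cell
                grid[row][col]     = 'O';  // ...and drop it into this one
            }
}
```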

A very interesting application of this technique, which is discussed in "An Intro to Cellular Automaton", is to simulate fluid flow.  After simulating a boulder falling down, it's not that much of a stretch to simulate water falling down and spreading out when it hits (you just need more states to include how much water there is).  This is actually very similar to a technique our designer/artist GreyKnight mentioned last year.

Raw Image Data

On a completely different note, I had an idea earlier today for loading textures: Rather than storing them on disk as .bmp, .png, or other similar file formats, store just their raw data ready to be loaded directly into a Direct3D texture.  This would simplify the in-game loading code quite a bit if you are trying to avoid D3DX (like I am) since you don't need to put the various image format loaders into the game.  Doing this would require a "compile" stage of sorts before new images could be available in the game, but this may be worth it in the long run, especially if you are already planning on packaging all the game's files into a single database file.

Game State Changing

Back to actual coding progress; I spent most of the day researching, but I did manage to finish the first iteration of the game state changing code.  It works basically like this: When a new game state is entered, it is given input data.  This input data can be things such as the character which the player chose, the socket handle for an open connection to a server, etc.  Each game state has an update function which returns one of these input structures.  If the input structure is empty, the game state doesn't need to be changed.  If the input structure is tagged with a magic state ID, the program shuts down.  And finally, if the structure is tagged with the ID for another game state, the current game state is left and the new game state is entered with the structure as its input.  So to summarize: the output of the current game state is the input for the next game state.
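Here is a stripped-down sketch of that flow.  All the names (StateInput, IGameState, the magic IDs, the toy CountdownState) are illustrations, not the actual code:

```cpp
const int kStateNone = 0;   // empty input: keep running the current state
const int kStateQuit = -1;  // magic ID: shut the program down

struct StateInput
{
    int nextStateId;  // which state to enter next (or kStateNone/kStateQuit)
    int payload;      // stand-in for real input data (chosen character, ...)
};

class IGameState
{
public:
    virtual ~IGameState() {}
    virtual StateInput Update() = 0;  // returns the next state's input
};

// A toy state that runs for a few updates and then asks to quit.
class CountdownState : public IGameState
{
public:
    explicit CountdownState(int frames) : m_frames(frames) {}

    StateInput Update()
    {
        StateInput out = { kStateNone, 0 };
        if (--m_frames <= 0)
            out.nextStateId = kStateQuit;
        return out;
    }

private:
    int m_frames;
};

// Run one state until it requests a change or a shutdown; returns the
// number of updates performed.
int RunState(IGameState &state)
{
    int updates = 0;
    for (;;)
    {
        ++updates;
        StateInput next = state.Update();
        if (next.nextStateId != kStateNone)
            break;  // quit or switch -- either way we leave this state
    }
    return updates;
}
```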

I've decided to scrap the systems layer I talked about in yesterday's blog post. I am going to try a reference counting approach to the problem.


Some useful resources I found today:
Software optimization resources - A treasure trove of C++ and assembly optimization information.
Design for Performance - Sometimes vague, but a pretty good set of slides on getting more performance out of a game.
Mature Optimization - Discusses ways to design your code so that it stays more optimized throughout development rather than saving all optimizations for last.
An Intro to Cellular Automaton - This is the article which introduced me to the term "cellular automaton".  This is a good read, especially if you are interested in further information on the fluid simulation technique.

Tuesday, May 3, 2011

The layers of the framework for a game

Getting back into coding after a few weeks leave was a bit rough, but I managed to get the windowing and some basic game state code in place.

My plan is to split the code up into the following layers: Main, App, Systems, and Game State.

The main layer is pretty much just WinMain().  It handles the creation of the app layer and starts the main loop.

The app layer is responsible for the program's instance handle, the main loop, and the message pump.  It also handles creation of the systems layer and manages changing the game state.

The systems layer is responsible for setting up and providing access to the various required systems such as audio output, networking, graphics, file I/O, etc.  The systems are constant, i.e. new ones aren't created and old ones aren't removed in between game states.  All game states will require a window, rendering, file I/O, and the rest, so these are all set up once and access to them is granted through the systems layer.  This also makes changing game states and shutting down easier, since only a single thing needs to be shared or shut down, rather than many things.

The game state is where all the interesting stuff happens.  Resources are loaded, swords and monsters are drawn and animated, damage is dealt, the GUI is drawn and managed, etc.  This will likely be broken up even further.

The systems and game state "layers" are actually more like components; one doesn't contain the other.  I may decide to use that terminology instead in the future.

Monday, May 2, 2011

Learning HLSL, and D3DX updates too often

If you want to learn HLSL, I highly recommend ShaderX2: Intro & Tutorials, specifically the article "Introduction to the DirectX High Level Shading Language".  The book, along with a couple of the other ShaderX books, has been made available for free over at http://tog.acm.org/resources/shaderx/ so you don't even need to purchase it!  You can also find that specific article on the MSDN at http://msdn.microsoft.com/en-us/library/ms810449.aspx.

I plan on avoiding D3DX in future projects.  There are of course the overhead and inflexibility concerns, but my main motivation is the fact that it gets updated much more frequently than DirectX.  Now, this isn't a bad thing if you are releasing a final product with an installer and everything, since you can package in the DirectX updater, but it is a real pain to deal with if you are just releasing demos.  The demos are usually packed up into a .zip file without an installer, which makes things easier on both the user and myself.  The problem I have encountered is that most users do not keep their copy of D3DX up-to-date, while I usually have the latest version.  This requires the user to install the update, either by downloading a separate "dependencies" file along with the demo or getting the update directly from Microsoft.  A lot of people don't want to bother doing that.

I only use D3DX for creating textures, so avoiding it won't be that problematic overall.  However, I will need to brush up on my bitmap format knowledge and figure out how to use libpng.

Today marks the last day of the "knowledge" phase of the R&D I am doing.  I will be entering the "experience" phase tomorrow, which involves actually using the information I learned.  The point of the "experience" phase is to get me more comfortable with the new techniques, and iron out the first big wave of unforeseen problems which inevitably rise up when doing something for the first time.  On a side-note, I suppose calling these the "research" and "development" phases respectively would be more fitting, but I like the more general feel of the other set of names.

Sunday, May 1, 2011

Placement new and memory allocation methods

While reading about memory management, I learned about something called "placement new".  Placement new is a use of the C++ new keyword where you pass it a memory address specifying where you want the object to be constructed.  Placement new doesn't allocate memory for the object, so I suppose it is one way of manually calling the constructor of a class/struct.  If you construct an object in this way, you cannot destroy it by calling delete.  Instead, you have to directly call the destructor and handle freeing the associated memory yourself.

#include <cstdlib>  // malloc, free
#include <new>      // placement new

unsigned char *buffer = static_cast<unsigned char *>(malloc(BUFFER_SIZE));
CSomeClass *object = new (buffer) CSomeClass();

// Use the object for something...

object->~CSomeClass();  // Directly call the destructor...
free(buffer);           // ...then free the memory yourself.

This is mainly useful if you are creating objects from pre-allocated memory.

I also read about several memory allocation methods.  The methods I found interesting were the linear method, the stack method, and the buddy system.

The linear method involves having a line of memory which can be allocated in parts.  When data is allocated with this method, the data is placed at the beginning of the free space.  Subsequent allocations are placed directly after one another in memory, which is why this method is called "linear".  Unfortunately, you cannot deallocate specific allocations with this method.  The only other operation you are allowed to perform is a complete deallocation of everything in this line of memory.  Which brings us to...

The stack method.  This is exactly like the linear method, except it has one advantage: You can deallocate the most recent allocation.  This resembles a LIFO (last-in first-out) stack, where the last item pushed onto the stack is the first item popped off of it, and that is where it gets its name.
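Both methods fit in one small class.  This is a sketch under some simplifying assumptions (alignment is ignored, and only the single most recent allocation can be rolled back; a full stack allocator would remember every previous marker):

```cpp
#include <cstddef>

class StackAllocator
{
public:
    StackAllocator(unsigned char *buffer, size_t size)
        : m_buffer(buffer), m_size(size), m_top(0), m_prevTop(0) {}

    // Linear method: place the allocation at the beginning of free space.
    void *Allocate(size_t bytes)
    {
        if (m_top + bytes > m_size)
            return 0;  // out of space
        m_prevTop = m_top;
        void *p = m_buffer + m_top;
        m_top += bytes;
        return p;
    }

    // Stack method extra: deallocate the most recent allocation only.
    void FreeLast() { m_top = m_prevTop; }

    // Linear method reset: deallocate everything at once.
    void FreeAll() { m_top = 0; m_prevTop = 0; }

    size_t BytesUsed() const { return m_top; }

private:
    unsigned char *m_buffer;
    size_t m_size;
    size_t m_top;
    size_t m_prevTop;
};
```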

The buddy system is classed as a pool, which is why I am bringing this up here.  Pools allocate memory in chunks, and they have the advantage of allowing specific chunks to be deallocated.  Unfortunately, this deallocation can leave small holes in the pool.  Getting holes in a pool is called "external fragmentation" and can cause large allocations to fail even though there appears to be enough total free space.  This happens because the free space is in little pockets all over the place, and the large allocation can't fit in anywhere.  External fragmentation can be lessened by the allocation method and how free space is treated.  There is also another type of fragmentation called "internal fragmentation" which is caused by a chunk being too big for the requested allocation size.  The difference in size causes some wasted space at the end of the chunk.  Again, internal fragmentation can be lessened by the type of allocation you choose to use.

Buddy System
This is a pool-type allocation method, as discussed previously.  The allocator has a list of acceptable chunk sizes (usually powers of 2) and keeps track of the free chunks of each size.  When data is allocated, the allocator looks for a free chunk which can snugly fit the data.  If there isn't a free chunk of the required size, but there is one in a larger size, the larger chunk is split in half.  If the resulting chunk is still too big, the process is repeated until the smallest allowed size is reached.  If there are two free chunks right next to each other and they are the two halves of a split, they can be combined in order to allow for larger allocations later.  I am guessing the "splitting" system and having two halves of a larger chunk is why this method is called the buddy system.

I learned about virtual addressing as well, but I don't have much to say about it at this point.  The one thing which I found interesting is that it is possible to have a "miss" with it, similar to having a cache miss.  The miss happens when the CPU doesn't have the mapping information for the requested page.  This is called a "translation lookaside buffer miss".

These are some resources which I found very helpful:
Alternatives to malloc and new
The Memory Management Reference: Beginner's Guide
Start Pre-allocating And Stop Worrying
Virtual Addressing 101