Wednesday, November 2, 2011

P4 detect new files and changes

If you are a Perforce user, you might occasionally find you have been creating or editing files without being correctly synchronized with the Perforce server. Correcting this by hand is always a mundane task. Recently, I made a script that checks a list of files and makes sure each one is both added to P4 and checked out for edit.

It's important for me to make very clear: USE THIS AT YOUR OWN RISK. TEST IT CAREFULLY.  I've been using this a while, but when it comes to this sort of thing you can never be too careful.

The script below goes into a batch file that will, for each file:

  • Attempt to add it to Perforce. If it's already added, it won't affect anything.
  • Attempt to check out the file for editing. If it's already checked out, it won't affect anything.
  • Revert the file if it is unmodified. If you already had the file checked out and modified, it won't revert it.

The end result is that every file in the list that is either unknown to the Perforce server or has been modified without being checked out is put in the default changelist for review and possible committing to the server. The filelist.txt is a list of full file pathnames, created by whatever means you prefer. I sometimes use "where /r . *.cs" or similar, depending on context, but most of the time this file is generated by part of my build process.

Here's the script:

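rem "delims=" keeps each line whole so paths containing spaces survive.
for /F "delims=" %%A in (filelist.txt) do call :process "%%A"
exit /b

:process
rem %~1 is the path with its surrounding quotes removed.
p4 add "%~1"
p4 edit "%~1"
p4 revert -a "%~1"
exit /b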

I hope someone finds this useful. 

The code in this post is subject to the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License


Tuesday, August 23, 2011

An easy way to create 1x1 textures for XNA.

I've noticed on more than a few occasions while reviewing XNA code for demos and game engines that people often need small textures in order to render diagnostic information. More often than not, they will create a small bitmap file and feed it through the content pipeline. It's no big deal, but there is a slightly easier way that is also better performing - just create a 1x1 texture directly, without using the file system.

Here's a function that does just that:

public static Texture2D CreateSolidTexture(GraphicsDevice graphics, Color color)
{
	// A 1x1 texture with no mipmaps, in the standard 32-bit color format.
	var tex = new Texture2D(graphics, 1, 1, false, SurfaceFormat.Color);
	tex.SetData<Color>(new Color[] { color });
	return tex;
}

As a bonus, anyone using your demo, engine or what have you will have a slightly easier time using your code because they won't have to deal with migrating the sample content into their code space.

Tuesday, August 16, 2011

Notes about our new content pipeline features

The video below represents a personal milestone for me as it demonstrates the results of something I've been wanting to do for a very long time. 

Shortly after finishing Star Ninja, I decided the next game was going to be 3D. While I have done quite a bit of 3D graphics in the past, that was some time ago and I never really had a chance to work closely with HLSL much less all the many things that go into creating content that uses it. While trying to use the more conventional ways of getting the content I wanted in game, I found the available solutions were not compatible with XNA, not flexible enough or simply not affordable. Clearly, it was time to tackle that problem and get our own system for this because it's an important problem to solve.

I have since created a content pipeline that allows me to author content in Softimage, including HLSL code, and automatically process the data into a form that is usable by the game engine with minimal effort. Essentially, I can create a model, animate it, create (or reuse) HLSL shaders, hit export and it's pretty much ready to use in game. Creating content easily is an essential  feature of any modern pipeline and while this took a lot of work to figure out I am glad to have it because I will probably use this for quite some time. Some of the major features include:
  • Automatic HLSL semantic and vertex stream support.  
  • Support for multiple UV sets, vertex color and weight map data streams, which can be used as vertex streams or raw data for general game logic. 
  • Instanced model support. 
  • Skinned meshes.
  • Multiple animations.
  • Mesh optimization / vertex reduction.

The next game will of course use all these features so it is pretty neat to see the work result in something that  will nearly eliminate the error prone manual effort previously involved with getting 3D content into the game. Being able to author content that makes use of arbitrary HLSL code that doesn't require code changes to use in the game engine should also help me raise the visual fidelity of what I create by quite a lot.

While creating this system, I realized that I'm working in an uncommon place - the intersection of code and art. Most people do one or the other. I know I'm not a great artist, but by being able to enhance my limited artistic abilities with good tech I think I'll be able to create higher quality content than I would be able to do otherwise. For instance, if I want to create an asset with some special shader effect, I can just open the shader for that specific model and modify it; the exporter will propagate those changes and the run time will use the newly available data without any code changes required. This includes situations where the shader changes require changes to textures or to vertex streams such as per-vertex colors or weights. Pretty handy!

I'm really excited about the next project and what this new pipeline will do to help.

Monday, May 23, 2011

Abusing texel blur to smooth out DXT5 textures used by user interfaces

If the game has the memory for it, I generally prefer to use uncompressed textures for 2D UI elements because a compressed texture format such as DXT5 will almost certainly degrade the image quality.

An important part of making a good looking UI is supporting pixel-accurate rendering - this is especially important on smaller screens such as mobile devices where space is limited and small fonts & controls are common. If the UI is drawn a half-texel off or anything other than 1:1 pixel:texel, there will be blur that will make the UI appear less crisp. In some situations it may not be noticeable, but in some situations it's completely unacceptable; for instance small 8 point fonts that would have been clear become a blurry mess when the one-pixel-wide parts of a character texel lie on a screen pixel boundary. Even worse is when you have small pixels moving at sub pixel distances; there is simply no avoiding terrible strobing artifacts in this case as characters alternate between sometimes clear and mostly blurry while they move (I was reminded of this recently when working on the credit roll for Star Ninja). Not having pixel accurate rendering for UI is pretty much the same as telling the artist that no matter what they make, they have to put a blur filter on it before they save to make sure it is usable in sub pixel positions which of course would be ridiculous - but that's how it will look much of the time if you don't have pixel accurate rendering. I suspect this is why many games have UI elements that are bigger than they really need to be - it's just easier to ignore pixel accuracy and compensate by making everything big. Sometimes however, when UI elements are small, there is no avoiding the need for pixel accurate rendering.

Of course, when pixel accurate rendering is used the UI shows exactly what the textures contain, pixel for pixel. Uncompressed textures rendered pixel accurately look great, and it seems to usually be the best solution if your game has the memory for it. What if your UI elements are so numerous that it's out of the question to save them all as uncompressed textures? Or perhaps other factors are putting pressure on your overall memory use and you have to squeeze everywhere possible? It can & does happen. So, just compress it and be done with it, why not? The problem is, pixel accurate drawing of compressed textures only serves to exaggerate the artifacts caused by the compression. Sometimes these artifacts can cause the resulting texture to fail to meet quality requirements.

The good news is that the final result actually tends to benefit if compressed textures are drawn a half pixel off because the filtering that occurs for free will smooth out the artifacts caused by compression. It still won't look as good as uncompressed textures, but it will often be an improvement over pixel accurate rendering. By maintaining a consistent half-pixel offset in rendering these elements, you continue to benefit from the filtering and also eliminate strobing if the UI moves.
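To make the "free" filtering concrete, here is a minimal sketch of bilinear sampling done in software. It is only a model of what the GPU does, not XNA code; the class name BilinearSketch and the convention that texel (x, y) has its center at coordinate (x, y) are assumptions made for illustration. Sampling exactly on a texel center returns that texel unchanged, while sampling half a texel off in x and y returns the average of the four surrounding texels, which is the smoothing that hides some of the compression artifacts.

```csharp
using System;

// Hypothetical software model of bilinear texture filtering (not an XNA API).
public static class BilinearSketch
{
    // Samples a grid of texel values at a continuous coordinate.
    // Assumption: texel [y, x] has its center exactly at coordinate (x, y).
    public static float Sample(float[,] texels, float x, float y)
    {
        int height = texels.GetLength(0);
        int width = texels.GetLength(1);

        int x0 = (int)Math.Floor(x);
        int y0 = (int)Math.Floor(y);
        float tx = x - x0; // horizontal blend weight, 0 when on a texel center
        float ty = y - y0; // vertical blend weight, 0 when on a texel center

        // Clamp neighbor indices at the texture edge.
        int xa = Clamp(x0, width), xb = Clamp(x0 + 1, width);
        int ya = Clamp(y0, height), yb = Clamp(y0 + 1, height);

        float top = Lerp(texels[ya, xa], texels[ya, xb], tx);
        float bottom = Lerp(texels[yb, xa], texels[yb, xb], tx);
        return Lerp(top, bottom, ty);
    }

    private static int Clamp(int index, int size)
    {
        return Math.Max(0, Math.Min(size - 1, index));
    }

    private static float Lerp(float a, float b, float t)
    {
        return a + (b - a) * t;
    }
}
```

With a 2x2 checkerboard of 0 and 1 values, sampling at (0, 0) gives back the first texel exactly, while sampling at (0.5, 0.5) blends all four texels equally.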

Here's an example using the level picker button in Star Ninja. The uncompressed and DXT5 versions are the direct output of the content pipeline. I had to make the bilinear filter version in Photoshop due to time constraints, but it reasonably matches the results I was seeing in the game engine when I was originally doing these tests.

The uncompressed obviously looks the best. The DXT5 compressed version looks the worst because of how the gradients don't respond well to the compression, particularly on the top and bottom edge. The last version is the same DXT5 texture shifted 0.5 pixels which has the effect of blurring out the compression artifacts.

Another option for crisp UI with DXT5 compression is to make sure the artists carefully inspect the compressed results of their work and modify the texture until the artifacts are not a problem. This sounds reasonable, and sometimes it is, depending on the artist and toolchain, but if the assets are going to be procedurally packed into a sprite sheet (like this tool does) then it may be difficult to guarantee the artifacts for one sprite sheet layout will be the same as another. This is because DXT5 compression works on 4x4 pixel blocks; if the sprite sheet is built so that the elements can shift within the 4x4 block, the artifacts will be different. This page has a good explanation of how the compression works.
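The 4x4 block dependency can be sketched with a little arithmetic. The helpers below are hypothetical (DxtBlockSketch is not part of any real packing tool) and assume non-negative sheet coordinates. BlockPhase reports where a coordinate falls inside its block, SameBlockLayout tells whether two placements cut a sprite into blocks the same way, and AlignToBlock shows one possible mitigation, which is my assumption rather than something the packer mentioned above is known to do: snapping sprite origins to multiples of four. Even then, blocks on the sprite's edge may pick up neighboring pixels unless the sprite is padded.

```csharp
using System;

// Hypothetical helpers for reasoning about sprite placement in a DXT-compressed sheet.
public static class DxtBlockSketch
{
    // DXT formats compress each 4x4 texel block independently.
    public const int BlockSize = 4;

    // Offset (0..3) of a non-negative sheet coordinate inside its 4x4 block.
    public static int BlockPhase(int coord)
    {
        return coord % BlockSize;
    }

    // True when two placements divide the sprite into blocks identically,
    // so the block contents (ignoring neighbors at the edges) are the same.
    public static bool SameBlockLayout(int x1, int y1, int x2, int y2)
    {
        return BlockPhase(x1) == BlockPhase(x2)
            && BlockPhase(y1) == BlockPhase(y2);
    }

    // Rounds a non-negative coordinate up to the next multiple of the block size.
    public static int AlignToBlock(int coord)
    {
        return (coord + BlockSize - 1) / BlockSize * BlockSize;
    }
}
```

For example, a sprite packed at x = 8 in one layout and x = 10 in another has block phases 0 and 2, so the block boundaries cross the artwork in different places and the artifacts an artist approved in one layout may not match the other.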

While uncompressed textures are generally preferred for user interfaces, the use of compressed textures is sometimes necessary but will result in compression artifacts. By rendering these elements pixel accurately but offset one half pixel in x & y, a consistently smoothed version of the texture will be rendered which will mask some of the compression artifacts and can improve the final image quality.

Tuesday, May 17, 2011

The insidious nature of unmanaged resource leaks in XNA games.

While working on Star Ninja's screen transition system last week, I discovered a memory leak problem that was ultimately determined to be the result of a simple oversight - The instanced model system wasn't disposing the vertex buffers and index buffers it had created. These are created at runtime to prepare large arrays that are a series of duplicates of the master instance data with the bone indices and index indices set up to do the SkinnedEffect instancing technique as described in one of the official XNA samples. Finding that out was a lot more time consuming than I'd have liked. The only reason I noticed it was because of another bug that had sized the vertex buffer allocation incorrectly and when I fed a larger mesh into the system, memory problems started to show up.

While I do most development and testing on a PC, mainly for the time saving benefit of "edit and continue" which isn't available on WP7 or XBox, I do make a point of running and doing a few quick tests after every significant task to make sure everything is still working well. Recently, I added a conditional compilation symbol "STRESS_TEST" which allows me to just run the game and it will churn through all the levels doing pretty random stuff. The tester is a great way for me to monitor for peak memory usage; the code will periodically print out the peak memory retrieved from Microsoft.Phone.Info.DeviceExtendedProperties. Since certification requires the game to stay under 90MB, this is a pretty important thing to stay on top of.

After the last batch of changes where one small part included changing which assets were being fed into the instanced object renderer, I saw the memory usage spike unexpectedly and almost immediately over 100MB and more as time went on. So I ran the game using a CLR memory profiler (YourKit for .NET) which to my surprise wasn't telling me anything useful - object counts were pretty similar and managed heap was about the same. This suggested to me that the memory usage being reported by WP7 is process memory and not just the managed heap. While interesting to know this, it doesn't help in finding what was sucking all the extra memory.

Because the asset change was just one change of what was probably a few too many, I didn't immediately recognize that as being related to the problem. Because my usual reference-leak-finding techniques weren't working, I found myself disabling large swaths of code to narrow down the source of the unmanaged memory leak. After a few hours of divide & conquer, I eventually found the problem was due to these vertex & index buffers not being disposed. The code was simply not releasing these unmanaged resources and neither the CLR nor the XNA API had any way to know that something needed to be cleaned up. With this failure now understood, I reviewed all the code for anything that might need a call to Dispose(), called it at the appropriate time, and the unmanaged memory leaks disappeared.

In the end, a lot of time was wasted due to not paying enough attention to objects implementing the IDisposable interface. Sufficiently chastised by this, I made a point of reviewing the Dispose Pattern in case it would help avoid this in the future. I had glossed over it before but had avoided using it because C#'s limitation that classes can only derive from one other class made me leery of introducing that kind of limitation to the general codebase without a good reason. Most of the time, interfaces provide the desired results with only a little more work and without the single-derivation limitation and so using interfaces rather than derivation had been my typical approach. This memory leak situation provided the motivation required to start using the dispose pattern in the hopes that future errors could be avoided.

To my surprise, I soon found there isn't a standard Disposable base class in .NET. Why not? Perhaps because it's so simple that people just write them whenever needed? Who knows. Simpler classes seem to be provided on a regular basis, but that's how it is.

So here's one you may find slightly more useful than the basic Dispose pattern.

/// <summary>
/// A generic implementation of the Dispose pattern, useful for classes that need IDisposable, don't need to
/// derive from something else, and are used as a base class for other classes.
/// </summary>
public class Disposable : IDisposable
{
    /// <summary>
    /// Set to true as soon as Dispose is called and before the
    /// call to Dispose(true) is made, which means this bool is
    /// only useful to code outside the scope of the disposal process.
    /// </summary>
    public bool IsDisposed { get; private set; }

    // Finalizer: covers objects that were never explicitly disposed.
    ~Disposable() { Dispose(false); }

    public void Dispose()
    {
        // Disposing the same object twice is treated as a bug.
        if (IsDisposed)
            throw new Exception();
        IsDisposed = true;
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Derived classes override this to release their resources.
    protected virtual void Dispose(bool disposing) { }
}

This has one feature beyond the standard Dispose pattern - a bool that is set when it is disposed, which can be a useful debug build assertion in code that uses the object, particularly when an object is being bounced around among various systems and detecting a disposed object can prevent more mysterious exceptions at a lower level. Also, I couldn't think of a situation where it would be useful to call Dispose on an object twice, so the check for IsDisposed in Dispose() will help find situations where this somehow happens by accident.

The bool IsDisposed could perhaps be made visible only in DEBUG builds, and the check converted to a Debug.Assert(), but I prefer to have these failures detected at the earliest time in all configurations in order to prevent other, possibly more subtle, bugs from occurring. While it's true that an object that has had Dispose() called upon it can technically continue to be used, I prefer to consider Disposed objects to be "off limits" where I expect them to be inert and ready for garbage collection so I often check the IsDisposed property at entry points to large systems just to make sure an object is still valid.

After reviewing the code for all IDisposable implementations and converting all suitable classes to derive from the new Disposable base, I found the code in general was a little more organized (especially class hierarchies where multiple classes in the hierarchy implemented IDisposable) and generally more robust in that I knew the class finalizers would be taking care of any disposals that, for whatever reason, weren't explicitly triggered.

While the Disposable class is all well and good, the main thing to remember is to always pay attention to the objects you create and, if they implement IDisposable, make sure Dispose() is getting called at some point because unlike most of C# there isn't any magic code that will clean it up for you. The Disposable class doesn't eliminate the need to actually write the code to dispose the objects you are responsible for, but it can make your library code a little more robust when you provide a Disposable as a return value because this way the finalizer will at least be sure to eventually dispose the object in case the caller doesn't.
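As a rough illustration of how a class might build on the Disposable base, here is a sketch under a few assumptions. The base class is repeated in compact form only so the example compiles on its own. FakeBuffer is a hypothetical stand-in for an XNA VertexBuffer or IndexBuffer, and InstancedModel is an invented name, not the actual class from Star Ninja.

```csharp
using System;

// Compact copy of the Disposable base so this sketch compiles on its own.
public class Disposable : IDisposable
{
    public bool IsDisposed { get; private set; }

    ~Disposable() { Dispose(false); }

    public void Dispose()
    {
        if (IsDisposed)
            throw new Exception();
        IsDisposed = true;
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing) { }
}

// Hypothetical stand-in for a graphics buffer that holds unmanaged memory.
public sealed class FakeBuffer : IDisposable
{
    public bool Released { get; private set; }
    public void Dispose() { Released = true; }
}

// Hypothetical owner of buffers created at runtime.
public class InstancedModel : Disposable
{
    public readonly FakeBuffer Vertices = new FakeBuffer();
    public readonly FakeBuffer Indices = new FakeBuffer();

    protected override void Dispose(bool disposing)
    {
        // Only touch other managed objects on an explicit Dispose() call;
        // on the finalizer path they may already have been finalized.
        if (disposing)
        {
            Vertices.Dispose();
            Indices.Dispose();
        }
        base.Dispose(disposing);
    }
}
```

Calling Dispose() on an InstancedModel releases both buffers and sets IsDisposed, and a second call throws, which is the accidental double disposal the base class is meant to catch.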

Thursday, May 12, 2011

Why XBox Live Gamer Tags should be accessible to all Windows Phone games

Let me start by first saying, I fully appreciate the need and reasons behind having the XBox Live branding on the Windows Phone be a marketplace tool to help identify AAA phone games to the customers. It does a lot for the platform to have some part of the catalog include titles that have (generally) higher production values, QA and all the rest that comes with the responsibilities of an XBL contract. Limiting access to leaderboards, friends, avatars and other XBox Live features is, for the most part, a reasonable thing that allows Microsoft to ensure the quality of games using these features meets their standards and gives a little more value to the developers who make the effort to be part of XBox Live.

There is one feature however that really, really, needs to be made available to all Windows Phone game developers: the gamer tag. Providing access to this will allow developers who have any kind of online component to their games to have some kind of reasonable expectation that we won't be exposing our gamers to profane or otherwise unacceptable names. It also makes the task of porting games between XBLIG and WP7 that much easier, because XBLIG games have access to the gamertag (despite not being XBox Live products).

Right now, smaller developers have no choice but to accept user generated names, feed them through banned word filters and implement some kind of reactionary cleanup system for when bad names do sneak through. Unless we make users do a registration process or access unique phone IDs (which has its own set of issues), we are unable to uniquely identify users. This is really not doing anyone any good, and it makes all non-XBox Live WP7 multiplayer games suffer as a result. The obvious problems are multiplayer experiences that have parental control issues. An even more serious problem for the developer is that we have to spend an inordinate amount of time dealing with the fact that there is an API that could give us a safe player name, but simply chooses not to so we wind up spending a lot of time on something that could otherwise be used to make a higher quality game.

I can only hope that the right people at Microsoft will change their mind about letting WP7 apps access gamertags because it would improve the quality of multiplayer WP7 games for the entire platform. This is, presumably, a simple switch that could be done in time for Mango. I'm not holding my breath for this change but it would be great if it happened.

Tuesday, May 10, 2011

Adding ImageMagick to Star Ninja's XNA content pipeline

This week, I've been working on Star Ninja's high score system. The underlying UI system is the same one created for Atomic Sound and Moonlander, which uses a custom content pipeline that does a lot of things to prepare data for the UI system. One of the features is to process fonts and incorporate them into a sprite sheet, saving various bits of metadata required to render these fonts later. Since our content pipeline does a lot of different things well beyond the scope of this post, I'm going to limit the example code to the parts related to integrating ImageMagick into the bitmap font generator tool found on the XNA App Hub.

Back to the problem at hand.

During development of the high score screen, I found myself looking at this (which is populated with random data for now):

Not bad, I was thinking to myself, fairly pleased to be able to make something I didn't consider terrible. Something wasn't great though, which is the actual high score table font. Plain white and boring, it really needed something to stand out a bit more because it seemed to blend with the background too much. I wanted a better looking font, one that had shadows and perhaps other features. So I thought about it for a bit, realizing what a hassle it would be to create color fonts in Photoshop and making the existing font rendering pipeline use that data instead of the existing font rendering technique. Doable, but not ideal. Time is short and that seemed like a terribly cumbersome process especially when I considered the inevitable "can you make the shadow a little bigger" or other change requests that might come in. Custom images per character is an entirely unacceptable way to solve the problem - error prone and difficult to track font metadata (spacing, mainly). I don't mind creating one-off assets, but if there's any real chance of having to iterate then a system to automate the process is often justified.

So I looked into a couple things, the first being Photoshop Scripting. Rejected this because Photoshop scripting is more of a UI automation and not a background process suitable for a content building script. The second one I looked at was ImageMagick, which turned out to not only be pretty cool, but well suited for this task. It's an image processor that is typically used as a console command in batch files, but it includes an OLE component which allows me to use it slightly more easily within the content pipeline. It wouldn't have been much trouble to start up a batch job and use the console command, but the OLE component makes it all a bit cleaner. I couldn't find any examples of using C# with ImageMagick's OLE component, so it seemed like a good idea to write up a little bit about how it can be used within the context of XNA content processing.

The font processor we use is similar to the bitmap font generator from the XNA App Hub. To gain access to ImageMagick, just add a reference to the ImageMagick OLE object to the bitmap font generator project (or your content pipeline).

Around line 190 of MainForm.cs in the XNA bitmap font generator, you will see the bitmap that is generated by rasterizing a character from a font.

// Rasterize each character in turn,
// and add it to the output list.
for (char ch = (char)minChar; ch < maxChar; ch++)
{
    Bitmap bitmap = RasterizeCharacter(ch);
    // (the rest of the loop body is unchanged)
}

That's where we hook in. To do something useful with ImageMagick, you will need to set it up and pass it a command line. Because the OLE component takes the arguments as an array of objects, each object being a string with each argument, I wrote a helper function to split a standard string into this array:

object[] GetImageMagickArgs(string args)
{
    if (args == null || args.Length == 0)
        return null;
    string[] args1 = args.Split(' ');
    // Two extra slots: the first and last entries are filled in later
    // with the source and destination file names.
    object[] result = new object[args1.Length + 2];
    for (int i = 0; i < args1.Length; i++)
        result[i + 1] = (object)args1[i];
    return result;
}

You may notice how the string[] is copied to the object[]. This is because the ImageMagick API requires the type of the array to be exactly an array of objects and string[] doesn't match, so it will throw an exception if you don't do this.
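The reason a simple cast isn't enough comes down to C# array covariance: a string[] can be assigned to an object[] variable, but the array's runtime type is still string[]. The sketch below shows the difference with a hypothetical helper named ArgArraySketch. Exactly how the OLE component inspects the array is not something I've verified; the sketch only demonstrates the runtime types involved.

```csharp
using System;

// Hypothetical helper showing why an element-by-element copy is needed.
public static class ArgArraySketch
{
    // Copies each string into a freshly allocated object[] so the array's
    // runtime type is object[] rather than string[].
    public static object[] ToObjectArray(string[] items)
    {
        object[] result = new object[items.Length];
        for (int i = 0; i < items.Length; i++)
            result[i] = items[i];
        return result;
    }

    // True only when the array was really allocated as object[].
    public static bool IsExactlyObjectArray(object[] array)
    {
        return array.GetType() == typeof(object[]);
    }
}
```

Assigning a string[] directly to an object[] variable compiles, but IsExactlyObjectArray returns false for it; the copy made by ToObjectArray returns true.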

I get the ImageMagick arguments using a custom parameter to our content pipeline, so you'll need to find a suitable way to get the arguments to your content pipeline or to the bitmap font generator if you are using that. Once the bitmap and arguments are ready, this function can be called to process the character:

private Bitmap ProcessCharacter(object[] args, char ch, Bitmap bitmap)
{
    if (args == null || args.Length == 0)
        return bitmap;
    int chInt = (int)ch;
    // One pair of temporary files per character.
    var src = "c:\\temp\\char-" + chInt.ToString() + ".png";
    var dest = "c:\\temp\\char-" + chInt.ToString() + "-output.png";
    bitmap.Save(src, ImageFormat.Png);
    var m = new ImageMagickObject.MagickImage();

    // The first and last arguments are the input and output file names.
    args[0] = src;
    args[args.Length - 1] = dest;
    var r = m.Convert(args);
    var bitmap2 = Bitmap.FromFile(dest);
    return (Bitmap)bitmap2;
}

Finally, add the processing of the character to the point mentioned above:

var imageMagicArgs = GetImageMagickArgs(ImageMagickString);
for (char ch = (char)minChar; ch < maxChar; ch++)
{
    Bitmap bitmap = RasterizeCharacter(ch);
    bitmap = ProcessCharacter(imageMagicArgs, ch, bitmap);
    // (the rest of the loop body is unchanged)
}

The end result as you probably can see is that each character is rasterized as normal, then fed to ImageMagick as a temporary file and then later re-read back into the bitmap.

This technique is not without shortcomings. Ideally, I would have used the ImageMagickObject.Stream method to feed the data in directly without the use of a temporary file. However, the docs for this are sorely lacking and it wasn't worth the time to figure out - the temporary files blast through so fast I really don't see the need to spend more time on that. For whatever reason, ImageMagick was keeping a write lock on each file until the component was finalized so I had to create a different file for each character (there is no Dispose method to control this, unfortunately). The biggest shortcoming of this however is that the processed bitmap is the same size as the input bitmap which means processing that spills over the edge will leave the rasterized/processed font with a visible edge. Ideally I would resize the bitmap to contain any effects that might be created for the font and then crop it and adjust font/spritemap metadata accordingly when it was done. I may do this at some later time, but for my specific needs today, this works. I just needed a small effect, something that fits within the existing bitmap.
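Since each character needs its own pair of temporary files, one small refinement is to build the names from the system temp directory instead of a hard-coded c:\temp folder. This is only a sketch: TempCharFiles and its methods are hypothetical names, and it uses nothing beyond System.IO.Path.

```csharp
using System;
using System.IO;

// Hypothetical helper that builds per-character temporary file names.
public static class TempCharFiles
{
    // Input file for ImageMagick, unique per character code.
    public static string SourcePath(char ch)
    {
        return Path.Combine(Path.GetTempPath(), "char-" + ((int)ch).ToString() + ".png");
    }

    // Output file, kept separate from the input file and from every other
    // character's files because each file may stay locked for a while.
    public static string OutputPath(char ch)
    {
        return Path.Combine(Path.GetTempPath(), "char-" + ((int)ch).ToString() + "-output.png");
    }
}
```

In ProcessCharacter, src and dest could then come from TempCharFiles.SourcePath(ch) and TempCharFiles.OutputPath(ch), which avoids depending on a c:\temp folder existing on the build machine.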

By feeding in the ImageMagick string "-alpha on ( +clone -channel A -blur 0x1.5 -level 0,50% +channel +level-colors black ) compose Over +swap", feeding the fonts through the pipeline and then running the game, I was rewarded with this image:

Much better!

Here's a closeup of the letter 'A' with and without processing:

The text is much more clear against the background and we now have a system which can be used for any font in any of our games moving forward. As a huge bonus, it's pretty much automated and we can tweak settings and regenerate the textures without dealing with individual letter image files. With a little more work to support larger output bitmaps, more dramatic effects could be used but this is a good solid step in the right direction.

In other news, Star Ninja is going to have local & global high scores tracked across four different game modes! :)

Tuesday, May 3, 2011

Using PIX to help figure out graphics glitches in Star Ninja

As Star Ninja rapidly approaches completion, I've been working on a lot of game polish tasks. I really want this game to make a great first impression and to that goal I've been working on streamlining the UI and making transitions between screens look nice. 

Recently, there was a problem with the screen transition logic that rendered a cross fade between the level selection and the gameplay screen over the course of a second or so. For a while, I didn't really think too much about what was essentially a 1-frame screen flicker but once I noticed it I knew it had to be fixed. 

Single frame render glitches are always hard to deal with unless you have the right tools & process. The first goal is to identify what is really going on. Second, reproduce it reliably and quickly; without that, a lot of time can be wasted. At this point you can iterate with the debugger and tools to take a close look at what is usually a problem with a lot of moving parts. That's where PIX comes in. 

To give an idea what I was looking at, here is the level picker menu, the glitched screen and a frame not long after the transition was done. (Note: the art and level are not final; this is a game in development after all!)

The menu screen cross fades with the game, but at the last frame of the crossfade it was doing this:

Here's the frame right after it:

Pretty glitchy there in the middle, but since it's only for one frame it's almost impossible to detect it as more than just a flicker. PIX can record an XNA application's stream of graphics device calls, giving you the ability to analyze every last call made to DirectX. This is an enormously useful tool for diagnosing problems such as these.

Because this is happening for only a single frame, I chose to record the stream rather than fumble with breakpoints, which is always a hassle when dealing with UI and timing related problems. To do this, the PIX experiment needs to be set up as such:

Note that you have to create and configure each trigger; there isn't a magic "setup stream recording" button. No big deal once you know what to do though. I find I need to check "Disable D3DX analysis" on the Target Program tab when using PIX with XNA apps; it doesn't work without that for me. Might be my system configuration, or maybe an XNA compatibility issue (who knows).

So, once set up, click Start Experiment to run the game. Press your key to start and stop the stream recording to capture the problem. Exit the application, wait a moment, and PIX will pop up a new window like this:

From here, you can "scrub" the video to any frame and drill down into any frame to inspect the sequence of DirectX events. After a bit of digging around in the data, I found that the cause of the problem could be seen by selecting the "Render" tab, finding the correct frame of the stream that showed the problem and then selecting the Depth channel in the "Channel(s):" combo box. This is what I saw:

Clearly, the menu was writing the depth buffer and the game screen wasn't able to draw correctly because of this. Keep in mind this was happening during the transition, where the code is actually drawing both screens to make the fade effect work. This is a new situation for the game because prior to the transitions being added, all screens were rendered without concern for how they might interact with other screens. 

I didn't want to render into render targets and then back to full screen quads because that would be too slow. The code is already using a render target for the gameplay screen as it fades in; this problem happened on the very last frame as the gameplay screen was rendering directly to the back buffer like it does when it's not part of a transition. What was happening is the gameplay screen was affected by the previous frame's depth buffer results, but only for the one frame. To save an extra bit of time here and there, the game doesn't do a full screen clear at the beginning of each frame unless it is known to be necessary. Some people have told me that the screen clears are so fast that I shouldn't bother, it's too early to optimize, and I should just clear it whenever it's convenient, but the fact is the PIX logging shows the Clear operation takes enough time that it's worth it to me to avoid doing it when possible. During transitions, the phone is already using a lot of GPU because of the render target usage so this is a good time to avoid wasteful operations. Optimizing early is sometimes a bad thing, but if I know at the outset that between various options one is faster than another and not too much trouble to do I'll always go for the faster option because it tends to make the entire application more robust in the long run with fewer architectural performance problems to go back and wish I had done right in the first place.

To fix this, I simply added this line of code between the two screens to reset the depth buffer during the transition, causing the screen draw order to determine visibility:

GraphicsDevice.Clear(ClearOptions.DepthBuffer, Color.Black, 1, 0);

In the end, it was a simple bug that was easy to fix. Without PIX, I would have been left recording video and analyzing it frame by frame and guessing what was wrong. Fortunately I was able to use PIX to quickly identify and solve the problem.

Monday, April 18, 2011

Farseer inactive object optimizations and other minor features

I'm about to submit a patch for the optimizations I mentioned in my previous post. The description field on CodePlex has a size limit which doesn't allow me to put all my notes in there so I'm putting a copy here for people to reference.

I've been running with these optimizations for almost two weeks now and everything seems to be working correctly both in our next game and in all the Farseer testbed examples. Hopefully I didn't overlook anything :)

Here are the full notes:

This commit contains a number of optimizations primarily related to the overhead of inactive objects. By request this includes all my other recent changes in a single patch including fixes to joint math and a few other small features. Inactive objects now have very little overhead, and fixtures can be selective about when CCD is performed between objects.

There are #defines to control the new features, which helps with testing their behavior against the unoptimized (previous) version of the logic they control. I considered removing them before submitting, but given the complexity of the various optimizations I thought it best to leave both code paths in place for the time being; we can remove the slower code path in a subsequent commit. The #defines also help to show which optimization each code change relates to, given this is a large commit of mostly unrelated changes. The #defines are currently in the .cs files, sometimes #defined in multiple files as needed to keep code in sync.

This commit also includes ports of Box2D rev 167 which has joint related fixes.

The #defines are:
USE_AWAKE_BODY_SET - Reduces iteration costs to only the bodies that are awake.
USE_ACTIVE_CONTACT_SET - Reduces iteration costs to only the contacts that are active.
USE_IGNORE_CCD_CATEGORIES - Allows content to define fixtures that ignore CCD with only a subset of objects. This allows for the selective use of CCD where it has value instead of the all-or-nothing setting previously available.
USE_ISLAND_SET - Reduces iteration costs for island logic to only bodies that are determined to be in an active island.
OPTIMIZE_TOI - Reduces iteration costs of CCD logic to only the bodies participating in CCD.

Body changes:
* Body has been optimized to use "Awake Body Sets" which causes various iterators to only evaluate active objects.
* Body.Body no longer forces new bodies to Awake.
* Body.BodyType now sets Awake=false if the body type is set to Static, and wakes the body otherwise.
* Body.Awake.set now updates contacts when the body is set to Awake.
* Body.Awake.set now adds or removes the body from the AwakeBodySet as needed.
* Body.Awake.get now returns false if the body type is Static.
* Added bool Body.InWorld, which tracks if the Body has been added to the world or not.
* Added Category Body.IgnoreCCDWith. Existing behavior is unchanged if unset. This can be useful for reducing CCD overhead. Note this is a setter only, and that it propagates the value down to fixtures the same way that the existing CollidesWith and similar setters do.
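
The awake-set bookkeeping in the Body.Awake items above can be sketched roughly as follows. This is a simplified illustration and not the Farseer source: `SketchWorld` and `SketchBody` are hypothetical stand-ins for World and Body, a static body is modeled with a plain bool instead of BodyType, and the contact updates and InWorld checks are left out.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified stand-in for World: it only tracks which bodies are awake.
public class SketchWorld
{
    public readonly HashSet<SketchBody> AwakeBodySet = new HashSet<SketchBody>();
}

// Hypothetical, simplified stand-in for Body.
public class SketchBody
{
    private readonly SketchWorld _world;
    private bool _awake;

    public bool IsStatic { get; private set; }

    public SketchBody(SketchWorld world, bool isStatic)
    {
        _world = world;
        IsStatic = isStatic;
    }

    public bool Awake
    {
        // A static body never reports itself as awake.
        get { return !IsStatic && _awake; }
        set
        {
            _awake = value;

            // Keep the world's awake set in sync with the effective awake state.
            if (Awake)
                _world.AwakeBodySet.Add(this);
            else
                _world.AwakeBodySet.Remove(this);
        }
    }
}
```

With this shape, update loops can iterate `AwakeBodySet` instead of every body in the world, and waking a static body leaves the set untouched.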

ContactManager changes:
* Added ContactManager.ActiveContacts which is used to limit which contacts are evaluated during updates.
* ContactManager.Destroy(Contact contact) updated to perform active contact management.
* ContactManager.Collide now iterates only the contacts found in the active contact set, adding/removing members as needed.
* Added ContactManager.UpdateContacts(ContactEdge contactEdge, bool value) which is used by Body objects when their Awake status changes.
* Added ContactManager.RemoveActiveContact(Contact contact) which is used by Contact.Destroy() to ensure the active contact set is properly updated.

Other contact related changes:
* Contact.Destroy() now calls ContactManager.RemoveActiveContact(this).
* ContactSolver.InitializeVelocityConstraints() now sets k_maxConditionNumber = 1000.0f to match Box2D rev 167.

Fixture changes:
* Added Category Fixture.IgnoreCCDWith which allows specific fixtures to ignore CCD with specific categories of objects. This allows a fixture to be configured to ignore CCD with objects that aren't a penetration problem due to the way content has been prepared, such as slow moving fixtures that usually only interact with static objects but occasionally need to react to bullets.
* Fixture.Fixture now sets _collisionCategories = Settings.DefaultFixtureCollisionCategories and _collidesWith = Settings.DefaultFixtureCollidesWith, allowing apps to configure default values in the settings.
* Fixture.Fixture sets IgnoreCCDWith to Settings.DefaultFixtureIgnoreCCDWith.
* Added Fixture.UserBits, which is a long value for use by the application. This is unused by Farseer, but it is Cloned and used by CompareTo as expected.
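
One plausible way to express the Fixture.IgnoreCCDWith filter from the list above is a bit mask test against the other fixture's collision categories. This is a sketch under assumptions: `SketchFixture` and `CcdFilter.ShouldSkipCcd` are hypothetical names, the Category enum is cut down to a few values, and the actual test in the patch may be written differently.

```csharp
using System;

// Cut-down flags enum for illustration; the real Category enum has more values.
[Flags]
public enum Category
{
    None = 0,
    Cat1 = 1,
    Cat2 = 2,
    Cat3 = 4,
    All = int.MaxValue
}

// Hypothetical, minimal fixture holding only the fields this sketch needs.
// The defaults mirror the Settings defaults described in these notes.
public class SketchFixture
{
    public Category CollisionCategories = Category.Cat1;
    public Category IgnoreCCDWith = Category.None;
}

public static class CcdFilter
{
    // Returns true when either fixture asks to ignore CCD with a category
    // that the other fixture belongs to.
    public static bool ShouldSkipCcd(SketchFixture a, SketchFixture b)
    {
        return (a.IgnoreCCDWith & b.CollisionCategories) != Category.None
            || (b.IgnoreCCDWith & a.CollisionCategories) != Category.None;
    }
}
```

For example, a slow moving crate in Cat1 could set IgnoreCCDWith to the category used for static scenery while leaving the bullet category out of the mask, so bullets still get CCD against it.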

Joint changes:
* DistanceJoint has had some comments pulled over from Box2D.
* FixedRevoluteJoint.LimitEnabled only wakes the body if the limit changed and sets _impulse.Z to zero as part of Box2D rev 167.
* FixedRevoluteJoint.LowerLimit only wakes the body if the limit changed and sets _impulse.Z to zero as part of Box2D rev 167.
* FixedRevoluteJoint.UpperLimit only wakes the body if the limit changed and sets _impulse.Z to zero as part of Box2D rev 167.
* Added FixedRevoluteJoint.SetLimits(float lower, float upper) to match Box2D.
* Renamed FixedRevoluteJoint.MotorTorque to MotorImpulse to correctly reflect what it returns.
* Added FixedRevoluteJoint.GetMotorTorque(float inv_dt) to match Box2D.
* FixedRevoluteJoint.SolveVelocityConstraints has been updated to match Box2D rev 167, which has various mass related math fixes.
* PrismaticJoint.LowerLimit, .UpperLimit, .SetLimits, .MotorForce->.MotorImpulse, .GetMotorForce have been updated in similar ways to FixedRevoluteJoint for Box2D rev 167.
* Added RevoluteJoint.RevoluteJoint(Body bodyA, Body bodyB, Vector2 worldAnchor), which simply calculates the local anchors for both bodies.
* RevoluteJoint.LowerLimit, .UpperLimit, .SetLimits, .MotorTorque->.MotorImpulse, .GetMotorTorque and .SolveVelocityConstraints have been updated in similar ways to FixedRevoluteJoint for Box2D rev 167.

World changes:
* Added World.AwakeBodySet which tracks all active bodies.
* Added World.AwakeBodyList which is a short term list used during updates.
* Added World.IslandSet which is a temporary set used during World.Solve.
* Added World.TOISet which is a temporary set containing objects participating in CCD.
* World.World now initializes the AwakeBodySet, AwakeBodyList, IslandSet and TOISet.
* World.RemoveBody removes the body from the AwakeBodySet if needed.
* World.ProcessChanges asserts that all bodies in the AwakeBodySet are in the BodyList (#if DEBUG only)
* World.ProcessAddedBodies adds/removes bodies from the AwakeBodySet as needed, and sets Body.InWorld to true for all added bodies.
* World.ProcessRemovedBodies asserts the AwakeBodySet doesn't contain the body being removed, since that would indicate an earlier failure to maintain lists correctly. This is checked again at the end because callbacks could (and have) re-added objects incorrectly by indirectly causing Body.Awake to be set to true.
* World.ProcessRemovedBodies sets body.InWorld to false.
* Added World.SetIsland(Body body), which is used for keeping track of all bodies that are in an island.

World.Solve changes:
* Iterates only the active contacts to clear the ContactFlags.Island flag.
* Iterates only the active bodies during the main solve loop.
* Adds each nonstatic body to the IslandSet during the main solve loop to help with fixture update loops (below)
* World.Solve fixture update loop changes:
  * Iterates only the bodies in the IslandSet to update their fixtures (instead of all bodies in the world).
  * No longer expects BodyType.Static objects, and asserts if it finds any because they shouldn't be moving and thus shouldn't be in the island update set.
  * Adds all bodies in the IslandSet to the TOISet which is used later to optimize SolveTOI. Note the TOISet is not necessarily empty, and this is not just a copy of the IslandSet.

World.SolveTOI changes:
* Only clear BodyFlags.Island and init Sweep.Alpha0 for bodies in the TOISet (instead of all bodies in the world).
* Invalidate the TOI only for active contacts (instead of all contacts in the world).
* Find TOI events only for active contacts (instead of all contacts in the world).
* Add support for Fixture.IgnoreCCDWith.
* After determining that two bodies need to interact, if the TOI step is complete, clear BodyFlags.Island and init Sweep.Alpha0 for each of the two bodies that isn't yet in the TOISet.
* Clear the TOISet if the call to World.SolveTOI was the last iteration.

FarseerPhysics:Settings changes:
* Added public static Category DefaultFixtureCollisionCategories = Category.Cat1. This is used by the Fixture constructor as the default value for Fixture.CollisionCategories member.
* Added public static Category DefaultFixtureCollidesWith = Category.All. This is used by the Fixture constructor as the default value for Fixture.CollidesWith member.
* Added public static Category DefaultFixtureIgnoreCCDWith = Category.None. This is used by the Fixture constructor as the default value for Fixture.IgnoreCCDWith member.

Misc changes:
* GameSettings now has float Hz, used in place of the hard coded 30 Hz (phone) / 60 Hz (PC/Xbox) used by various tests.
* RevoluteTest.RevoluteTest has been updated to match Box2D rev 167.
* SliderCrankTest.Update now uses GetMotorTorque(settings.Hz)
* DynamicTree.Rebalance has been removed (Erin Catto confirmed this is no longer required).
* DynamicTreeBroadPhase.UpdatePairs no longer calls DynamicTree.Rebalance.
* HashSet<T>.CopyTo has been implemented.

Tuesday, April 5, 2011

Recent Farseer optimizations

We're working hard on our next game right now and one thing I've been focusing on is the performance of the Farseer Physics Engine. It's a great engine but I have seen a few things that could use some improvements so I spent some time recently doing just that.

A few days ago, I made some changes for which I submitted Farseer patch #9000 to CodePlex. The original motive was to fix some revolute joint bugs by bringing over some fixes from the Box2D engine which Farseer is based upon. To my pleasant surprise, the update also resulted in a nice performance boost thanks to the code no longer needing to rebalance a tree structure all the time. At first I was nervous about seeing that change, but I checked with the Box2D forum where Erin Catto (the admin/author of Box2D) was kind enough to confirm this is a valid change. Check out the comparison shots (the optimized version is on top):

So that was a pretty nice performance bonus that came along with the revolute joint fixes. I've been using these changes locally for a few days now and they seem to be working fine, so I hope to see them applied to Farseer at some point.

While looking into all this I saw a few things I could do to improve performance further, so I did. I added four different optimizations, listed in order of gain:

  1. Active Contact sets. The world no longer iterates every Contact during World.Solve(), World.SolveTOI() and ContactManager.Collide(); it instead maintains a set of contacts that are active and reduces how many objects it has to evaluate. In some cases where scenes are mostly inactive, this can provide a dramatic improvement and should allow Farseer games to be much more heavily populated with inactive objects.
  2. Awake Body sets. The world no longer iterates every Body during World.Solve(); it maintains a set of Body objects which are active. This is enough information for the Solve code to perform collision island processing while iterating the minimum number of Body objects.
  3. Island Body sets. This optimization allows World.Solve() to track all the Body objects that have been processed by the island code so they may have the appropriate secondary logic applied to them (synchronizing fixtures). This avoids another iteration of every body in the world.
  4. Body.Awake never returns true if the body is static, allowing various bits of code to do less work. I haven't yet seen a reason why a static body would ever need to be reported as Awake, and I haven't seen any side effects from this change. The main benefit of this is to make the above optimizations more effective since the related code no longer iterates static objects except as a byproduct of the island logic.
With these changes, it seems most of the "iterate everything" logic in the core solvers is no longer happening for objects that aren't awake. This means sleeping objects have had their per-frame recurring overhead reduced to less than 20% of its original amount. Sleeping objects are so low on the radar at this point that I don't think I'll be concerned about loading up scenes with more objects so long as I can ensure they don't all activate at once. This was actually an issue with Moonlander because the original design was to make a fairly large map and leave it resident. I wound up having to add/remove terrain chunks as you fly around due to the surprisingly large overhead that having a bunch of bodies in the world incurs.
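
To make the active contact set idea concrete, here is a minimal sketch. It is not the Farseer code: `DemoBody`, `DemoContact` and `DemoContactManager` are hypothetical simplified types, and the rule that a contact is active when at least one of its bodies is awake is my assumption for the illustration.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified body: only the awake flag matters here.
public class DemoBody
{
    public bool Awake;
}

// Hypothetical, simplified contact between two bodies.
public class DemoContact
{
    public readonly DemoBody BodyA;
    public readonly DemoBody BodyB;

    public DemoContact(DemoBody bodyA, DemoBody bodyB)
    {
        BodyA = bodyA;
        BodyB = bodyB;
    }
}

public class DemoContactManager
{
    public readonly List<DemoContact> AllContacts = new List<DemoContact>();
    public readonly HashSet<DemoContact> ActiveContacts = new HashSet<DemoContact>();

    public void Add(DemoContact contact)
    {
        AllContacts.Add(contact);
        Refresh(contact);
    }

    // Call this when the awake state of either body changes.
    // Assumption: a contact is active if at least one of its bodies is awake.
    public void Refresh(DemoContact contact)
    {
        if (contact.BodyA.Awake || contact.BodyB.Awake)
            ActiveContacts.Add(contact);
        else
            ActiveContacts.Remove(contact);
    }

    // Visits only the active contacts and returns how many were visited.
    public int Collide()
    {
        int visited = 0;
        foreach (DemoContact contact in ActiveContacts)
        {
            // Narrow-phase collision work for this contact would go here.
            visited++;
        }
        return visited;
    }
}
```

The per-frame cost of `Collide` then scales with the number of active contacts and not with `AllContacts.Count`, which is where a mostly sleeping scene saves its time.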

Here's a screenshot of before/after:

The contact management was the lion's share of the optimization here, pulling about 85% of the gains. The awake body logic was about 10% and the island set was a pretty minor one at around 5%. Those numbers are approximate; I didn't take notes. A quick glance at the branch of Box2D I have here suggests it could benefit from a port of these changes, fwiw.

I've tested these changes against all the TestBed samples and our game and everything seems fine, but I'm going to wait a little longer before building a patch for CodePlex because I want to let it cook a bit in case something pops up. There's also a chance I may be able to reduce that CCD time.

These changes are looking promising right now. Hopefully I'll be able to put a patch up in a week or so once I've become more comfortable with the changes and wrapped up any other related optimizations.

Wednesday, March 23, 2011

WP7 Devs- Ever get tired of dealing with the locked screen when you press F5?

There's an easy way to avoid ever having a deployment/debugging session aborted due to a locked screen. As you know, Windows Phone doesn't let us turn off the screensaver. If you are like me, you also don't want to disable & re-enter your password all the time either.

There's an easy fix for this problem though.

Just create a generic XNA game project called "StopLock" or whatever suits you, and add this line to the Game constructor:

Guide.IsScreenSaverEnabled = false;

Build the project, deploy it, and run it before you start a coding session. When the game you are actually working on is started by the debugger, it will never give you any errors due to the device being locked because StopLock will have kept it open. When your program terminates, it will resume the StopLock app which will once again disable the screen saver.

Monday, March 21, 2011

Poll: Does your XNA work pay the bills for you?

Does your XNA dev work pay the bills?

An interesting question came up in IRC today. How many people doing XNA work are really able to survive at it, financially?

If you do XNA work, it would be interesting to know what kind of success you are seeing. Obviously this poll won't reach everyone doing XNA work but the results might be interesting anyway.

Edit: By XNA work, I'm referring to XNA in whatever combination of products & revenue you are able to pull together where XNA code & processes are a core aspect of the work.

Edit 2: Please only vote if you are an XNA application builder, not someone who actually works on XNA at Microsoft. I thought it was pretty clear what the intent was here but there have been a few questions about this. Thanks.

The poll is set up to run for ten years but you can view stats or change your answer at any time should your situation change :)

So, does your XNA work pay the bills for you?

If your answer is "Yes" and you feel like talking about your business model a little bit, please feel free to add a comment. Thanks!


Thanks for swinging by. As you can see, this is my first post to the new Bounding Box Games dev blog. Hopefully what I post here will be of interest to people who come here, as well as a way for me to get a little better at what I do by thinking more deeply about things related to game development while I write. I'm planning on discussing various topics related to XNA, XBLIG, Windows Phone and whatever else I encounter that seems relevant during my work on Bounding Box Games projects. I'll be doing my best to keep up with any comments or questions people have while I'm at it.

Bounding Box Games LLC is a small operation right now. It's just me trying to make this crazy thing work. I'm thankful to have a very supportive and understanding wife who could see how important it was to me to leave the comfort and security of a great job to try to make this happen. It wasn't easy to take that first step. In fact, I wasn't really sure I was ready for it until one day at work I realized I was at the perfect time between projects, where I wasn't going to be letting anyone down by leaving, and the emerging Windows Phone market was looking very interesting to me. It seemed like an opportunity would be wasted if I didn't try, right then. So here I am!