Tuesday, April 5, 2011

Recent Farseer optimizations

We're working hard on our next game right now, and one thing I've been focusing on is the performance of the Farseer Physics Engine. It's a great engine, but I've seen a few things that could use some improvement, so I spent some time recently doing just that.

A few days ago, I made some changes for which I submitted Farseer patch #9000 to Codeplex. The original motivation was to fix some revolute joint bugs by bringing over some fixes from the Box2D engine, which Farseer is based on. To my pleasant surprise, the update also resulted in a nice performance boost, thanks to the code no longer needing to rebalance a tree structure all the time. At first I was nervous about that change, but I checked with the Box2D forum, where Erin Catto (the admin/author of Box2D) was kind enough to confirm this is a valid change. Check out the comparison shots (the optimized version is on top):

So that was a pretty nice performance bonus that came along with the revolute joint fixes. I've been using these changes locally for a few days now and they seem to be working fine, so I hope to see them applied to Farseer at some point.

While looking into all this I saw a few things I could do to improve performance further, so I did. I added four different optimizations, listed in order of gain:

  1. Active Contact sets. The world no longer iterates every Contact during World.Solve(), World.SolveTOI() and ContactManager.Collide(); it instead maintains a set of contacts that are active, which reduces how many objects it has to evaluate. In scenes that are mostly inactive, this can provide a dramatic improvement and should allow Farseer games to be much more heavily populated with inactive objects.
  2. Awake Body sets. The world no longer iterates every Body during World.Solve(); it maintains a set of Body objects which are active. This is enough information for the Solve code to perform collision island processing while iterating the minimum number of Body objects.
  3. Island Body sets. This optimization lets World.Solve() track all the Body objects that were processed by the island code so they can have the appropriate secondary logic applied to them (synchronizing fixtures). This avoids another iteration of every body in the world.
  4. Body.Awake never returns true if the body is static, allowing various bits of code to do less work. I haven't yet seen a reason why a static body would ever need to be reported as Awake, and I haven't seen any side effects from this change. The main benefit of this is to make the above optimizations more effective since the related code no longer iterates static objects except as a byproduct of the island logic.

With these changes, it seems most of the "iterate everything" logic in the core solvers no longer runs for objects that aren't awake. This means the recurring per-frame overhead of sleeping objects has dropped to less than 20% of its original amount. Sleeping objects are so low on the radar at this point that I don't think I'll be concerned about loading up scenes with more objects, so long as I can ensure they don't all activate at once. This was actually an issue with Moonlander, because the original design was to make a fairly large map and leave it resident. I wound up having to add and remove terrain chunks as you fly around because of the surprisingly large overhead that simply having a bunch of bodies in the world incurs.
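
To make the bookkeeping behind the four items above a bit more concrete, here is a minimal sketch of the idea. It is not the Farseer code: it's written in Python rather than C# to keep it short, and every name in it (Body, Contact, World, awake_changed, refresh, synced) is made up for illustration. It assumes a contact counts as active while at least one of its bodies is awake, and that islands are built by a simple flood fill that does not spread through static bodies.

```python
class Body:
    def __init__(self, world, is_static=False):
        self.world = world
        self.is_static = is_static
        self.contacts = []        # contacts touching this body
        self.synced = 0           # times the fixtures were "synchronized"
        self._awake = False
        world.bodies.append(self)

    @property
    def awake(self):
        # Item 4: a static body never reports itself as awake.
        return self._awake and not self.is_static

    @awake.setter
    def awake(self, value):
        self._awake = bool(value)
        self.world.awake_changed(self)


class Contact:
    def __init__(self, world, a, b):
        self.a, self.b = a, b
        a.contacts.append(self)
        b.contacts.append(self)
        world.contacts.append(self)
        world.refresh(self)

    def other(self, body):
        return self.b if body is self.a else self.a


class World:
    def __init__(self):
        self.bodies, self.contacts = [], []
        self.awake_bodies = set()     # item 2
        self.active_contacts = set()  # item 1

    def refresh(self, contact):
        # Assumption: active means at least one of the two bodies is awake.
        if contact.a.awake or contact.b.awake:
            self.active_contacts.add(contact)
        else:
            self.active_contacts.discard(contact)

    def awake_changed(self, body):
        if body.awake:
            self.awake_bodies.add(body)
        else:
            self.awake_bodies.discard(body)
        for contact in body.contacts:  # only this body's contacts
            self.refresh(contact)

    def solve(self):
        island_bodies = set()          # item 3
        for seed in list(self.awake_bodies):
            if seed in island_bodies:
                continue
            island_bodies.add(seed)
            stack = [seed]
            while stack:
                body = stack.pop()
                if body.is_static:
                    continue           # islands do not spread through statics
                body.awake = True      # anything pulled into an island wakes
                for contact in body.contacts:
                    other = contact.other(body)
                    if other not in island_bodies:
                        island_bodies.add(other)
                        stack.append(other)
        for body in island_bodies:
            body.synced += 1           # stand-in for synchronizing fixtures
        return island_bodies


world = World()
ground = Body(world, is_static=True)
sleepers = [Body(world) for _ in range(1000)]
for s in sleepers:
    Contact(world, s, ground)          # resting on the ground, asleep
a, b = Body(world), Body(world)
Contact(world, a, b)
Contact(world, b, ground)
a.awake = True
ground.awake = True                    # has no effect: static bodies stay "asleep"
touched = world.solve()
print(len(world.bodies), len(touched), len(world.active_contacts))  # prints: 1003 3 2
```

In this toy scene the solve step touches only the two moving bodies and the ground they rest on, and the thousand sleeping bodies are never visited. The trade-off is that the sets have to be kept in sync whenever a body's awake state changes or a contact is created, so that cost moves from every frame to the moments when state actually changes.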

Here's a screenshot of before/after:

The contact management was the lion's share of the optimization here, accounting for about 85% of the gains. The awake body logic was about 10%, and the island set was a pretty minor one at around 5%. Those numbers are approximate; I didn't take notes. A quick glance at the branch of Box2D I have here suggests it could benefit from a port of these changes, fwiw.

I've tested these changes against all the TestBed samples and our game, and everything seems fine. I'm going to wait a little longer before building a patch for Codeplex, though, to let it cook a bit in case something pops up. There's also a chance I may be able to reduce that CCD time.

These changes are looking promising right now. Hopefully I'll be able to put a patch up in a week or so once I've become more comfortable with the changes and wrapped up any other related optimizations.

1 comment:

  1. Quick update: I was able to optimize the CCD time for the sleeping sphere test down to numbers in the 25-50 range by applying a similar optimization there. I think the total overhead reduction for sleeping objects is now somewhere above 90%.