Benchmarking gotchas

January 21, 2009 on 12:45 am | In Actionscript | 11 Comments

Just a quick tip to get the numbers right when testing the execution time of your ActionScript code; you only get correct results if you follow two steps:

  1. Compile your code with the debug flag set to false (mxmlc: -debug=false). Otherwise your swf gets bloated with byte code used by the debug player.
  2. Always test your swf with the release player. Most developers have set the debug player as the default one, but it doesn’t reflect real world performance because it can be much slower. If the context menu has the option “show redraw regions” you are running the debug player.

If you are not aware of this you might unintentionally spent much time in implementing some clever optimizations which appear to run faster in the debug player, but perform worse in the release player. Here is an example for the sin/cos approximation:

Debug player Release player
Math.sin() + Math.cos() 340 ms Math.sin() + Math.cos() 142 ms
inline sin + cos approximation (high precision) 440 ms inline sin + cos approximation (high precision) 11 ms

Feature-based geometrical algorithms

January 11, 2009 on 3:29 pm | In Actionscript, Physics | 11 Comments

Performing collision queries over a large number of objects is computationally expensive and therefore it’s important to use clever algorithms to avoid bottlenecks. After all potentially colliding object pairs have been sorted out during the broad phase, the remaining task is to run some algorithms to determine if and where the objects intersect. Of course there are tons of ways to do this.

motor2 mainly uses a SAT approach for this task, which is fast, easy to implement and quite robust. The only disadvantage is that you end up with an unpleasant quadratic-time complexity when shapes come into contact. In hope to improve this situation I spent some time learning different collision detection algorithms and stumbled across some papers talking about feature-based algorithms which like the name implies operate on the features (vertex, edge, face) of two polygons/polyhedra to determine if they are disjoint.
To say in advance: It’s really hard to find a better replacement for SAT in 2D because things are just so simple in two dimensions (this behaves differently in 3D, where SAT can be really slow for complex polyhedral shapes, since you need to check many more separation axes…).

The Lin-Canny algorithm [1] seems to be the most commonly used of these types of algorithms. The V-Clip, or Voronoi-Clip algorithm [2] is an enhancement of Lin-Canny and according to the author offers superior numerical robustness and can handle penetrations and degenerate situations as well.
I’m not quite sure if the well-known GJK algorithm (Gilbert-Johnson-Keerthi) falls under this category as well – from the geometrical point of view it’s simplex-based and looks at the minkowski difference.

Anyway, while I found many Lin-Canny implementations, V-Clip is a bit rare and after adopting the algorithm to 2D (fun!) I was wondering why so few people are using it until Erin Catto, the brain behind Box2D, pointed out that the algorithm is patented – I simply missed that (always read the small print!).

Here’s the result:

The V-CLIP algorithm in 2D, shaded areas represent voronoi regions of the nearest features, the solid line the minimum positive distance

Let me shortly explain how the algorithm works: The basic theorem states that if X and Y are a pair of features on two separated shapes, then if X is contained in the Voronoi Region of Y and Y is contained in the Voronoi Region of X, X and Y are the closest features and thus must contain the closest points between those features. Knowing the closest points, we can then compute the distance between them and declare a collision when the distance falls below some small value.

In a physics simulation we are usually integrating over a small time step so we can assume that objects only move by a small amount (called spatial and temporal coherence). This also means the new pair of closest features is probably near the last one from the previous time step, so an almost constant query time is achieved when re-starting the algorithm using the last result.

Talking about performance, V-CLIP is roughly 2-3 times faster than SAT and even better – in most situations it performs independently of the number of polygon vertices – which makes sense since you are only looking at the “difference” from the last state rather than the whole picture. Unluckily, the last problem I’m facing preventing it from being useful is that you have to do some extra work in creating a good contact manifold, so at the end it’s actually slower than SAT. But I have not yet given up and have some ideas to get the job done faster.

If you are interested in the code, it’s included in motor2 since version 0.9.
I have now looked more closely at the open Lin-Canny algorithm which is very similar to the Voronoi Clip algorithm, especially in 2D where I don’t need some extra cases introduced by the paper. For now I have dumped the V-Clip classes from my engine (they weren’t used anyway) and will revise the algorithm in the future.

There also other interesting applications, for example it allows you to compute the time of impact (TOI) for conservative advancement.

[1] A Fast Algorithm for Incremental Distance Calculation
[2] V-Clip: Fast and Robust Polyhedral Collision Detection

Ace of Mace

January 9, 2009 on 5:03 pm | In Games, Links | 3 Comments

Last year some guys started to work on a game called Ace of Mace built on top of my physics engine. It was officially released yesterday and won a price at the Europrix Multimedia Award before. Today it’s featured as the Adobe Site of the day. Great work!

Speaking about motor2, version 0.9 was released at the end of last year, and now I’m working on it to get a major release done.

Collision detection for particle systems

July 13, 2008 on 11:16 pm | In Actionscript, Physics | 11 Comments

One of the things I’m working on for motor2 is the inclusion of a particle system, mainly for simulating bullets, fluids and soft bodies. The main challenge is to build a fast particle-polygon collision detection which would be able to compute the penetration depth and collision normal for a huge amount of particles. The containment check could be done with a binary search over the vertices of the polygon, an O(n log(n)) operation, but it doesn’t give you the information to push the particle outward.

After reading some papers I finally found the solution to my problem, namely a signed distance field.
A distance field is a scalar field that measures the distance from a given point to an object or data volume. Each element in a distance field specifies its minimum euclidean distance to the shape. Positive and negative distances are used to distinguish outside and inside of the shape (thus signed). This information is precomputed and stored with the shape.

Actually it’s very simple in 2D: First, the shape’s bounding box is divided into a regular grid. Second, each grid cell is analyzed – it can be either outside, inside or intersecting one or multiple edges. For the inside and outside case, a flag is stored with the cell. For the intersecting case, all piercing edges are found and converted into planes, stored in the point-normal form, n (X – P) = 0. At the end you are left with a very simple plane-point distance check.
A lower resolution requires less memory, but makes the lookup slower. The opposite is true for very small cells, because then each cell is likely to contain only one edge to test against. The demo below shows the ‘rasterized’ distance field:

In the next demo you see a ‘pseudo color representation’ of the distance field. As you see, no information is lost, and the original shape can be reconstructed by extracting a point-based contour at true Euclidean distance and any level of accuracy:

The last demo shows the minimum translation distance vector (MDT) obtained by evaluating the distance field. The vector, with its length and direction has the required information to push the particle outward if it’s contained by the polygon:

The steps are:

  1. Transform the particle to the polygon’s local space.
  2. Evaluate the distance field and resolve collision.
  3. Transform the new particle’s position back to world space.

4 dot products, 6 additions (step 1 + 3) and in the best case one array lookup and one dot product (step 2) are required. Of course, the method can fail if a particle moves very fast, in this case the particle’s movement should be modeled as a ray, which is then clipped against the polygon.

The source code for the distance field is not yet included in the motor2 svn, since it would mess up existing classes, but it will be very soon.

Using object pools

June 18, 2008 on 4:58 pm | In Actionscript, Physics | 39 Comments

Joa Ebert is right when he says that utilizing object pools can make your code perform a lot faster. An object pool is just a container for a bunch of pre-constructed objects that are kept in memory ready for use, rather than being repeatedly allocated and destroyed on demand.

Object pooling makes sense if:

  • you create dozens of short-lived objects in real-time applications like games
  • you need to store and share temporary data throughout complex algorithms
  • the objects are expensive to create (many fields, complex inheritance chain, nested objects)
  • the objects are expensive to remove (unregister listeners, nullify instances)

The only drawback is that memory consumption will raise, but with ridiculously low memory prices this shouldn’t be problem if used wisely ;-)

Implementation

So here is my ObjectPool.as manager class which is an attempt to create a lightweight, fast and reusable solution. The implementation is based on a circular list, and the API is very simple. Download: ObjectPool_v1.0.zip (source, example, asdoc)

EDIT
I have updated the class so it also accepts a factory for object construction.
Download: ObjectPool_v1.1.zip

First, we create the object pool:

var isDynamic:Boolean = true;
var size:int = 100;

var pool:ObjectPool = new ObjectPool(isDynamic);
pool.allocate(MyClass, size);

The isDynamic flag defines the behavior for an empty pool. If true, the pool automatically creates a new bunch of objects for you. If false, the class throws an Error to indicate that the pool is empty. The size value indicates the pool’s capacity – if the pool is dynamic, the pool grows by the initial size each time it becomes empty so it actually never dries up.

By calling the allocate method the pool is filled with 100 instances of MyClass. You can always reuse the pool for another Class by invoking this method again.

If you need to initialize the objects by passing arguments to it, you can do this with a little helper method called initialize:

pool.initialize("funcName", [arg0, arg1,...]);

This goes through every object and applies the function with the given arguments upon each object. This can also be done by reading each object, calling the function and putting it back:

for (var i:int = 0; i < pool.size; i++)
{
	var o:MyClass = pool.object;
	o.init(arg0, arg1, ...);
	pool.object = o;
}

Now to get an instance of MyClass you access the pool like this:

myObjectArray[i] = pool.instance;
//instead of
myObjectArray[i] = new MyClass();

When you are done with your object, instead of throwing it into the garbage collector, you "recycle" it for the next use:

pool.instance = myObjectArray[i];

This assumes that you are storing your instances in an array or something else because if you loose the reference, well it's lost and can't be reused anymore :-) And be careful not to assign the object twice, since then your pool would contain duplicates of the same object!

That's all, pretty simple right ?

Finally, there is the purge() method, which is only interesting for pools that are dynamic. As the pool grows with the demands of the application, it can get quite big. The purge methods scans the pool and removes all allocated but currently unused objects, leaving with a compact representation.

Demo

Here is a little demo which demonstrations how the pool works internally. Actually it's very simple. Pressing the RIGHT arrow key reads an object from the pool (first row), which is then stored in the second row beneath. Pressing the LEFT arrow key gives the object back to the pool. Pressing the ENTER key performs a purge() operation. The purple circle points to the node where the next object is read, the blue circle to an empty node where the insertion is performed.

Performance

Benchmarking revealed that it's always faster to cache instances, even for the generic Object class. All benchmarks were done with the release player 9.0.124 on Vista, an object pool size of 100 and with 100 iterations each to get an average value.

The purple bar indicates the time needed to access the pool:

for (var i:int = 0; i < k; i++) instances[i] = p.instance;

The blue bar measures the task of reading the objects, then putting them back:

for (i = 0; i < k; i++) instances[i] = p.instance;
for (i = 0; i < k; i++) p.instance = instances[i];

The grey bar shows the time needed for creating the objects on the fly:

for (i = 0; i < k; i++) instances[i] = new MyClass();

Here my result:

Caching a native flash Object can be almost 5x faster instead of creating it on the fly.

Almost the same applies to a slightly more complex object like the flash.geom.Point class.

Creating complex objects, here from the flash.display.Sprite class, is extremely slow and up
to 80x faster when pooling.

« Previous PageNext Page »

Proudly powered by WordPress Theme based upon Pool theme by Borja Fernandez.
Entries and comments feeds.