Storage Informer
Storage Informer

Highlights and Challenges During Ghostbusters Development, Part III

by on Jun.30, 2009, under Storage

Highlights and Challenges During Ghostbusters Development, Part III

Game Optimization Challenges for Modern Hardware

Although we seem to have hit a ~3GHz limit in processor speed, Moore&aposs law may still be holding as more and more cores are added to a processor at this speed. As processors have gotten faster and faster, memory latency has gotten longer and longer over time. This means that understanding how the architecture you are working on accesses memory is critical to the execution speed of your program, and for the first time, maybe even more important than the algorithms themselves. Let me give you an example…

For collision detection with actors that move in the world, every level in Ghostbusters (and the Infernal Engine) is partitioned with a BSP tree. Each node in the tree contains a linked list of actor pointers that are in that node. Note that actors may not be in the leaves of the tree, but may be pushed up until one node fully contains the actor.

Here is a simplifed example of what our world and actors looked like in previous generations of code…

class CActor {
.
.
.
CVector wPos;
float wRadius;
.
.
.
CActor *nextActorInBSP;
.
.
.
};

struct SBSPNode {
float a,b,c,d;
SBSPNode *left,*right;
CActor *firstActorInBSP;
};

void raytraceActorsInNode(SBSPNode *node,CVector *wRayStart,CVector *wRayEnd) {
CActor *a = node->firstActorInBSP;
while (a != 0) {
wPos,a->wRadius,wRayStart,wRayEnd);]]>
a = a->nextActorInBsp;
}
}

So this is a very simple code example running through a linked list of actors and performing a simplified raytrace on them. The problem with this code though is execution speed. While this type of code worked well in previous years, it does not peform well on modern hardware due to the potential of cache misses running through a linked list. Certain hardware can have over a 500 clock tick penalty for a L2 cache miss and a 37 clock penalty for a L1 miss. Neither is acceptable if you want to run your code at 3GHz. To insure we run at maximum speed, we rearrange our structures as follows:

New BSP structure:

struct SActorPosRad {
CActor *actor;
CVector wPos;
float wRadius;
};

struct SBSPNode {
float a,b,c,d;
SBSPNode *left,*right;
Array actorList;
};

void raytraceActorsInNode(SBSPNode *node,CVector *wRayStart,CVector *wRayEnd) {
int i;
for (i =0; i wPos,a->wRadius,wRayStart,wRayEnd);
}
}

The array template is a very simple dynamic memory allocator. As long as actors don't move very far in one frame of a game, using a template array here mostly doesn't change from frame to frame. Insertion and deletion from the array is treated as a linear list, and the array is preallocated to a minimum size, so that allocation (and deallocation) will almost never happen. This could even be a static sized array, with actors overflowing this node pushed up in the tree.

This is a case where we will have to unlearn what has been taught in computer science classes for decades – the use of a linked list – may not be the fastest way to store data in memory anymore. We're violating multiple rules we've been taught – not using a linked list, and using a linear search for inserting and deleting actors in this list. Linear searches are slow, right?

Linear searches (if used properly) can be extremely fast due to the memory access pattern. The cache line on a modern system is anywhere from 64 bytes to 128 bytes in size. That means that to read any memory location in that vicinity, the memory around that will also have to be read in. So you have free full speed access to the data around you as well. If you are constantly hopping around in memory such as a linked list does, you will be stalled while the memory around your pointer is brought in as it is dereferenced.

So while linear searches may not be applicable for all types of data, they can certainly be used for tight game code.

:
, , , , , , , , ,

Leave a Reply

Powered by WP Hashcash

Looking for something?

Use the form below to search the site:

Still not finding what you're looking for? Drop a comment on a post or contact us so we can take care of it!

Visit our friends!

A few highly recommended friends...