Fast math
Pretty soon after spinning his first cube, the 3D programmer realizes he needs a primitive library to handle all the math voodoo. The first primitive he probably wants is the vector. At the core a vector is just three values:
struct vector3f {
float x, y, z;
};
A very common thing to need to do to a vector is normalization. Normalizing a vector can be written as follows:
void normalize() {
float length = sqrt((x * x) + (y * y) + (z * z));
x /= length;
y /= length;
z /= length;
}
Hmm, looks like three multiplications, two additions, a square root, three divisions... a decently expensive function. If we want more speed we can use SSE. Basically SSE is a set of instructions and registers built into the CPU that operate on four floats at a time. For example, the _mm_mul_ps intrinsic gives us four multiplications with just one instruction, and _mm_div_ps similarly gives us four divisions. So while using SSE can be a pain since you are basically working in assembly, it can offer some serious speed improvements.
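To make that concrete, here is a rough sketch of the same normalize written with SSE intrinsics. The name normalize_sse and the zero-length check are my own additions for illustration, and this is not how the Sony library does it. A properly tuned version would keep the data in SSE registers instead of bouncing through a float array.

```cpp
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics: _mm_mul_ps, _mm_div_ps, ...

struct vector3f {
    float x, y, z;
};

// Hypothetical helper: normalize v in place using SSE.
// A zero-length vector is left unchanged to avoid dividing by zero.
inline void normalize_sse(vector3f &v) {
    // Pack x, y, z into one 128-bit register; the fourth lane is padding.
    // _mm_set_ps takes its arguments from the highest lane to the lowest.
    __m128 p = _mm_set_ps(0.0f, v.z, v.y, v.x);

    // One instruction squares all four lanes.
    __m128 sq = _mm_mul_ps(p, p);

    // Sum the three squares the simple way, through a temporary array.
    float lanes[4];
    _mm_storeu_ps(lanes, sq);
    float length_sq = lanes[0] + lanes[1] + lanes[2];
    if (length_sq == 0.0f) return;

    // Broadcast the length to every lane, then one instruction divides
    // all four lanes at once.
    __m128 len = _mm_set1_ps(std::sqrt(length_sq));
    _mm_storeu_ps(lanes, _mm_div_ps(p, len));

    v.x = lanes[0];
    v.y = lanes[1];
    v.z = lanes[2];
}
```

Calling normalize_sse on the vector (3, 0, 4) should give (0.6, 0, 0.8). For a single vector like this the packing and unpacking probably eats most of the gain. The payoff comes when the data stays in SSE registers across many operations.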
I think it's good for a programmer to write his own primitive routines first to better understand the math involved. Especially when it comes to something nasty like quaternions, I would never have understood them without manually coding each operation. However, at some point I thought, "Surely this has been done before, and by people who are much better programmers than me." Especially when it comes to topics like SSE, there is no way I could do it justice. This brought me to the Sony Vectormath library. This library has primarily been in use by the PlayStation 3 SDK, but was also donated to the Bullet Physics project for everyone to use.
I wrote a basic app to see just how much of a speed increase I might see in math heavy routines. It spins 4096 cubes on a random axis 60 times a second. The result looks pretty sweet, and I've included videos at the bottom of the post. Check out these numbers:
Sony w/ SSE : 0.000485
Sony w/o SSE : 0.000796
sinprim: 0.000837
The SSE lib runs about 1.7 times faster than my naive implementation! Without SSE, the Sony library is only about 5% faster than mine, so most of the win really is the SSE. Granted, most routines are not going to be so purposefully math heavy as this test, but the room for improvement is there. One thing to keep in mind is that an executable compiled with SSE can only run on a CPU that supports it. Personally I will use up to SSE2 until I run into trouble, since the Pentium 4 and pretty much everything after it support it.
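If you would rather not assume SSE2 is there, one option is to ask the CPU at runtime. The sketch below uses the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports, which only exist for x86 targets. The names cpu_has_sse2 and math_path_name are made up for this example.

```cpp
// Returns true if the CPU running this program reports SSE2 support.
// Relies on GCC/Clang builtins that are only available on x86 targets.
inline bool cpu_has_sse2() {
    __builtin_cpu_init();  // make sure the CPU feature data is populated
    return __builtin_cpu_supports("sse2") != 0;
}

// Hypothetical helper: name the math path a program would pick.
inline const char *math_path_name() {
    return cpu_has_sse2() ? "sse2" : "scalar";
}
```

The catch is that the check itself has to live in code compiled without SSE flags. Otherwise the compiler is free to emit SSE instructions before the check ever runs. In practice that means keeping the SSE routines in their own source file and building only that file with -msse2.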
You can check out the latest version of the Sony vector math library at the Bullet Physics svn:
svn co http://bullet.googlecode.com/svn/trunk/Extras/vectormathlibrary
My changes to the library to get it to work with GCC are here:
http://sinoth.net/code/vectormath_lib.zip
Now the fancy vids :)
We require more vespene gas
I've been playing way too much of the StarCraft II beta :) Luckily, I got whipped in the ladder badly enough to convince myself to give up for now. Still, it's enjoyable... I've always been a StarCraft fan, especially when I can play with friends. The engine and interface feel tight. Once you know your way around, you can issue orders insanely fast. Definitely a game to keep in mind when designing an interface.
Past few days I've spent working on my sinsocket class. Needed to extend it to allow easy compression/decompression of data. Also cleaned up a lot of the error checking since my Sculpt server is gonna need to be rock solid. I'm getting tired of just working on libraries, but I believe this is time well spent. Having a class available that allows easy networking is critical to me because I'm big on connectivity -- hell, I have an ethernet port tattooed on my wrist :P No matter how simple the project, if I think it'd be fun to add active or passive multiplayer elements, I want to be able to do it without much fuss.
I'm close to finishing my updater library, which will allow my projects to automatically patch with minimal effort on the user's part. It has been fun to work on but I'm ready to get an actual project out. All this behind the scenes stuff means very little feedback and external motivation. I'd really like to have a Sculpt release out by this weekend, but I'm not sure I'll have enough time. All depends on how soon I can get the updater running.
7DRL3 - Wrapping it up
Well, I've almost recovered from 7DRL madness. It was interesting having much more time to throw around... we easily spent 10+ hours each day on the project. The week of intense development will hopefully rub off on my coding habits for the next few months :P I posted a postmortem for the project here.
7DRL3 - Day 2 and 3
Day 2 was pretty uneventful. More thinking and planning than actual coding. In fact, I felt pretty bad about lack of progress after day 2. Interestingly enough, we didn't really get to coding until after some beer. I think maybe being too wired (coffee) can have a negative impact on productivity if you're still in the decision making phase. For example, I spent probably 4 hours reading, reading, and reading about different ways to do what we're doing, and what is more "standard". Wondering if we should use std::string instead of char[]. Exploring every alternative for the message system. Whether C++0x would make anything we're doing easier. Good reading, but time that could have been spent actually TRYING some of these methods. There is merit to careful planning, but at some point you get diminishing returns and you need to just jump in. Apparently beer does the trick ;)
I'm sorta bummed about our message system that different parts of the engine use to communicate. I hoped to minimize the amount of memory allocating/deallocating/copying we have to do manually, but the idea fell short and we're back to the same old memcpy fun. If I have time (haha) I'll revisit the message system and attempt to make it less hideous.
Day 3 was a lot better. We have most of the 'major' decisions made and a good framework to deal with. We ALMOST have a basic battle up. Hopefully tomorrow we get a battle going and can have some screenshots to post :)

