Wednesday, September 16, 2009

sizeof(char)

It does not seem to be common knowledge what sizeof(char) returns and why. So I will try to cover that.

If you're new to C, you may be assuming that a char is always 8 bits, and that sizeof(char) is thus always 1. It is correct that sizeof(char) is always equal to 1, but for another reason, and a char may be bigger than 8 bits.

If you're an intermediate C programmer, you may know that a char can be bigger than 8 bits, and therefore write, for example, malloc((n+1) * sizeof(char)). The multiplication is redundant, however, because as I stated above, sizeof(char) is always equal to 1. Why?

The reason is that sizeof yields sizes in units of the size of a char; that is, sizeof(char) must be equal to 1 because, well, a char is as big as a char. This also means that the size in bits of every datatype (including any padding) must be a multiple of CHAR_BIT (a macro, defined in limits.h, which expands to the number of bits in a char).

And as you might have guessed, the sizes passed to malloc et al are not in multiples of 8 bits, they are in units of sizeof(char).

In the standard, a char is defined to be the same size as a byte. This means that a byte can be bigger than 8 bits. There are examples of C compilers which define a char, and thus a byte, as 9, 16, 32 or 36 bits.
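
To make this concrete, here is a small sketch. I've written it as C++ so the claims can be checked at compile time; the same holds in C. The names bits_in and copy_string are made up for the illustration, they are not standard functions.

```cpp
#include <cassert>
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy, std::strlen

// sizeof(char) is 1 by definition on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");

// A char (and thus a byte) has at least 8 bits, but may have more.
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

// Hypothetical helper: size of T in bits. sizeof counts chars,
// and every char has CHAR_BIT bits.
template <typename T>
constexpr std::size_t bits_in()
{
    return sizeof(T) * CHAR_BIT;
}

// Hypothetical helper: duplicate a C string. malloc takes a size in
// units of char, so no "* sizeof(char)" is needed.
char *copy_string(const char *s)
{
    assert(s != nullptr);
    std::size_t n = std::strlen(s);
    char *copy = static_cast<char *>(std::malloc(n + 1));
    if (copy != nullptr)
        std::memcpy(copy, s, n + 1);   // n characters plus the '\0'
    return copy;                       // caller must free()
}
```

On a machine with 8-bit chars bits_in<char>() is 8, but the code makes no such assumption anywhere.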

Monday, August 17, 2009

Splitting a directory based on first character in filename

This is a small bash script I whipped up:

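#!/bin/bash

# Number of files to collect before a directory is created.
limit=26

# Move all files whose names start with one of the selected initials
# (upper or lower case) into a directory named after the range.
function do_move
{
    if [ "x$selected" != "x" ]; then
        if [ "$first" == "$last" ]; then
            dir="$first"
        else
            dir="$first-$last"
        fi
        mkdir "$dir"
        for j in $selected; do
            j2=`echo "$j" | tr "[:upper:]" "[:lower:]"`
            for k in $j* $j2*; do
                # Skip the target directory itself and anything that is not a regular file.
                if [ "$k" != "$dir" -a -f "$k" ]; then
                    mv "$k" "$dir/"
                fi
            done
        done
    fi
}

# Sorted list of the distinct (upper-cased) first characters of all names.
initials=`for i in *; do echo "$i" | tr "[:lower:]" "[:upper:]" | sed "s/^\(.\).*/\1/"; done | sort | uniq`

cur=0
first=""
selected=""

# Greedily collect initials until at least $limit files are covered, then move them.
for i in $initials; do
    if [ "$first" == "" ]; then
        first=$i
    fi
    selected="$selected $i"
    numfiles=`ls|grep -i "^$i\+"|wc -l`
    (( cur = cur + $numfiles ))
    last=$i
    if [ $cur -ge $limit ]; then
        do_move
        cur=0
        first=""
        selected=""
    fi
done

# Move whatever is left over.
do_move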


It splits a directory into smaller directories based on the first character in the filename. An example:

$ for i in `seq 1 1000`; do touch `pwgen -n -c 10 1`; done
$ dirsplit
$ ls
A B-C D-E F-G H-I J-K L-M N-O P Q-S T U V-W X-Y Z


As you can see, it merges several initials into one range directory when together they have few enough files.
$limit as defined at the top of the script controls the number of files that should be exceeded for the script to deem it necessary to create another directory; that is, $limit is not a hard limit, there is no hard limit, and there can't be, unless you start splitting on more than the first character. In any case, adjust as necessary.
The algorithm currently used is greedy and likely quite non-optimal, but it works for me.
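
To show just the greedy grouping without the file handling, here is a sketch in C++. It assumes the per-initial file counts are already known and sorted; the names Counts, range_name and group_initials are made up for this illustration and are not part of the script.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One (initial, number of files) pair per distinct first character, sorted.
typedef std::vector<std::pair<char, int> > Counts;

// "A" for a single initial, "B-C" for a range.
static std::string range_name(char first, char last)
{
    std::string name(1, first);
    if (first != last) {
        name += '-';
        name += last;
    }
    return name;
}

// Greedy grouping: keep adding initials to the current group until it
// covers at least `limit` files, then close it and start a new one.
std::vector<std::string> group_initials(const Counts &counts, int limit)
{
    assert(limit > 0);
    std::vector<std::string> dirs;
    bool open = false;          // is there a group being filled?
    char first = 0, last = 0;
    int cur = 0;                // files in the current group
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (!open) {
            first = counts[i].first;
            open = true;
        }
        last = counts[i].first;
        cur += counts[i].second;
        if (cur >= limit) {
            dirs.push_back(range_name(first, last));
            open = false;
            cur = 0;
        }
    }
    if (open)                   // leftover initials form the final group
        dirs.push_back(range_name(first, last));
    return dirs;
}
```

A single initial with more files than the limit still gets one directory of its own, which is why the limit cannot be a hard one.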

The script was made to be able to split directories with many files on the CF card of my NES PowerPak (which is a fabulous creation, but that's another story).

Friday, August 14, 2009

Nightfall and selecting multiplayer levels over network

I just finished implementing support for calculating SHA1 sums of a level for Nightfall. Also, it can look up a level by SHA1 sum.
When a client joins a server, it is sent the SHA1 sum of the level the server loaded, and it can then find the appropriate level to load.
You may ask why I didn't just send the file name of the level. The reason is that if the client's and the server's copies of a level have the same file name but do not fully match, the client will likely go out of sync with the server sooner or later. There's no other way of detecting it.

This may need a short explanation of the networking protocol in Nightfall:
Every single small movement of a unit is not sent over the network. Instead, the overall commands the unit gets and the paths that are calculated for it are sent, and every client then calculates what small per-frame moves are necessary to achieve this goal.
Also, the map and other properties of the level are not sent over the network. In the future this may change, so that you can join a game which you do not have the level of, and your client would automatically download the level.

The main reason things are done like this is to reduce network traffic as much as possible, and thus Nightfall uses only a few kB/s of network traffic per client.

If a client and a server get out of sync, currently the game just quits. In the future, the server could perhaps save the game as it has it, and then send the save to the clients for them to load it.
However, if the reason the game got out of sync was that the levels differ, this will obviously not solve the problem, and it may even trigger an infinite loop of re-syncs. Failing in this way is very user-unfriendly.
So that is why I implemented a way of making sure that the clients load the same level as the server did.
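
The lookup side of this could look roughly like the sketch below. This is not Nightfall's actual code: LevelIndex, find_level and the idea of an index built by hashing every local level file up front are assumptions made for the illustration, and computing the SHA1 sum itself is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical index: hex SHA1 sum of a level file -> local path of that file.
typedef std::map<std::string, std::string> LevelIndex;

// Look up the level whose SHA1 sum the server announced.
// Returns false if no local level has exactly that content.
bool find_level(const LevelIndex &index, const std::string &sha1_hex,
                std::string &path_out)
{
    assert(sha1_hex.size() == 40);   // 160 bits written as 40 hex digits
    LevelIndex::const_iterator it = index.find(sha1_hex);
    if (it == index.end())
        return false;
    path_out = it->second;
    return true;
}
```

Because the key is a hash of the content, two levels that share a file name but differ in content end up under different keys, so a client can never silently load the wrong one.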

More PHP oddities

Apparently, true prints as 1, while false prints as the empty string.

Also, empty() considers both the string "0" and the number 0 to be empty.

Five lectures on the acoustics of the piano

Pretty interesting stuff:

http://www.speech.kth.se/music/5_lectures/contents.html

Tuesday, July 28, 2009

LVM First Encounter: Fatal Mistake

Don't put / or /boot on LVM. It won't work.

Sunday, July 26, 2009

Local "DoS" against the NVidia Linux driver

Because of a bug, I found out that looping on glClear causes very painful results with the NVidia Linux driver. It makes the desktop unusable, updating only every 2 seconds, and no events will reach any program. The only way to get out of it is to switch to a tty; then events seem to be delivered, and the program can be exited. X crashed once when I switched to a tty, though. Try it at your own risk.

EDIT: Warning: Seems that I didn't fix the bug in the program, so I got into the same situation again (however, my program runs a few gl statements in between every glClear) and the machine became so unresponsive that I couldn't even switch to a tty (had no second computer to ssh in with, that would likely have solved it). I repeat, run it at your own risk.

Here's a test-case:
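#include <SDL/SDL.h>
#include <GL/gl.h>

int main(int argc, char *argv[])
{
    SDL_Surface *screen;

    SDL_Init(SDL_INIT_VIDEO);

    /* SDL_GL_DOUBLEBUFFER is a GL attribute, not a video mode flag,
       so it is set here instead of being OR'ed into the flags. */
    SDL_GL_SetAttribute(SDL_GL_DOUBLEBUFFER, 1);

    screen = SDL_SetVideoMode(640, 480, 0, SDL_OPENGL);

    glMatrixMode(GL_MODELVIEW);

    /* Clear forever, without ever swapping buffers. */
    while (1)
    {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
    }

    SDL_Quit(); /* never reached */

    return 0;
}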

Note that I am running a rather old version of the NVidia Linux driver, 173.14.09; they could very well have fixed this in newer versions.

Definition of brainlessness

1. glClear
2. glLoadIdentity
3. Switch to ortho projection
4. glClear
5. glLoadIdentity
6. Render everything
7. Swap buffers
8. Restore to viewport that was before switching to ortho
9. Swap buffers

Tuesday, July 14, 2009

Pocoproject

I stumbled upon this today, while looking for a platform-independent library to handle zip files:

http://pocoproject.org/

It appears to have quite a few more things that would be handy, like an XML DOM implementation, crypto routines (there must be hashing algos in there, right? I want those!), a threading library and networking support.

I'll take a look at this; it's possible that I will use it in a long-running project of mine: Nightfall, a free RTS game. See http://nightfall-rts.org/.

I'm currently using SDL and SDL_net for threading and networking, respectively, and I have my own tiny XML library (as I didn't want to drag in extra dependencies). SDL threading is not very advanced, and SDL_net doesn't support IPv6. The small XML library is quite limited.

Sunday, July 12, 2009

PHP and fgets()

In an attempt to mimic the behaviour of fgets in C, it appears that PHP defines the $length argument so that at most $length - 1 bytes are read.
The size argument to fgets in C includes room for the terminating '\0', but PHP really has no reason to care about that issue.
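
For reference, the C behaviour being mimicked can be seen with the following sketch (written as C++, but it uses only the C stdio functions). The helper chars_read_by_fgets is made up for the illustration, and it assumes a temporary file can be created.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: write `text` to a temporary file, read it back
// with fgets using a buffer size of `size`, and return how many
// characters fgets actually stored (not counting the '\0').
std::size_t chars_read_by_fgets(const char *text, int size)
{
    char buf[64];
    assert(size > 0 && size <= static_cast<int>(sizeof buf));

    std::FILE *f = std::tmpfile();
    if (f == nullptr)
        return 0;                    // could not create a temporary file

    std::fputs(text, f);
    std::rewind(f);

    // fgets stores at most size - 1 characters, then appends '\0'.
    if (std::fgets(buf, size, f) == nullptr)
        buf[0] = '\0';

    std::fclose(f);
    return std::strlen(buf);
}
```

With the text "hello world" and a size of 6, only "hello" is stored: five characters plus the terminator.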

Friday, July 10, 2009

Best slogan ever

http://www.fckeditor.net/: "FCKEditor - The text editor for Internet"

"Look, ma, I'm editing the Internet!"

Sunday, June 28, 2009

GDB 7 will bring major improvements

According to GDB and Debian developer Daniel Jacobowitz, GDB 7 will bring major improvements. The ones mentioned are:
  • Support for understanding inlined functions -- that is, you will actually be able to step in, through and out of them, and get backtraces that include inlined functions!
  • Support for pretty-printing STL containers etc!
  • Python scripting.
So it appears that this release will be a great one. It's slated for this fall.

Oh, and on another sidenote -- I must promote cgdb again; it simplifies debugging with gdb a lot. It will also benefit from these improvements, though it may need a feature or two to make the new features easier to use (Python scripting, perhaps?). Sadly, upstream of cgdb appears to be quite dead; no release since 2007.

Function returning function pointer: funny syntax

Take a look at this:

void (*f())(void*);

At first glance, it may look like some kind of special function that returns void and takes a void*. That's totally wrong. It's actually a function that takes zero arguments and returns a function pointer, a void (*)(void*) -- that is, a pointer to a function that returns void and takes a void*. The syntax actually makes a little sense if you consider the syntax for defining a variable containing such a function pointer:

void (*f)(void*);

Typedefs of function pointers are weird too:

typedef void (*ptrFunc)(void*);

But I can't help thinking that the following syntax would be much clearer:

void (*)(void*) f(); // invented syntax
void (*)(void*) f;
typedef void (*)(void*) ptrFunc;

These would make a lot more sense than the current function pointer syntax in C/C++, and would 'logically match' other declarations -- [return] type followed by an identifier (followed by argument list if it is a function), instead of mixing [return] type, identifier and argument list into one big mess.
It may be that the invented syntax is harder to parse -- I have thought a while on this, but haven't come to a reliable conclusion. The syntax may just be a historical artifact.

In any case, this odd syntax must be the reason why you usually define typedefs for function pointers, as done above. Then we can use the following syntax:

ptrFunc f();
ptrFunc f;

Much cleaner and more understandable -- the obvious drawback is: what's a ptrFunc? You'll have to find the typedef to know. But I think that's worth it.
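
Here are the two spellings side by side, as a sketch. The callback increment and the names get_callback_raw and get_callback are made up for the illustration; only ptrFunc comes from the text above.

```cpp
#include <cassert>

// The typedef from above: pointer to a function taking void* and returning void.
typedef void (*ptrFunc)(void*);

// Hypothetical callback with the matching signature:
// increments the int that p points to.
void increment(void *p)
{
    assert(p != nullptr);
    ++*static_cast<int *>(p);
}

// Raw syntax: a function taking no arguments and returning void (*)(void*).
void (*get_callback_raw())(void*)
{
    return increment;
}

// Exactly the same type of function, declared with the typedef.
ptrFunc get_callback()
{
    return increment;
}
```

Both functions are called the same way, for example get_callback()(&x) with an int x, and both return the same pointer.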

Surprise with std::map::operator[] and thread safety

A year ago or so, I got a surprise using std::map::operator[] in a multithreaded program -- the program kept crashing while subscripting the map, even though I was just reading and not writing -- or so I thought!

I looked it up today while peeking at my old code, and discovered that std::map::operator[] inserts a default-constructed value for the given key if it does not exist -- which surely explains the crashes I got. The reason I was able to use it like this was that I was using a std::map to record valid pointers -- operator[] would thus return false (the default value) for pointers that weren't keys in the map, while silently inserting them.

Lesson learnt? Use std::map::find() instead!
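
A sketch of the difference, assuming a map from pointer to bool like the one described above. The names PointerMap, is_valid_subscript, is_valid_find and demo are made up for the illustration.

```cpp
#include <cassert>
#include <map>

// Assumed layout: the pointer is the key, the bool says whether it is valid.
typedef std::map<const void *, bool> PointerMap;

// Looks like a read, but operator[] inserts a value-initialized bool
// (false) when the key is missing, so it needs a non-const map.
bool is_valid_subscript(PointerMap &m, const void *p)
{
    return m[p];
}

// A real read: find() never modifies the map and works on a const map.
bool is_valid_find(const PointerMap &m, const void *p)
{
    PointerMap::const_iterator it = m.find(p);
    return it != m.end() && it->second;
}

// Shows the side effect of each variant on an empty map.
void demo()
{
    PointerMap m;
    int x = 0;

    assert(!is_valid_find(m, &x));
    assert(m.size() == 0);          // find() left the map untouched

    assert(!is_valid_subscript(m, &x));
    assert(m.size() == 1);          // operator[] inserted &x -> false
}
```

Both variants return false for an unknown pointer, which is why the bug is easy to miss. Only the find() version is safe to call from several threads at once, and only as long as no thread modifies the map at the same time.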

Hello again!

So I, umh, decided to start blogging again. It's been a while.