Wednesday, September 16, 2009

sizeof(char)

What sizeof(char) evaluates to, and why, does not seem to be common knowledge, so I will try to cover it here.

If you're new to C, you may be assuming that a char is always 8 bits, and that sizeof(char) is thus always 1. It is correct that sizeof(char) is always equal to 1, but for a different reason, and a char may be bigger than 8 bits.

If you're an intermediate C programmer, you may already know that a char can be bigger than 8 bits, and therefore write, for example, malloc((n+1) * sizeof(char)). The multiplication is redundant, however, because, as I stated above, sizeof(char) is always equal to 1. Why?
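To make the point about the redundant multiplication concrete, here is a minimal sketch of a string copy that allocates n + 1 chars directly. The function name copy_string is my own invention for illustration; it is not a standard library function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (the name is an assumption, not a standard function):
   returns a heap-allocated copy of s, or NULL if the allocation fails. */
char *copy_string(const char *s)
{
    size_t n = strlen(s);

    /* n characters plus the terminating '\0'.
       No "* sizeof(char)" is needed, since sizeof(char) is 1 by definition. */
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, n + 1);
    return copy;
}
```

The caller owns the returned buffer and should pass it to free() when done.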

The reason is that sizeof yields sizes in units of the size of a char; that is, sizeof(char) must be equal to 1 because, well, a char is as big as a char. This also means that the size in bits of every data type (including any padding) is a multiple of CHAR_BIT (a macro, defined in <limits.h>, which expands to the number of bits in a char).
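As a rough illustration of the relationship between sizeof and CHAR_BIT, the sketch below converts a sizeof result into a bit count. The helper name size_in_bits is hypothetical and only exists for this example.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: converts a size measured in chars
   (which is what sizeof yields) into a size measured in bits. */
size_t size_in_bits(size_t size_in_chars)
{
    return size_in_chars * CHAR_BIT;
}
```

For example, size_in_bits(sizeof(char)) is CHAR_BIT on every implementation, while size_in_bits(sizeof(int)) depends on the platform but is always a multiple of CHAR_BIT.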

And as you might have guessed, the sizes passed to malloc et al. are not counted in units of 8 bits; they are counted in units of sizeof(char).
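For types other than char, the multiplication does matter, because malloc still counts in chars. Here is a small sketch, assuming a made-up helper called alloc_ints, that requests room for a number of ints; the overflow check is my own addition and not something the argument above depends on.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates room for count ints, or returns NULL
   if the request is too large or the allocation fails. */
int *alloc_ints(size_t count)
{
    /* Refuse requests whose size in chars would overflow size_t. */
    if (count > (size_t)-1 / sizeof(int))
        return NULL;

    /* sizeof(int) is the number of chars in an int, so the product is
       the request expressed in the unit malloc expects. */
    return malloc(count * sizeof(int));
}
```

The returned pointer should be released with free() once it is no longer needed.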

In the standard, a char is defined to occupy exactly one byte. This means that a byte, in C's sense of the word, can be bigger than 8 bits: the standard only requires CHAR_BIT to be at least 8. There are examples of C compilers which define a char, and thus a byte, as 9, 16, 32 or 36 bits.