# Thread: what's wrong with this code?

1. ## what's wrong with this code?

Hi all,
I wanted to write a simple function to count the set bits in an integer:

Code:
```
#include <stdio.h>

int countSetBits(int bitSet){
    int counter = 0;
    while(bitSet > 0){
        if(bitSet & 1){
            counter++;
            bitSet >>= 1;
        }
    }
    return counter;
}

int main(){
    int x = 64;
    int y = countSetBits(x);
    printf("%d", y);
    return 0;
}
```
Seems legit to me... but for some reason, with some integers (like 64, 10...) I get stuck in an infinite loop.
(With the value 7 it works... probably because it's all 1's.)

So i wrote it like this:
Code:
```
int countSetBits(int bitSet){
    int i;
    for(i = 0; bitSet; bitSet >>= 1){
        i += bitSet & 1;
    }
    return i;
}
```
And now it works! What's wrong with the first version?
(It's probably something really stupid, but I just can't see it.)
Thx.

2. In the first version, bitSet is being shifted only if the right-most bit is a 1. Thus as soon as it's not, you'll just test the same value again, ad infinitum. You want to shift each iteration unconditionally, to ensure you check each bit position no more than once.

3. Yep... if I had waited another minute before writing this post I would probably have seen it myself. So stupid of me...

4. Another cool way to do this is to keep the variable in place, and just shift a 1 over and check it, with a for loop making the number of shifts less each time. This way the variable you are checking doesn't need to be pushed off the cliff:

Code:
```
int binary(const int val)
{
    int num1 = 0;

    for(int i = ((sizeof(val) * 8) - 1); i >= 0; i--){
        if(val & (1 << i)){
            printf("1");
            ++num1;
        }
        else
            printf("0");
    }
    putchar('\n');
    return num1;
}
```

5. Alpo: That code is not very portable and it's relying on undefined behaviour. sizeof(val) * 8 is not guaranteed to give you the number of value bits in an int since a byte could consist of more than 8 bits and integers are not required to use all the bits for value representation either. This part will most likely result in a signed overflow which is undefined behaviour:
Code:
`1 << i`
The standard says the following about left shifting signed integers:
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
Assuming that your non-portable assumption about the number of value bits in an int is correct, your code will result in the sign bit being set, which leads to an overflow. In general, you shouldn't be left shifting signed integers at all since it's hard to do it safely.
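One way to sidestep the issue is to do all the shifting on an unsigned operand. A sketch (`countSetBitsMask` is a made-up name for illustration):

```c
/* Sketch: convert to unsigned first, then walk an unsigned mask.
 * Shifting an unsigned value is always well defined: when the mask's
 * top bit is shifted out, the result wraps to 0 and the loop ends. */
int countSetBitsMask(int val)
{
    unsigned int u = (unsigned int)val;  /* well-defined conversion */
    int count = 0;
    for (unsigned int mask = 1u; mask != 0u; mask <<= 1) {
        if (u & mask)
            count++;
    }
    return count;
}
```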

6. Assuming that your non-portable assumption about the number of a value bits in an int is correct, then your code will result in the sign bit being set which leads to an overflow.
O_o

The assumption that `1 << ((sizeof(type) * 8) - 1)' is the "sign bit" is itself not a portable assumption.

Actually, the assumption that `int' has a "sign bit" is not a portable assumption.

@Alpo: You should look into `CHAR_BIT' from the "limits.h"/"climits" header.
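For instance, a sketch of post 4's function with CHAR_BIT in place of the hard-coded 8, and an unsigned mask to avoid the signed-shift problem (this still assumes int has no padding bits):

```c
#include <limits.h>  /* CHAR_BIT */
#include <stdio.h>

/* Post 4's loop, reworked: CHAR_BIT replaces the literal 8, and the
 * mask is built from 1u so no signed value is ever left-shifted.
 * Still assumes int has no padding bits. */
int binary(const int val)
{
    unsigned int u = (unsigned int)val;
    int num1 = 0;

    for (int i = (int)(sizeof(val) * CHAR_BIT) - 1; i >= 0; i--) {
        if (u & (1u << i)) {
            printf("1");
            ++num1;
        } else {
            printf("0");
        }
    }
    putchar('\n');
    return num1;
}
```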

Soma

7. Originally Posted by phantomotap
The assumption that `1 << ((sizeof(type) * 8) - 1)' is the "sign bit" is itself not a portable assumption.
Where else would it be if we know that the number of value bits is sizeof(type) * 8 - 1? Can you give an example of that?

Originally Posted by phantomotap
Actually, the assumption that `int' has a "sign bit" is not a portable assumption.
That's true but you should always assume the worst unless you know for sure what the implementation looks like. The worst case scenario also happens to be what most popular implementations do.

Originally Posted by phantomotap
@Alpo: You should look into `CHAR_BIT' from the "limits.h"/"climits" header.
How can you use CHAR_BIT to portably figure out the number of value bits in an int? What about padding bits etc? As far as I know CHAR_BIT doesn't tell us anything about how an int is represented.

8. Originally Posted by phantomotap
Actually, the assumption that `int' has a "sign bit" is not a portable assumption.
I think it is a pretty reasonable assumption though:
Originally Posted by C99 Clause 6.2.6.2 Paragraph 2a,b
For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit.

9. Originally Posted by phantomotap
The assumption that `1 << ((sizeof(type) * 8) - 1)' is the "sign bit" is itself not a portable assumption.
Are you saying that a representation like this would be valid (the x's represent padding or parity bits, the 1's represent value bits, and the S is the sign bit)?
xxxxxxxx 11111111 (unsigned version)
Sxxxxxxx x1111111 (signed version)

I guess it doesn't really matter if 1 << ((sizeof(type) * 8) - 1) accesses the sign bit or not since we're already at the point of undefined behaviour here since the value of that computation "isn't representable in the result type".

10. How can you use CHAR_BIT to portably figure out the number of value bits in an int? What about padding bits etc? As far as I know CHAR_BIT doesn't tell us anything about how an int is represented.
O_o

As I've told you several times, looking solely at the standard is a pointless exercise.

The standard is an invaluable tool for many, but you must also account for the "real world".

I think it is a pretty reasonable assumption though:
As I've said several times, assuming that the environment has an 8 bit byte is perfectly reasonable. (Actually, you've also made that exact comment.) If you are going to take one perfectly reasonable assumption related to the portability of platform characteristics off the table, you can't very well rely on a different one.

Sure, the standard requires "exactly one sign bit" for signed integers. However, the standard doesn't say where the "exactly one sign bit" should live. In the "real world", we assume the "sign bit" lives in the "MSB" because the vast majority of hardware conforms to the expectation.

I see no issue with assuming an 8 bit byte in the "real world" because the overwhelming majority of hardware conforms to the expectation, and noting those common environments doesn't even begin to address the massive number of protocols, formats, and related standards that also depend on an 8 bit byte.

Soma

11. Originally Posted by phantomotap
As I've told you several times, looking solely at the standard is a pointless exercise.

The standard is an invaluable tool for many, but you must also account for the "real world".
Even if you ultimately decide to make certain assumptions about your target platform then it certainly doesn't hurt to understand what the standard actually says about what you're doing. Looking at what the standard says as a strict learning exercise is not pointless in my opinion.

12. Related standard question.
I have found that thinking "char" is always "signed" or "unsigned" leads to writing non-portable code.
(In other words, I try to place "signed" or "unsigned" in front of "char" in all the places I can.)

Is the same true for "int"? I was thinking it always defaults to "signed" for standard C compilers; is this true?

Tim S.

13. I have been bitten once or twice in the past by assuming "char" was signed by default.

If I am interpreting the C99 draft standard correctly, then it seems that whether plain "char" defaults to signed or unsigned is implementation defined, whereas plain "int" defaults to signed.

Regarding the implementation defined type of plain "char":

6.2.5 Types

15 - The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.
J.3 Implementation-defined behavior

Which of signed char or unsigned char has the same range, representation, and behavior as ‘‘plain’’ char (6.2.5, 6.3.1.1)
Regarding plain "int":

6.2.5 Types

4 - There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int. (These and other types may be designated in several additional ways, as described in 6.7.2.) There may also be implementation-defined extended signed integer types.28) The standard and extended signed integer types are collectively called signed integer types.29)
6.7.2 Type Specifiers

2 - At least one type specifier shall be given in the declaration specifiers in each declaration, and in the specifier-qualifier list in each struct declaration and type name. Each list of type specifiers shall be one of the following sets (delimited by commas, when there is more than one set on a line); the type specifiers may occur in any order, possibly intermixed with the other declaration specifiers.

— void
— char
— signed char
— unsigned char
— short, signed short, short int, or signed short int
— unsigned short, or unsigned short int
— int, signed, or signed int
— unsigned, or unsigned int
— long, signed long, long int, or signed long int
— unsigned long, or unsigned long int
— long long, signed long long, long long int, or signed long long int
— unsigned long long, or unsigned long long int
— float
— double
— long double
— _Bool
— float _Complex
— double _Complex
— long double _Complex
— struct or union specifier
— enum specifier
— typedef name
(source)

14. O_o

I have the same interpretation as Matticus.

Soma

15. I would have made the parameter unsigned and avoided the whole question.

Right shifting a negative signed int is implementation defined, but it usually replicates the sign bit (an arithmetic shift), which means that if you feed this algorithm a negative number the loop will never terminate, since the arithmetic right shift of a negative value never reaches zero (it converges to -1)...

The question of which bit is the sign bit, where the sign bit is, or even whether there is a sign bit at all, is irrelevant to the goal of counting bits, so it was a mistake in the first place to use a signed int.
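A sketch of that suggestion, with the parameter made unsigned so the shift is fully defined and the loop must terminate:

```c
/* Unsigned version: a negative int argument converts to a large
 * unsigned value (modulo UINT_MAX + 1), the right shift is a
 * logical shift, and the value provably reaches zero. */
unsigned countSetBits(unsigned bitSet)
{
    unsigned counter = 0u;
    while (bitSet != 0u) {
        counter += bitSet & 1u;  /* add the low bit */
        bitSet >>= 1;            /* logical shift: no sign replication */
    }
    return counter;
}
```

Calling countSetBits(-1) is then well defined: -1 converts to UINT_MAX and the function returns the total number of bits in an unsigned int.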