# Thread: Question related to scanf's implementation.

1. ## Question related to scanf's implementation.

Code:
`scanf("%d", &my_integer);`
How does scanf convert the characters from stdin into an int (in this case, my_integer)? I know it's possible to do this:

Code:
```
//Assume that read_a_string_from(stdin) exists.
//Also assume that my_string contains nothing but a positive number; this code won't work properly for negative numbers.
int result = 0;
size_t i;
for (i = 0; my_string[i] != '\0'; ++i) {
    //Subtract '0' to turn the character code into the digit's value.
    int current_digit = my_string[i] - '0';
    result = result*10 + current_digit;
}
return result;
```
Is this what scanf actually does? If not, how is it implemented to handle integers?

Thank you in advance... And I hope this makes sense.

2. You could always grab the source code for glibc, the GNU C library implementation.

But you're partly on the right track.

3. Code:
`result *= 10 + current_digit;`
That's the same as
Code:
`result = result * (10 + current_digit);`
But you probably want
Code:
`result = (result * 10) + current_digit;`
So perhaps
Code:
```
result *= 10;
result += current_digit;
```
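To make the difference concrete, here is a small sketch. The two function names are made up for this illustration and come from nowhere else in the thread; each one appends a single decimal digit to a running result.

```c
/* Hypothetical helpers, named only for this illustration. */

/* What `result *= 10 + digit;` actually computes. */
int append_digit_wrong(int result, int digit)
{
    result *= 10 + digit;          /* parsed as result = result * (10 + digit) */
    return result;
}

/* What was intended: shift one decimal place left, then add the digit. */
int append_digit_right(int result, int digit)
{
    result = result * 10 + digit;
    return result;
}
```

Starting from 4 and appending the digit 2, the first version gives 4 * 12 = 48, while the second gives the expected 42.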
I'm sure if you searched around you'd find lots of code that converts strings to numbers. Here's a clone of atof() I wrote recently: http://cboard.cprogramming.com/showp...5&postcount=17
Note that I did not test that code, and that instead of converting integers it converts floating-point numbers written in plain decimal notation (I think that's what it's called . . .), not exponent notation such as 1.3E-4. It's not the best example, I just posted it because it's fresh in my memory.

4. Originally Posted by dwks
Code:
`result = result * (10 + current_digit);`
Right. My mistake. Fixed.

Originally Posted by dwks
I'm sure if you searched around you'd find lots of code that converts strings to numbers.
I'm aware of that. However, I was interested in scanf's code in particular. It's just out of curiosity.

I'm currently looking at glibc. scanf's code seems pretty complicated. But I'll do my best to figure it out. Thanks.

5. Unfortunately, glibc's implementation of scanf isn't entirely straightforward. scanf itself is trivial: it just does some work to get hold of the arguments and then calls the generic version with stdin as one of the parameters. The generic code, however, is definitely not straightforward, and it isn't helped by being full of #if directives that exclude some parts from compilation depending on which version of scanf is being built (e.g. wide-char or "narrow" char), as well as various flags to support different compilers, OS architectures and so on.

The source is here:
http://sourceware.org/cgi-bin/cvsweb...&cvsroot=glibc

And the bit that actually converts the number for %d, etc, looks like this:
Code:
```
{
    if (flags & NUMBER_SIGNED)
        num.l = __strtol_internal (wp, &tw, base, flags & GROUP);
    else
        num.ul = __strtoul_internal (wp, &tw, base, flags & GROUP);
}
```
But that is many many lines after the bit of code that says "is this %d", and the code in between is related to the input and conversion of the number too.
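As a rough picture of that overall shape (collect the characters that can belong to the number, then hand them to strtol), here is a minimal sketch. It is not glibc's code: `read_decimal` is a hypothetical name, the 64-byte buffer is an arbitrary choice, and grouping, wide characters and field widths are all ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of glibc: read one decimal integer from
   `in` roughly the way a "%d" conversion does. Returns 1 on success and
   0 if no number could be read. */
int read_decimal(FILE *in, long *out)
{
    char buf[64];                  /* arbitrary size for this sketch */
    size_t len = 0;
    int c;

    assert(in != NULL && out != NULL);

    /* %d skips leading whitespace. */
    do {
        c = getc(in);
    } while (c != EOF && isspace(c));

    /* Optional sign. */
    if (c == '+' || c == '-') {
        buf[len++] = (char)c;
        c = getc(in);
    }

    /* Collect digits, leaving room for the terminating '\0'. */
    while (c != EOF && isdigit(c) && len < sizeof buf - 1) {
        buf[len++] = (char)c;
        c = getc(in);
    }

    /* The first character that is not part of the number stays in the stream. */
    if (c != EOF)
        ungetc(c, in);

    buf[len] = '\0';

    /* At least one digit is required; a lone sign is not a number. */
    if (len == 0 || (len == 1 && !isdigit((unsigned char)buf[0])))
        return 0;

    /* The actual text-to-number step is delegated, as in the excerpt above. */
    *out = strtol(buf, NULL, 10);
    return 1;
}
```

Given the input `  -123abc`, this sketch stores -123 and leaves `abc` unread in the stream, which is also what scanf's `%d` does with that input.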

--
Mats

6. Originally Posted by matsp
Code:
```
{
    if (flags & NUMBER_SIGNED)
        num.l = __strtol_internal (wp, &tw, base, flags & GROUP);
    else
        num.ul = __strtoul_internal (wp, &tw, base, flags & GROUP);
}
```
Mats
Which leads to strtol's implementation, which I can find in glibc's strtol_l.c
Their implementation seems to be (somewhat) equivalent to mine:

Code:
```
use_long:
i *= (unsigned LONG int) base;
i += c;
//Then they test for errors and whatnot... Then this:
return negative ? -i : i;
```
Assuming, of course, base is 10, which is what the %d conversion expects. Of course, they also have code to handle other bases.
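For anyone who wants to see the "test for errors and whatnot" part spelled out, here is a simplified base-10 sketch in the same spirit as the excerpt above. It is not glibc's code: `parse_long10` is a hypothetical name, and the range handling assumes a two's complement `long`, where LONG_MIN equals -LONG_MAX - 1.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical function: a simplified, base-10-only cousin of strtol.
   Accumulates the magnitude in an unsigned long and applies the sign at
   the end. Assumes two's complement (LONG_MIN == -LONG_MAX - 1). */
long parse_long10(const char *s, const char **end)
{
    const char *start = s;
    unsigned long i = 0;
    unsigned long limit;
    int negative = 0, any_digits = 0, overflow = 0;

    assert(s != NULL);

    while (isspace((unsigned char)*s))
        ++s;
    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    /* A negative result may have a magnitude one larger than LONG_MAX. */
    limit = (unsigned long)LONG_MAX + (negative ? 1UL : 0UL);

    for (; isdigit((unsigned char)*s); ++s) {
        unsigned long c = (unsigned long)(*s - '0');
        any_digits = 1;
        /* Would i * 10 + c exceed the limit? Check before multiplying. */
        if (overflow || i > (limit - c) / 10) {
            overflow = 1;
        } else {
            i *= 10;
            i += c;
        }
    }

    /* Like strtol, report where parsing stopped (or the start if no digits). */
    if (end != NULL)
        *end = any_digits ? s : start;

    if (overflow) {
        errno = ERANGE;
        return negative ? LONG_MIN : LONG_MAX;
    }
    if (negative)
        return i == limit ? LONG_MIN : -(long)i;
    return (long)i;
}
```

The core of the loop is still the same two statements, `i *= 10; i += c;`. Everything else is whitespace, sign and range handling.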

7. Originally Posted by Mr_Miguel
Which leads to strtol's implementation, which I can find in glibc's strtol_l.c
Their implementation seems to be (somewhat) equivalent to mine:

Code:
```
use_long:
i *= (unsigned LONG int) base;
i += c;
//Then they test for errors and whatnot... Then this:
return negative ? -i : i;
```
Assuming, of course, base is 10, which is what the %d conversion expects. Of course, they also have code to handle other bases.
Yes, there aren't that many ways to convert a string to an integer. There are some more convoluted or stupid ways, but essentially you walk over the string, multiplying by the base and adding the current digit, until there are no more digits.

Float isn't much worse, but after the decimal separator, you need to keep track of a multiplier and divide the multiplier by 10 for each digit, and add "current digit times multiplier" (there are other ways, but that's roughly what it amounts to).

--
Mats

8. The implementation of C standard functions is obviously implementation specific. It only makes sense to ask about specific implementations.

For all we know, scanf() converts digits to integers by writing them on a card and passing it to a midget in a cedar chest, who computes the answer and delivers it via carrier pigeon.

9. Originally Posted by brewbuck
The implementation of C standard functions is obviously implementation specific. It only makes sense to ask about specific implementations.

For all we know, scanf() converts digits to integers by writing them on a card and passing it to a midget in a cedar chest, who computes the answer and delivers it via carrier pigeon.
Yes, that's entirely true. I'm sure that Microsoft's C library code is also quite complex, but it's most likely also quite different from the glibc code, and both of those are different from the C library supplied by Borland for their Turbo C compiler, of course.

Edit: and as far as I'm aware, none of the three C-libraries use cards, midgets or cedar chests, but one can never really know for sure.

--
Mats

10. I know what you both mean, but one might question: isn't the algorithm to convert digits the same in all of them, despite the differences in code? Or do those code differences actually lead to different algorithms?

Personally, I don't know any other algorithm than the one that has been seen on this thread. Regardless, glibc's implementation has somewhat satisfied my curiosity.

11. Originally Posted by Mr_Miguel
I know what you both mean, but one might question: isn't the algorithm to convert digits the same in all of them, despite the differences in code? Or do those code differences actually lead to different algorithms?

Personally, I don't know any other algorithm than the one that has been seen on this thread. Regardless, glibc's implementation has somewhat satisfied my curiosity.
I would expect that it's fairly similar, but perhaps not identical. There aren't that many different ways to convert a string into a number (excluding methods involving midgets and cedar chests or similar exotic solutions). You could of course find a scanf() where the number conversion isn't implemented as an external function, but as an inlined piece of conversion code, but the essence of it will be "multiply by base, add digit".
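To illustrate what such an inlined variant might look like, here is a sketch that converts digits straight from the stream, with no intermediate buffer and no call to strtoul. Both function names are hypothetical, the base is a parameter so the "multiply by base, add digit" step is visible, and overflow checking is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: value of character c as a digit, or -1 if it is
   not one. Assumes the letters are contiguous, as they are in ASCII. */
static int digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    return -1;
}

/* Hypothetical function: reads an unsigned number in `base` (2..36)
   directly from `in`. Returns the number of digits consumed and stores
   the value in *out only if at least one digit was read. */
int read_unsigned_inline(FILE *in, int base, unsigned long *out)
{
    unsigned long value = 0;
    int ndigits = 0;
    int c;

    assert(in != NULL && out != NULL && base >= 2 && base <= 36);

    while ((c = getc(in)) != EOF) {
        int d = digit_value(c);
        if (d < 0 || d >= base) {
            ungetc(c, in);         /* not ours: leave it for the next reader */
            break;
        }
        /* The essence: multiply by base, add digit. */
        value = value * (unsigned long)base + (unsigned long)d;
        ++ndigits;
    }

    if (ndigits > 0)
        *out = value;
    return ndigits;
}
```

With base 16 and the input `ff 17`, this reads two digits, stores 255, and leaves the space and `17` in the stream.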

--
Mats