I don't want it to drag on for days either. But I do want you to actually learn, and I think that can happen in much less than a day. I'm trying to get you to try this out on paper and go through the code by hand. Pretend you are the computer interpreting the programming instructions in your function. That is how you analyze code. It is a skill you must develop; it is critical to any programming work you will ever do. First, a quick announcement:
The comments in my code from post #5 say 100. That is incorrect, they should say 10.
Now, to your questions:
> What is the '\0' do?
It looks like it is supposed to be a "null terminator". A string in C is just a sequence of bytes that ends with a zero byte, the null terminator. A character in C is nothing more than a tiny integer (one byte, which is 8 bits on virtually every system you will meet). The character '\0' is a character (byte) with the numeric value 0. In the context of strings in C, it marks the end of the string. So it appears that this function attempts to store a string of some sort in buf. That's why I picked letters to store, because strings usually contain human-readable data.
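If it helps to see the "bytes ending in a zero byte" idea in action, here is a minimal sketch. The names count_until_nul and nul_demo are made up for illustration; they are not from your assignment. It only shows that counting bytes up to the '\0' gives the same answer as the standard strlen().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the bytes that come before the first
   zero byte, which is the same job the standard strlen() does. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical demo: a string spelled out one byte at a time. */
void nul_demo(void)
{
    char word[] = { 'c', 'a', 't', '\0' };  /* same bytes as the literal "cat" */

    assert(sizeof word == 4);               /* 3 letters + 1 terminator */
    assert(count_until_nul(word) == 3);     /* the terminator is not counted */
    assert(strlen(word) == 3);
    assert(word[3] == 0);                   /* '\0' is just the number 0 */
}
```

Notice that the array needs 4 elements to hold a 3-letter string. Keep that in mind when you think about how much room a buffer has.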
> Ok, so when you say assign letters etc.....The array is 10 or the length of the buffer is 10. Which means....0,1,2,3,4,5,6,7,8,9 ......right?
I declared the array to have 10 elements. The valid indexes (numbers I can put inside the [ ]) are 0 through 9, so you are correct there. You can say the array has 10 elements, or the buffer has length 10, they both mean the same thing. If you ever try to do mybuf[-1] or mybuf[10], that is a buffer "overflow" or "overrun" (some people call negative indexes "underflows" or "underruns").
> From what I comprehended in your advice....i finishes with the value of 9. And I presume the buffer (buf[i]) would contain the letter j??????
You're half right. i doesn't "finish" with the value 9, but buf[9] would contain a 'j'. Here's one way to make buf contain the alphabet*:
Code:
int grab_request (char buf[], int buf_len)
{
    int i;
    for (i=0; i< buf_len; i++) {
        buf[i] = 'a' + i; // this will put 'a' in buf[0], 'b' in buf[1], 'c' in buf[2], etc
    }
    buf[i] = '\0';
    return i;
}
So a common way to analyze and step through code by hand is to make a table where each column is a variable or expression from your code, and each row represents the state of all the variables after each instruction in the code is processed.
Code:
statement executed        | i | buf                   | buf_len | i< buf_len
--------------------------+---+-----------------------+---------+-----------
start of function         | ? | ?                     | 10      | ?
line 4: i=0;              | 0 | ?                     | 10      | 1 (true)
line 4: i< buf_len        | 0 | ?                     | 10      | 1 (true)
line 5: buf[0] = 'a' + 0  | 0 | 'a', ?, ?, ...        | 10      | 1 (true)
back to top of loop       |   |                       |         |
line 4: i++               | 1 | 'a', ?, ?, ...        | 10      | 1 (true)
line 4: i< buf_len        | 1 | 'a', ?, ?, ...        | 10      | 1 (true)
line 5: buf[1] = 'a' + 1  | 1 | 'a', 'b', ?, ?, ...   | 10      | 1 (true)
back to top of loop       |   |                       |         |
line 4: i++               | 2 | 'a', 'b', ?, ?, ...   | 10      | 1 (true)
line 4: i< buf_len        | 2 | 'a', 'b', ?, ?, ...   | 10      | 1 (true)
line 5: buf[2] = 'a' + 2  | 2 | 'a', 'b', 'c', ?, ... | 10      | 1 (true)
back to top of loop       |   |                       |         |
The question marks represent unknown values, usually from variables that haven't been initialized yet. Remember, when you pass an array, the original array (mybuf), and the parameter you pass it in to (buf) are the same array, they refer to the same exact place in memory. So if you change buf in your function, you change mybuf too. Initially, all 10 spots in mybuf are uninitialized, therefore buf is also uninitialized. As we go through our loop, we start putting known values into the spots in buf, hence we know 'a', 'b', 'c', etc as we keep repeating our loop.
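To convince yourself that buf and mybuf really are the same memory, here is a small sketch. The functions set_first, where_is and alias_demo are hypothetical names I made up for this illustration, and I start mybuf with known contents only so the change is easy to see.

```c
#include <assert.h>

/* Hypothetical function: writes through its array parameter. */
void set_first(char buf[])
{
    buf[0] = 'a';
}

/* Hypothetical function: reports the address the parameter refers to. */
char *where_is(char buf[])
{
    return buf;
}

/* Hypothetical demo: the caller's array changes when the function writes to buf. */
void alias_demo(void)
{
    char mybuf[10] = "zzz";          /* known starting contents, for the demo only */

    assert(where_is(mybuf) == mybuf); /* the parameter refers to mybuf itself */
    set_first(mybuf);
    assert(mybuf[0] == 'a');          /* the write inside the function is visible here */
    assert(mybuf[1] == 'z');          /* untouched spots keep their old value */
}
```

No copy of the array is made when you pass it. The function only receives the location of the first element, which is why it also needs buf_len to know how many spots there are.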
Remember, buf[9] is the last spot in the array, buf[10] is wrong. Also remember, your for loop will continue until the i< buf_len part of your for loop is checked, and it is 0 (false). Only after that will it stop.
Just keep on filling in the rows of that table until you get to the return i; line of code. Once you do this a few times, it becomes much easier and you can do it in your head. Then this stuff goes much quicker. Hopefully that clears up what you are supposed to be doing here and helps you figure out what vulnerabilities there might be and how to fix them.
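Once you have filled in the table by hand, you could check your rows against the machine. This is only a rough sketch under my own assumptions: trace_loop is a hypothetical helper, it traces just the for loop (not the two lines after it), and the caller has to hand it an array with at least buf_len elements.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the original code: runs only the
   for loop and prints one table row per step. Returns the number of
   rows printed. The caller must supply at least buf_len elements. */
int trace_loop(char buf[], int buf_len)
{
    int rows = 0;
    int i = 0;

    printf("i=0;        | i=%d\n", i);
    rows++;
    for (;;) {
        int cond = (i < buf_len);
        printf("i< buf_len  | i=%d | %d (%s)\n", i, cond, cond ? "true" : "false");
        rows++;
        if (!cond) {
            break;                  /* the loop stops only after this check fails */
        }
        buf[i] = (char)('a' + i);
        printf("buf[%d]='%c'  | i=%d\n", i, buf[i], i);
        rows++;
        i++;
        printf("i++         | i=%d\n", i);
        rows++;
    }
    return rows;
}
```

Try it with something small, for example a char scratch[8] and trace_loop(scratch, 3), and compare the printed rows with your paper table. Pay attention to the very last row it prints.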
As for analyzing integer overflow, the process is similar. Pick some test values for buf_len (like those I mentioned in post #4) and go through your table. Does i ever exceed the maximum integer value? Note, you can pretend that INT_MAX is something small like 5 and INT_MIN is -5, so you don't have billions of lines in your table. You just want to be sure that, for any possible value of buf_len, i never goes beyond 5 in any way, that it never "makes it to 6", which would signify an overflow.
I hope that's clear and helpful -- my brain is starting to shut down. It's getting late here, I worked a 14 hour day and have laundry to finish. Not sure if I'll check this again before morning, but that should give you plenty to chew on.
* This only works with ASCII or compatible character sets (where the letters of the alphabet are sequential), and could result in problems if buf_len is large (more than 26), but we won't worry about that for now.