# Thread: Input Validation

1. ## Input Validation

I'm trying to validate my input. I require for the user to enter six doubles and if they don't then I want them to re-enter the information. Here is my code:

Code:
```while (1>0)
{
printf("Please enter arguments in the order: negative mass, positive mass, initial x-position, initial y-position, initial x-velocity, initial y-velocity:\n");

if ( scanf("%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)
{
printf("Not all numbers were assigned!\n");
continue;
}
if (MassMinus < 0 || MassPlus < 0)
{
printf("Masses must be positive!\n");
continue;
}
if ( (Pos[0][0]-1)*(Pos[0][0]-1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the positive mass and will collide instantly!\n");
continue;
}
if ( (Pos[0][0]+1)*(Pos[0][0]+1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the negative mass and will collide instantly!\n");
continue;
}
break;
}```
At the moment it just waits if you enter less than six numbers and if you enter any more than 6 it just ignores anything after the sixth number (so pretty much does nothing). Also if I entered 1 2 a b 3 4 instead of entering six numbers it would register that as 1 2 0 0 3 4 but I want it to make the user input the numbers again. I'm also aware that "while (1>0)" isn't good programming form but I'm not really sure what to use instead? Any advice on any of that would be really useful!

2. Originally Posted by thay2302
At the moment it just waits if you enter less than six numbers and if you enter any more than 6 it just ignores anything after the sixth number (so pretty much does nothing).
That's not true. Anything after the 6th value will not be read by the first scanf() call, but later checks cause the program to loop back, and they will be read in next time through the loop.

Originally Posted by thay2302
Also if I entered 1 2 a b 3 4 instead of entering six numbers it would register that as 1 2 0 0 3 4 but I want it to make the user input the numbers again.
Also not true. If scanf() encounters the 'a' it will stop immediately and return, with a return value other than 6. The 'a' will be left pending to be read by a subsequent call of scanf(). Since your code loops around and calls scanf() again, that will happen forever. The user will see the program continually prompting for arguments and then a message that not all numbers were assigned, over and over, without having a chance to enter more input.

You've be better off reading a complete line of characters of input from the user, before attempting to do anything with it. For example, use fgets() to read a line of characters, then sscanf() [note the additional 's' in the name] to interpret the string. The advantage of this is that, if a line is invalid, you have a choice of either discarding the whole line or just discarding the bad bits. This differs from your current code, which doesn't cope well with bad input midway through a line. Bear in mind that fgets() can cope with excessive input, but you'll need to write code to deal with it.

Originally Posted by thay2302
I'm also aware that "while (1>0)" isn't good programming form but I'm not really sure what to use instead?
You just want an infinite loop. A "while (1)" will suffice. A more common technique is "for(;;)" since, in a for loop, the continuation condition (between first and second semi-colon) is always true if not specified.

Personally, I wouldn't do it that way either, and I'd try to eliminate the continue and break statements (which are glorified "goto" statements so, like goto, make flow of execution harder to understand).

Noting the concerns with using scanf() - which you ALSO need to address, but I won't in this example - I'd do something like
Code:
```   int bad_input = 1;   /*  assume input is bad until we are convinced it is not */
do
{
printf("Please enter arguments in the order: negative mass, positive mass, initial x-position, initial y-position, initial x-velocity, initial y-velocity:\n");

if ( scanf("%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)
{
printf("Not all numbers were assigned!\n");
}
else if (MassMinus < 0 || MassPlus < 0)
{
printf("Masses must be positive!\n");
}
else if ( (Pos[0][0]-1)*(Pos[0][0]-1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the positive mass and will collide instantly!\n");
}
else if ( (Pos[0][0]+1)*(Pos[0][0]+1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the negative mass and will collide instantly!\n");
}
else
{
bad_input = 0;     /*  if we reach here, we're happy with the input */
}
This is a reversal of philosophy. Your code is assuming the input is good, until you find evidence it is bad. My approach assumes input is bad, until I have evidence it is good - every possible case of bad data is checked first. My approach is also more easily extended with additional checks for bad data (just add another "else if" test/block).

3. Thanks for your help, most of that makes a lot of sense to me and has really helped me.

However, I'm struggling to use the fgets() function. I don't understand exactly what you mean when you say "let sscanf interpret the string"? How do I make sscanf interpret the string? Also where exactly within the do loop would I put the fgets() and sscanf() functions? I'm a bit lost with how this is all supposed to work.

4. Use some combination of fgets() and sscanf() to replace the scanf() call. This might involve a number of lines of code (i.e. one scanf() statement might be replaced by a set of statements that call fgets(), check if it succeeded, calls sscanf() [only if the fgets() succeeded], etc).

Look up the documentation for sscanf(). The first argument is an array of characters to be interpreted. The following arguments are as for scanf() - format string, other arguments. The only difference is that sscanf() takes input from an array of characters, fscanf() takes input from a file, and scanf() takes input from stdin (a particular device, which might or might not be a file).

The reason for using fgets()/sscanf() is that it is easier to cope with errors, because you're able to look at the input in any way you like, in any order you like. It is more difficult to do that with scanf(), because scanf() handles input sequentially, which makes it harder to recover if the user types a mix of good and bad data. And users, pesky and contrary entities that they are, cannot be forced to enter good input.

5. Okay well I've now got this which just appears to make the program wait and I can't work out why. Am I using the functions correctly?

Code:
```    bad_input = 1;

do
{
printf("Please enter arguments in the order: negative mass, positive mass, initial x-position, initial y-position, initial x-velocity, initial y-velocity:\n");

if ( fgets(str,60,stdin) != NULL )
{
if ( sscanf(str,"%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)
{
printf("Numbers were not entered correctly!\n");
}

if ( scanf("%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)
{
printf("Not all numbers were assigned!\n");
}
else if (MassMinus < 0 || MassPlus < 0)
{
printf("Masses must be positive!\n");
}
else if ( (Pos[0][0]-1)*(Pos[0][0]-1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the positive mass and will collide instantly!\n");
}
else if ( (Pos[0][0]+1)*(Pos[0][0]+1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the negative mass and will collide instantly!\n");
}
else
{
bad_input = 0;     /*  if we reach here, we're happy with the input */
}

}

6. scanf is NOT sscanf!

Code:
`if ( scanf("%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)`
Tim S.

7. I said REPLACE the scanf() call. Not do it as well as something else.

And it is not necessary to nest the if's.

8. So is this now correct?

Code:
```    bad_input = 1;
do
{
printf("Please enter arguments in the order: negative mass, positive mass, initial x-position, initial y-position, initial x-velocity, initial y-velocity:\n");

if ( fgets(str,90,stdin) == NULL )
{
printf("Numbers were not entered correctly! Please try again.\n");
}
else if ( sscanf(str,"%lf %lf %lf %lf %lf %lf",&MassMinus,&MassPlus,&Pos[0][0],&Pos[0][1],&Vel[0][0],&Vel[0][1]) != 6)
{
printf("Numbers were not entered correctly! Please enter 6 real numbers.\n");
}
else if (MassMinus < 0 || MassPlus < 0)
{
printf("Masses must be positive!\n");
}
else if ( (Pos[0][0]-1)*(Pos[0][0]-1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the positive mass and will collide instantly!\n");
}
else if ( (Pos[0][0]+1)*(Pos[0][0]+1)+Pos[0][1]*Pos[0][1] < Error*Error)
{
printf("Initial position is too close to the negative mass and will collide instantly!\n");
}
else
{
bad_input = 0;     /*  if we reach here, we're happy with the input */
}

9. At the risk of being cryptic, it's not necessarily incorrect. Your new code is certainly much more robust to errors than your original code. You need to decide if it is robust enough.

To characterise a bit more precisely, if str is an array of 90 (or more) characters (you haven't shown that) and the user doesn't type more than 90 characters on a line, your code will cope with a number of scenarios where the user provides bad input. The code might emit a multiple lines of error message for a single line of input that contains bad data, and it might sometimes discard data the user (who may be irrational enough to expect your program to cope with bad data but still not discard all the good data) thinks shouldn't be discarded. Whether those are good or bad things (i.e. correct or not) depends on your requirements.

I'm going to point out a couple of ways in which your code might behave in ways you haven't intended.

Consider what would happen if the user enters "1 2 3 4 5 6" and then holds down the 'a' key (so a string of 100 or a's are typed) before hitting newline. Your original code would have looped forever with that input. You latest code will survive that (albeit it would loop a couple of times producing error messages) but would also discard the data "1 2 3 4 5 6". Is that what you intend? Is it acceptable.

What would happen if the declaration of str (which you haven't shown) was changed so str becomes an array of 20 characters, but you forgot to change the "90" in the fgets() call to "20"? The answer is a buffer overrun. The solution, assuming str is an actual array (not a pointer) would be to test "if (fgets(str, sizeof str, stdin) == NULL)". If str is a pointer (for example, an array is passed to your code as a function argument) the size of the array also needs to be supplied to fgets(). Supplying a value of 90 when the actual size is 20 gives undefined behaviour.

What do you want to do if the user enters 3 values on one line, enters newline, then enters 3 more values? The code, as it stands, will discard all six values. That is not necessarily wrong (after all, it is a result if the user has entered data that isn't in the required format). But is it acceptable for your program?

Where I'm going with all this is that you need to decide how far you go with data validation. You need to consider requirements for coping with bad input (what types of bad input does your code need to tolerate?), requirements for user interaction (if a user enters bad data, is it okay to emit a page of error messages, or is only one error message acceptable?), requirements for data integrity (if your program discards data, how does that affect what it does with subsequent data?) and requirements for maintainability (what happens if you need to change a buffer size because your program has to run in an environment short on memory, but you don't want to introduce inadvertent bugs?).

The more stringent your requirements are, the more effort you need to put in - in terms of getting the right set of requirements, more code to meet those requirements, and more checks (test cases, etc) to see if the code meets all the requirements.

Coping with good data (correctly formatted, no extraneous characters, the user follows the rules, etc) is the easy part.

10. I think while it is not perfect, it is sufficient for the program I'm writing. Thank you for your help on input validation. This will certainly come in handy in the future when I write further programs.

If possible I'd like to ask about something you mentioned in one of your earlier posts. You said that if I enter more than six characters and then an "a", the "a" will be left pending by scanf(). Does this work similarly to fgets()?

To be more specific, I'm trying to write some code at the bottom of my program which will calculate some runtime statistics. At the moment I'm trying to do this by writing

Code:
```    printf ("Would you like a set of runtime statistics? y/n\n");
if ( fgets (str,90,stdin) == "y" )
{
// Write some runtime statistics
}```
However, this doesn't work. Do I need to "clear" the previous input somehow?

11. Originally Posted by thay2302
You said that if I enter more than six characters and then an "a", the "a" will be left pending by scanf().
I didn't quite say that, but no matter.
Originally Posted by thay2302
Does this work similarly to fgets()?
No.

fgets() extracts characters from the file it is reading (stdin, in your code) until it either reaches the end of the buffer you supply, or encounters a newline (which is stored in the buffer)

The behaviour of scanf() depends on the format string. But it works by looking ahead to the next character of pending input. If that next character is unexpected (for example, using %f format to read a float, and that next character wouldn't make sense in a floating point value) then it is left pending. The next call of scanf() will encounter the same character.

Originally Posted by thay2302
To be more specific, I'm trying to write some code at the bottom of my program which will calculate some runtime statistics. At the moment I'm trying to do this by writing

Code:
```    printf ("Would you like a set of runtime statistics? y/n\n");
if ( fgets (str,90,stdin) == "y" )
{
// Write some runtime statistics
}```
However, this doesn't work. Do I need to "clear" the previous input somehow?
Not necessarily. That depends on what your previous code was doing to stdin. If all of your previous code has been using fgets() consistently (not mixing fgets() with scanf(), for example) then you shouldn't need to.

However, the real reason your code doesn't work is because the equality operator (==) should not be used to compare arrays, even if such comparisons compile (as pointer comparisons). Look up the strcmp() function (which is declared in the standard header <string.h>.

The second reason your code won't work is because of the way fgets() works. Specifically, you need read the documentation on what fgets() does when it encounters a newline character. Let's assume - before this little bit of code - the last call of fgets() had encountered a newline. That means the 'y' will be the first character encountered by this call of fgets(), and the second will be a carriage return (with the value equal to '\n'). str will then have three characters written to it - the 'y', the '\n', and a terminating zero.

12. Okay great, I see my error with that now I think. I've gone for this instead now and it seems to be working.

Code:
`    if ( strcmp ("y\n",fgets (str,90,stdin)) == 0)`

Popular pages Recent additions