Originally Posted by phantomotap
If you are going to try to recover gracefully in the face of "user error" nothing shown so far is sufficient.
The suggestion by Adak causes a buffer underrun bug. It reads the byte just before the buffer, and if that byte equals 10 ('\n' in ASCII), it replaces it with a zero.
My suggestion does not have that bug. I would also be happy with
Code:
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n')
    line[--len] = '\0';
which, thanks to the short-circuit evaluation of && in C, does not try to access the byte just before the buffer, either. Also, len ends up with the correct value in all cases (even when the line was longer than the available buffer and was only partially read).
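To make the edge cases concrete, here is a minimal sketch that wraps the snippet above in a helper. The name strip_newline is made up for illustration; it is not from any library or from the earlier posts. It assumes the caller has already checked that fgets() did not return NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes at most one trailing newline from line
 * and returns the resulting string length. */
static size_t strip_newline(char *line)
{
    size_t len;

    /* The caller is expected to have checked the fgets() result already. */
    assert(line != NULL);

    len = strlen(line);

    /* len > 0 is tested first, so line[len - 1] is never evaluated
     * for an empty string; no byte before the buffer is touched. */
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    return len;
}
```

With "hello\n" this yields "hello" and returns 5; with an empty string, or a partially read line that has no newline, the string is left untouched and its length is returned as is.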
Originally Posted by phantomotap
The `line[strlen(line)]' character will never be a newline character unless something goes horribly wrong.
Right. It will always be a NUL character. It is safe to replace it with another NUL character.
(.. assuming line != NULL. Which should be checked right after the fgets() call anyway.)
Originally Posted by phantomotap
All joking aside, the code forwarded by you will not get rid of the newline in the "\0bad input\n" form of string so it is just as likely to cause a serious problem at a later point.
My code does not try to access the char just before the buffer, though.
Given that input, my code yields *line == '\0' (and strlen(line) == 0), i.e. an empty string, but nothing bad happens.
To be honest, I did first write the proper solution, using POSIX.1-2008 getline(), but I thought it would either scare the OP, or I'd just get some negative feedback about it, so I rewrote my reply before posting it.
Code:
char   *data = NULL;
size_t  size = 0;
char   *line;
size_t  len;
ssize_t full;

/* Some form of input loop; in stands for the FILE * being read. */
while (1) {

    full = getline(&data, &size, in);
    if (full < (ssize_t)1)
        break; /* End of file or error. */

    /* Input sanitization: clean() returns nonzero for unsafe input. */
    if (clean(data, (size_t)full, &line, &len))
        continue; /* Input is suspect: ignore, warn, or abort. */

    /* Ignore empty lines (and comments, if cleaned) */
    if (len < 1)
        continue;

    /* Now have len chars at line. */
}

free(data);
data = NULL;
size = 0;
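The reason getline() helps with the "\0bad input\n" case is that it returns the number of bytes actually read, so anything after an embedded NUL is still visible to the caller. Here is a rough sketch of that idea. The helper name count_lines_with_nul is invented for this example, and the code assumes a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: counts the lines read from in that contain
 * at least one embedded NUL byte. */
static long count_lines_with_nul(FILE *in)
{
    char   *data = NULL;
    size_t  size = 0;
    ssize_t full;
    long    suspect = 0;

    assert(in != NULL);

    while ((full = getline(&data, &size, in)) > 0) {
        /* full is the true byte count, so memchr() can look past
         * the point where strlen() would have stopped. */
        if (memchr(data, '\0', (size_t)full) != NULL)
            suspect++;
    }

    free(data);
    return suspect;
}
```

With fgets() alone the same check is not possible, because strlen() stops at the first NUL and the real length of the line is lost.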
The input sanitization function,
Code:
int clean(char *const input, const size_t inputlen, char **const result, size_t *const resultlen);
takes the inputlen bytes of input at input, removes unwanted characters (or at least checks the string), saves the start and the length of the cleaned contents via the last two pointers (each is skipped if NULL), and returns 0 if the input was safe, and nonzero otherwise.
The implementation details are very domain-specific. You might want to simply remove all embedded NUL bytes and control characters, and replace consecutive whitespace with a single space; this works well with semi-interactive human input, say a game. In other cases you might wish to replace NULs and (some) other ASCII control characters with escape codes (for example, ASCII NULs with "\\0" or with "&#0;", the numeric HTML character reference for the ASCII NUL character).
If there is interest, I'd be happy to show a couple of practical examples.
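As one illustration of the first variant described above, here is a minimal sketch of clean() that works in place. The policy is an assumption made for this example only: embedded NULs and control characters are dropped, runs of whitespace collapse to a single space, leading and trailing whitespace is removed, and only an embedded NUL makes the function report the input as unsafe. It also assumes the buffer has room for inputlen + 1 bytes, which holds for a buffer filled by getline().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch only: cleans input[0 .. inputlen-1] in place.
 * Returns 0 if the input looked safe, nonzero if a NUL byte was embedded. */
int clean(char *const input, const size_t inputlen,
          char **const result, size_t *const resultlen)
{
    size_t in, out = 0;
    int    suspect = 0;   /* set when an embedded NUL is seen        */
    int    pending = 0;   /* a whitespace run is waiting to be emitted */

    assert(input != NULL);

    for (in = 0; in < inputlen; in++) {
        const unsigned char c = (unsigned char)input[in];

        if (c == '\0') {
            suspect = 1;              /* drop it, but remember it */
        } else if (isspace(c)) {
            pending = 1;              /* collapse the whole run   */
        } else if (iscntrl(c)) {
            /* other control characters are silently dropped */
        } else {
            if (pending && out > 0)   /* no space before the first char */
                input[out++] = ' ';
            pending = 0;
            input[out++] = (char)c;
        }
    }
    input[out] = '\0';                /* out <= inputlen, so this fits */

    if (result)
        *result = input;
    if (resultlen)
        *resultlen = out;

    return suspect;
}
```

Given "  hello \t world\n" this produces "hello world" with a length of 11 and returns 0; given "\0bad input\n" it produces "bad input" and returns nonzero, so the caller can decide whether to ignore the line, warn, or abort.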