The way I'd solve this is to move the parsing of each kernel spec (each one a separate command-line argument in the argv[] array) into a helper function. I'd also use a separate structure for each kernel, just to make it easier:
Code:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
struct kernel {
    char *image;
    double gaussian;
    double sigma;
};

static void kernel_init(struct kernel *const k)
{
    /* Set defaults: */
    k->image = NULL;
    k->gaussian = 0.0;
    k->sigma = 0.0;
}

static void kernel_free(struct kernel *const k)
{
    /* Free dynamically allocated strings: */
    free(k->image);
    /* Set "invalid" values: */
    k->image = NULL;
    k->gaussian = 0.0;
    k->sigma = 0.0;
}
The static means the functions are only visible in the current file. I used it because the struct kernel is also only defined in this file. void means the functions do not return anything.
The kernel_init() function is used to initialize a kernel before use. This is useful, because then you only need to set sensible initial values in one place. After the kernel is no longer needed anywhere, you can use the kernel_free() function to destroy it.
It is not necessary to destroy kernels before the program exits, as the operating system will release the resources anyway when the program exits. However, if you use many kernels, or the program is long-lived, it is a good idea to become accustomed to initializing and freeing resources as they are needed; many find this practice harder to learn later on.
Next, let's declare a kernel string parser:
Code:
#define PARSED_IMAGE (1 << 0)
#define PARSED_GAUSSIAN (1 << 1)
#define PARSED_SIGMA (1 << 2)
#define PARSED_ALL (PARSED_IMAGE | PARSED_GAUSSIAN | PARSED_SIGMA)
static int parse_kernel(const char *s, struct kernel *const k);
The idea is that parse_kernel() takes a string (such as "image=image1.bmp; gaussian=4; sigma=1"), parses it into the structure pointed to by k, and returns which fields were parsed from the string. If an error occurs, I'd return 0 with errno set to indicate the reason.
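Since the return value is a bit mask, the caller can test individual fields with & and find out what is still missing. Here is a minimal sketch of that; the PARSED_ macros are copied from above, but missing_fields() is a hypothetical helper of my own for illustration, not something the rest of this post relies on:

```c
/* Same flag values as declared above. */
#define PARSED_IMAGE    (1 << 0)
#define PARSED_GAUSSIAN (1 << 1)
#define PARSED_SIGMA    (1 << 2)
#define PARSED_ALL      (PARSED_IMAGE | PARSED_GAUSSIAN | PARSED_SIGMA)

/* Hypothetical helper: given the mask returned by parse_kernel(),
 * return the mask of fields that were NOT parsed.
 * The result is zero only if every field was supplied. */
static int missing_fields(const int parsed)
{
    return PARSED_ALL & ~parsed;
}
```

For example, if only the image was supplied, missing_fields(PARSED_IMAGE) has both the PARSED_GAUSSIAN and PARSED_SIGMA bits set, so a caller could complain about exactly those two fields.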
To test the function, let's write a test main(), so we can test our hard-to-write function early and often:
Code:
int main(int argc, char *argv[])
{
    int arg, parsed;
    struct kernel k;

    for (arg = 1; arg < argc; arg++) {
        kernel_init(&k);
        parsed = parse_kernel(argv[arg], &k);
        if (!parsed) {
            fprintf(stderr, "%s: Could not parse kernel: %s.\n", argv[arg], strerror(errno));
            fflush(stderr);
        } else {
            printf("Kernel '%s':\n", argv[arg]);
            if (parsed & PARSED_IMAGE)
                printf("\tImage '%s'\n", k.image);
            else
                printf("\tImage was not supplied.\n");
            if (parsed & PARSED_GAUSSIAN)
                printf("\tGaussian %g\n", k.gaussian);
            else
                printf("\tGaussian was not specified.\n");
            if (parsed & PARSED_SIGMA)
                printf("\tSigma %g\n", k.sigma);
            else
                printf("\tSigma was not specified.\n");
        }
        kernel_free(&k);
    }

    return EXIT_SUCCESS;
}
There are several ways the parse_kernel() function can handle its task: separating into tokens using strtok() and parsing each token using sscanf(), opportunistic parsing using sscanf() and %n to see how much of the string was consumed, or using strspn() and strcspn() for tokenization and strncmp() for matching names and sscanf() for converting values. These functions are documented in the Linux man pages online, but they are portable (C89 is listed in the "Conforming to" sections) and not at all Linux-specific.
Typically, coursework solutions use the tokenization approach. I don't like it myself, because it modifies the string in-place, and that's sometimes problematic. I like the opportunistic parsing, but its downside is that whitespace is hard to control: whitespace in the sscanf() pattern matches any amount of whitespace in the input (including none), and most conversions skip leading whitespace, so the pattern cannot require or forbid whitespace at a given position. Although the third option is probably the most complex, I'll show you that one; that way there is the most opportunity for learning, and you're more likely to write your own code than to just copy mine.
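To show what I mean by strtok() modifying the string in-place, here is a small sketch. count_tokens() is a hypothetical helper written just for this illustration; it only counts the semicolon-separated parts, but note what it does to the buffer:

```c
#include <string.h>

/* Hypothetical helper: count the ';'-separated tokens in buf.
 * strtok() overwrites each separator it finds with '\0',
 * so buf is NOT the same string after this call. */
static int count_tokens(char *buf)
{
    int   count = 0;
    char *token;

    for (token = strtok(buf, ";"); token != NULL; token = strtok(NULL, ";"))
        count++;

    return count;
}
```

If buf initially contains "image=a.bmp;gaussian=4;sigma=1", the function returns 3, but afterwards buf reads as just "image=a.bmp", because the first semicolon has been replaced with a string terminator. It also means you cannot pass a string literal or any other read-only string; you would have to make a copy first.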
Here, only the image field is a dynamically allocated string, but you often have several. To update those fields, I like to use a helper function. If the field has already been set, the helper function will free the old value before replacing it, so it won't leak any memory. (Consider what would otherwise happen with e.g. "image=a.png;image=b.png;...;image=z.png".) This function takes a pointer to the string pointer, the source string pointer, and the length (since the value is most likely delimited by a separator rather than by the end of the source string):
Code:
static int set_string(char **const dst, const char *const src, const size_t len)
{
    /* We must be given a valid pointer to a string pointer. */
    if (dst == NULL)
        return errno = EINVAL;

    /* If len is nonzero, we need a valid source string pointer, too. */
    if (len > 0 && src == NULL)
        return errno = EINVAL;

    /* Free the old destination string, if any. */
    if (*dst != NULL)
        free(*dst);

    /* Allocate memory for a new string. */
    *dst = malloc(len + 1);
    if (*dst == NULL)
        return errno = ENOMEM; /* Failed, out of memory. */

    /* Copy the source string, and terminate the string. */
    if (len > 0)
        memcpy(*dst, src, len);
    (*dst)[len] = '\0';

    /* Done! */
    return 0;
}
The above function returns 0 if successful, otherwise it returns nonzero with errno set to indicate the error (ENOMEM = not enough memory, or EINVAL = invalid parameters). Note carefully that it takes a pointer to the destination string pointer, because it needs to change that pointer!
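If the pointer-to-pointer part seems odd, this stripped-down sketch may help. Both functions are hypothetical and exist only to show the difference; neither allocates anything:

```c
/* Receives a COPY of the caller's pointer.
 * Changing the copy has no effect on the caller's variable. */
static void set_copy(const char *p)
{
    p = "changed";
    (void)p; /* Only silences "value never used" warnings. */
}

/* Receives the ADDRESS of the caller's pointer,
 * so it can make the caller's variable point elsewhere. */
static void set_through(const char **const p)
{
    *p = "changed";
}
```

With const char *s = "original";, calling set_copy(s) leaves s pointing to "original", whereas set_through(&s) makes s point to "changed". set_string() needs the latter behaviour, because it replaces the caller's pointer with the newly allocated copy.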
The parser function I'd use is as follows:
Code:
static int parse_kernel(const char *s, struct kernel *const k)
{
    int parsed = 0;
    double value;
    int len;

    if (s == NULL || k == NULL) {
        /* Fail due to invalid parameters. */
        errno = EINVAL;
        return 0;
    }

    while (1) {

        /* Skip separator characters. */
        s += strspn(s, "\t\n\v\f\r ;");

        /* End of string? */
        if (*s == '\0')
            break;

        if (!strncmp(s, "image=", 6)) {
            /* image=... */
            s += 6;
            /* find separator (or end of string, if no separator) */
            len = strcspn(s, ";");
            /* Set k->image to (a copy of) len chars starting at s. */
            if (set_string(&(k->image), s, len))
                return 0; /* Error occurred, errno already set. */
            /* Mark 'image' field updated, and advance to next part. */
            parsed |= PARSED_IMAGE;
            s += len;
            continue;
        }

        if (!strncmp(s, "gaussian=", 9)) {
            /* gaussian=... */
            s += 9;
            /* Attempt to parse the value as a number. */
            len = -1;
            (void)sscanf(s, " %lf %n", &value, &len);
            if (len > 0) {
                /* Save value. */
                k->gaussian = value;
                /* Mark 'gaussian' field updated, and advance to next part. */
                parsed |= PARSED_GAUSSIAN;
                s += len;
                continue;
            }
        }

        if (!strncmp(s, "sigma=", 6)) {
            /* sigma=... */
            s += 6;
            /* Attempt to parse the value as a number. */
            len = -1;
            (void)sscanf(s, " %lf %n", &value, &len);
            if (len > 0) {
                /* Save value. */
                k->sigma = value;
                /* Mark 'sigma' field updated, and advance to next part. */
                parsed |= PARSED_SIGMA;
                s += len;
                continue;
            }
        }

        /* Unknown field. For future compatibility, we skip
         * but also warn about such unknown fields. */

        /* First, find the length of the name, */
        len = strcspn(s, "=;");
        /* and write the warning to standard error. */
        if (len > 0) {
            fwrite(s, len, 1, stderr);
            fprintf(stderr, ": Ignoring unknown kernel identifier.\n");
            fflush(stderr);
        }

        /* Find the length of the ignored field, and skip to next part. */
        len = strcspn(s, ";");
        if (len > 0) {
            s += len;
            continue;
        }

        /* Parsing error. We parsed nothing, but can't skip anything either,
         * and we're not at the end of the string, either. */
        errno = EBADMSG;
        return 0;
    }

    /* We set errno to 0 (no error, OK) in case parsed == 0. */
    errno = 0;
    return parsed;
}
The logic is that in the infinite loop, we first skip any separators (semicolons) and whitespace characters that are allowed before a field name. If after that we are at the end of the string, the loop is done.
In the loop body, we use strncmp(s, "IDENTIFIER=", 11) to check if s starts with the 11-character prefix string "IDENTIFIER=". If it does, strncmp() returns zero. You can read if (!strncmp()) ... as "if strncmp() returns zero, then ..."; i.e. the if body is only executed if the string does start with the desired prefix.
If the prefix matches, the matching part is skipped with s += 11;. I could have written a helper function to hide these details, but I think seeing them spelled out is useful when learning this stuff.
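For the curious, such a helper might look something like the following. This is only a sketch; skip_prefix() is a hypothetical name of mine and is not used by parse_kernel() above:

```c
#include <string.h>

/* Hypothetical helper: if s starts with prefix, return a pointer
 * to the first character after the prefix; otherwise return NULL.
 * The prefix length is computed here, so the caller does not need
 * to count characters by hand. */
static const char *skip_prefix(const char *s, const char *prefix)
{
    const size_t n = strlen(prefix);

    if (strncmp(s, prefix, n) == 0)
        return s + n;

    return NULL;
}
```

With this, skip_prefix("image=a.bmp", "image=") returns a pointer to "a.bmp", while skip_prefix("sigma=1", "image=") returns NULL. The hand-counted lengths (6, 9, 6) disappear, which removes one easy way to make a typo.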
After the prefix has been found, we know we have the desired value starting at s. We can also use strcspn(s, ";") to count the number of characters until the next semicolon (not including that semicolon itself), or until the end of the string if there are no semicolons left. I use this to find out the image name length, which is passed to the set_string() helper above.
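Here is how strspn() and strcspn() work together, pulled out into a tiny sketch. next_field() is a hypothetical helper for illustration only; it uses the same separator set as the parser:

```c
#include <string.h>

/* Hypothetical helper: skip leading whitespace and semicolons in s,
 * then store in *len the number of characters up to (but not
 * including) the next semicolon or the end of the string.
 * Returns a pointer to the start of the field.
 * Nothing is modified or copied; we only compute offsets. */
static const char *next_field(const char *s, size_t *const len)
{
    /* strspn(): length of the initial run made up ONLY of these characters. */
    s += strspn(s, "\t\n\v\f\r ;");

    /* strcspn(): length of the initial run containing NONE of these characters. */
    *len = strcspn(s, ";");

    return s;
}
```

Given "; image=a.bmp; sigma=1", the first call returns a pointer to "image=a.bmp; sigma=1" with *len set to 11. Calling it again on that pointer plus 11 yields "sigma=1" with *len set to 7. At the end of the string *len is 0.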
For parsing numbers, I like to use the idiom
Code:
len = -1;
(void)sscanf(string, "... %n", ... , &len);
if (len >= 0) {
    /* Parsed successfully till string+len */
}
This is a somewhat strange-looking pattern. I use it because the C standards have not been entirely clear on whether a successful "%n" is counted in the return value of sscanf(), and it may depend on the C library used, so I simply ignore the return value. All that matters is that if len becomes nonnegative, the complete pattern was parsed without issues, and you can just add len to the string pointer to skip over that part.
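Filled in for a single double, the idiom could be wrapped like this. It is a sketch only; parse_double() is a hypothetical name, and it uses the same " %lf %n" pattern as the parser above:

```c
#include <stdio.h>

/* Hypothetical helper: try to parse one double at the start of s.
 * On success, stores the value in *out and returns the number of
 * characters consumed (including surrounding whitespace).
 * On failure, returns -1 and leaves *out untouched. */
static int parse_double(const char *s, double *const out)
{
    double value;
    int    len = -1;

    /* The return value is deliberately ignored. If the number does
     * not parse, sscanf() stops before reaching %n, and len stays -1. */
    (void)sscanf(s, " %lf %n", &value, &len);
    if (len < 0)
        return -1;

    *out = value;
    return len;
}
```

For "4; sigma=1" this stores 4.0 and returns 1, so adding the result to the pointer leaves you at the semicolon. For a string that does not start with a number, such as "x=1", it returns -1.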
Finally, the latter half of the function body deals with unknown field names -- i.e. anything besides "image", "gaussian", or "sigma". In this case, I've made it easy to add new fields in the future. Older versions of the function will just ignore them, but warn, to standard error, so that the user knows something was ignored.
Questions?