Simple spell checker

03-18-2008
purplechirin

Quote:

Originally Posted by Elysia

Doesn't work that way. You want the size of the array, not the length of the string, so you can't rely on strlen here.

I declared the array as char words[i][j]; (i and j are arbitrary numbers for me to test files that contain only a few lines - here I put char words[10][10];)

I tried using sizeof(words), but it gave me a value of 100..
03-18-2008
Elysia

To get the size of an array, you can usually do:
sizeof(array) / sizeof(array[0]);
03-18-2008
vart

Quote:

To get the size of an array, you can usually do:
sizeof(array) / sizeof(array[0]);

But you have to do it in the function where the array is declared.
In any other function, you will need to pass the array size as a parameter.
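
A minimal sketch of both points, assuming the char words[10][10] declaration mentioned above. sizeof(words) is 100 because it counts every char in the 10 x 10 array; dividing by sizeof(words[0]) (one row, 10 bytes) gives the 10 rows. ARRAY_LEN and count_nonempty are made-up names for illustration only.

```c
#include <stddef.h>

/* Number of elements in a true array. Only meaningful in the scope
   where the array itself is declared, not on a pointer parameter. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Inside this function "words" is really a pointer to rows of 10 chars,
   so sizeof(words) would give the pointer size. The row count has to
   arrive as a separate parameter. */
size_t count_nonempty(char words[][10], size_t nrows)
{
    size_t n = 0;

    for (size_t i = 0; i < nrows; i++) {
        if (words[i][0] != '\0') {
            n++;
        }
    }
    return n;
}
```

In the function that declares the array, the call would look like count_nonempty(words, ARRAY_LEN(words)).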
03-18-2008
purplechirin

Thanks, I'll try to get the array size problem fixed.

But back to my other question, why would printf("%s, %s\n", token, words[i]); output "string1, string1", but when I use strcmp(token, words[i]) == 0 it won't give a true?
03-18-2008
matsp

Print each of the characters in words[i] as decimal or hex to see if there are any invisible characters that "get in the way".

--
Mats
03-18-2008
purplechirin

Quote:

Originally Posted by matsp

Print each of the characters in words[i] as decimal or hex to see if there are any invisible characters that "get in the way".

--
Mats

Do you mind explaining how that's supposed to be done? I'm really lost here. It's not typecasting, is it?
03-18-2008
zacs7
Well since this won't help you get any marks (if this is an assignment)

Code:

void print_string_as_hex(const char * s)
{
    size_t len = strlen(s), i = 0;

    for(i = 0; i < len; i++)
    {
        printf("%X", s[i]);
    }

    return;
}

I didn't test it, hope it works :)

Read it and understand it before using it.
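
One possible variation on the function above, offered as a sketch rather than a tested drop-in: it pads every byte to two hex digits and separates them with spaces, so a carriage return shows up as 0D instead of a lone D stuck to the previous byte. The cast to unsigned char is there because plain char may be signed, in which case bytes above 0x7F would print as FFFFFFxx. The name format_hex and the output buffer are assumptions made so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes each byte of s into out as two uppercase hex digits followed by
   a space. Stops early if out would overflow. Returns the number of
   characters written, not counting the terminating '\0'. */
size_t format_hex(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0) {
        return 0;
    }
    out[0] = '\0';

    /* Each byte needs 3 characters plus room for the terminator. */
    for (size_t i = 0; s[i] != '\0' && used + 4 <= outsize; i++) {
        used += (size_t)snprintf(out + used, outsize - used, "%02X ",
                                 (unsigned char)s[i]);
    }
    return used;
}
```

Calling format_hex(words[i], buf, sizeof buf) and then puts(buf) prints one word per line.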

In my opinion you're jumping onto the 'code train' too early, design it first -- for example:
- open the dictionary file
- read each line (word), trim the newline character if necessary -- adding the word to a linked list or array (sizing the array as necessary with realloc() :))
- close the file
- do whatever with the array or linked list of words
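
The steps above could be sketched roughly like this. It is only an illustration under a few assumptions: the caller has already opened the file, lines are shorter than 256 characters, and the array doubles in size whenever it fills up. load_words is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one word per line from fp into a growing array of strings.
   Stores the number of words in *count and returns the array; the caller
   frees each word and then the array. Returns NULL if nothing was read
   or if an allocation failed. */
char **load_words(FILE *fp, size_t *count)
{
    char line[256];              /* assumed maximum line length */
    char **words = NULL;
    size_t n = 0, cap = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';     /* trim the newline, if any */

        if (n == cap) {                       /* array full: grow it */
            size_t newcap = cap ? cap * 2 : 16;
            char **tmp = realloc(words, newcap * sizeof *tmp);
            if (tmp == NULL) {
                goto fail;
            }
            words = tmp;
            cap = newcap;
        }

        words[n] = malloc(strlen(line) + 1);
        if (words[n] == NULL) {
            goto fail;
        }
        strcpy(words[n], line);
        n++;
    }

    *count = n;
    return words;

fail:
    while (n > 0) {
        free(words[--n]);
    }
    free(words);
    *count = 0;
    return NULL;
}
```

Opening and closing the file stay with the caller: fopen, load_words, fclose, and then "do whatever" with the array.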
03-18-2008
vart

Quote:

would printf("%s, %s\n", token, words[i]); output "string1, string1",

try something like

Code:

printf("\"%s\", \"%s\"\n", token, words[i]);
03-19-2008
purplechirin
Quote:
Originally Posted by zacs7

Well since this won't help you get any marks (if this is an assignment)

Code:

void print_string_as_hex(const char * s)
{
    size_t len = strlen(s), i = 0;

    for(i = 0; i < len; i++)
    {
        printf("%X", s[i]);
    }

    return;
}

I didn't test it, hope it works :)

Read it and understand it before using it.

In my opinion you're jumping onto the 'code train' too early, design it first -- for example:

- open the dictionary file
- read each line (word), trim the newline character if necessary -- adding the word to a linked list or array (sizing the array as necessary with realloc() :))
- close the file
- do whatever with the array or linked list of words

It is an assignment, but I'm not looking for the solution - I just need help with particular parts that I seem to have problems with. :)

About the design, I do have a sketch of the flowchart before I started coding, and I'm currently at the "do whatever with the array or linked list of words" - which is where I'm now stuck.

Just so no one gets the impression that I'm expecting the entire solution here, here's what I've completed so far in the coding:
- check whether the command line arguments are valid; exit properly if not
- check if the provided dictionary filename exists
- check if the provided filename (of the file to be checked) exists
- open the dictionary
- read the dictionary words into an array
- close the dictionary file
- open the file to be checked
- read each word in the file
- compare the words to the 'dictionary' array <<-- problem here
- print out the words that are not found in the dictionary
- close the file
I really appreciate all the help I get from you guys here, but yeah, I'm trying to do my own homework. :D
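
For reference, the comparison step in the list above usually boils down to a loop like the following. This is a sketch only, assuming the char words[10][10] layout from earlier in the thread; in_dictionary is a made-up name, and nwords is meant to be the number of words actually read, not the capacity of the array.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if token exactly matches one of the first nwords entries,
   0 otherwise. strcmp compares every byte, so an invisible trailing
   character in either string makes the match fail. */
int in_dictionary(const char *token, char words[][10], size_t nwords)
{
    for (size_t i = 0; i < nwords; i++) {
        if (strcmp(token, words[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A word from the checked file would be reported as misspelled when in_dictionary(token, words, nwords) returns 0.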
03-19-2008
purplechirin

Quote:

Originally Posted by vart

try something like

Code:

printf("\"%s\", \"%s\"\n", token, words[i]);

ehh.. i tried that and it printed out "string1", "string1 (without the last double quote). where did it go?? is that the problem? :confused:
03-19-2008
vart

Quote:

Originally Posted by purplechirin

ehh.. i tried that and it printed out "string1", "string1 (without the last double quote). where did it go?? is that the problem? :confused:

I suppose it goes on the next line indicating that the second string contains \n at the end
03-19-2008
purplechirin

Quote:

Originally Posted by vart

I suppose it goes on the next line indicating that the second string contains \n at the end

The double quote somehow disappeared... there's no extra " on the second line.

edit: i changed it a little, and instead of a double quote i put different symbols:

Code:

printf("!%s@, #%s$\n", token, words[i]);

and instead of printing out !string1@, #string2$, it came out:

Code:

$string1@, #string1
$string1@, #string2
$string1@, #string3
...

:confused:
03-19-2008
matsp

That indicates that there is a '\r' [carriage return] at the end of the line - can you post:
1. Code that opens the "valid word list".
2. Code that reads the word.
3. Code that removes newline.

I suspect you are reading the file in binary mode, but it could be other things.

Using the "print in hex" variation will show you that it's got a 0D character at the end of the string, I suspect.

--
Mats
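
To spell out the reasoning: a '\r' sends the terminal cursor back to the start of the line, so the '$' printed after words[i] lands on top of the '!' that started the line, and strcmp sees "string1" and "string1\r" as different strings. A rough sketch for making such characters visible follows; make_visible is a hypothetical helper, not something from the posted code.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Copies s into out, replacing every non-printable byte with a \xNN
   escape so that a trailing carriage return can be seen. Stops early
   if out would overflow. */
void make_visible(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0) {
        return;
    }
    out[0] = '\0';

    /* An escape takes 4 characters plus room for the terminator. */
    for (; *s != '\0' && used + 5 <= outsize; s++) {
        unsigned char c = (unsigned char)*s;

        if (isprint(c)) {
            used += (size_t)snprintf(out + used, outsize - used, "%c", c);
        } else {
            used += (size_t)snprintf(out + used, outsize - used,
                                     "\\x%02X", c);
        }
    }
}
```

Running both token and words[i] through make_visible before printing them shows which of the two carries the extra character.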
03-19-2008
purplechirin

Quote:

Originally Posted by matsp

That indicates that there is a '\r' [carriage return] at the end of the line - can you post:
1. Code that opens the "valid word list".
2. Code that reads the word.
3. Code that removes newline.

I suspect you are reading the file in binary mode, but it could be other things.

Using the "print in hex" variation will show you that it's got a 0D character at the end of the string, I suspect.

--
Mats

While i do the printing in hex.. here are the codes:

Code that opens the 'valid word list':

Code:

FILE *dict;
dict = fopen(filename, "r");

Code that reads the word:

Code:

for (i=0; fgets(words[i], sizeof(words[i]), dict) != NULL; i++);

Code that removes newline:

Code:

/* remove the newline characters from each word */
for (i=0; i<sizeof(words)/sizeof(words[0]); i++)
{
    char *p = strchr(words[i],'\n');
    if(p)
    {
        *p = '\0';
    }
}

EDIT: I did the printing in hex for words[], and 0D (printed as just D) was the last character of each element in the array -

6C696E6531D <-- words[0]
6C696E6532D <-- words[1]
6C696E6533D <-- words[2]
...
6C696E653130D <-- words[9]

(in my 'dictionary' of 10 words)
03-19-2008
matsp

Strange. Are you by any chance using a word-list generated by a Windows / DOS program on a Linux/Unix machine? That would explain the newline/carriage return problem.

If so, you should use "dos2unix wordlist" to make sure the newlines are converted from CR+LF to LF only.

--
Mats
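
If converting the file isn't an option, another approach (an assumption on top of the thread, not something the posted code does) is to strip both characters while reading. strcspn returns the length of the leading part of the string that contains neither '\r' nor '\n', so writing a terminator at that index handles LF and CR+LF alike. chomp is a made-up name.

```c
#include <string.h>

/* Truncates line at the first '\r' or '\n', which removes both Unix (LF)
   and DOS/Windows (CR+LF) line endings. A line with no line ending is
   left unchanged. Returns line for convenience. */
char *chomp(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
    return line;
}
```

It would be called on each words[i] right after fgets fills it, in place of the strchr(words[i], '\n') loop.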
