-
tokenizing...
Hi, I'm trying to read an input file and tokenize each word in it. The user then types in a word, let's say "the", and the output is supposed to be the indexes of each occurrence of that word. I'm on a UNIX platform.
eg. "this is the input of the file"
output: the
10 23
this
1
.....etc etc.
Numbers, punctuation, and whitespace are ignored.
My problem is that the program seems to keep reading from the same input file. If I try another input file, the output still comes from the old one. E.g. the first input file has two "the"s and the second has none, yet the second run still gives me the indexes of the "the"s from the first file.
Also, after searching for the first word, the pointer pch is "stuck" on the last token. I know it has something to do with the strtok function I'm using, but I'm not sure exactly what the problem is.
I haven't gotten to the part where I print out the indexes yet. (How would you do that anyway? The way my code is written I can't even think of a way to keep track... HELP!)
Please help. :)
Please forgive if the code is too messy.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <iostream.h>
#include <fstream.h>
#include <ctype.h>
void growArray(char *array);
int main(int argc, char* argv[])
{
    const int MAX_SIZE = 1000;
    char ch;
    char str[MAX_SIZE];
    char userInput[MAX_SIZE];
    for(int i=0; i<MAX_SIZE; i++){
        str[i]='\0';
        userInput[i]='\0';
    }
    if(argc<1){
        cerr << "USAGE: ./indexer file1" << endl;
        exit(1);
    }
    int counter = 0;
    for (int i = 1; i<argc; i++) {
        ifstream istrm(argv[i]);
        while (istrm.get(ch)){
            if(counter>=MAX_SIZE)
                growArray(str);
            str[counter]=ch;
            counter++;
        }
    }
    //cin.getline(userInput, MAX_SIZE);
    char * pch;
    pch = strtok(str," \n\t\"\'\\,.");
    do{
        while (pch != NULL){
            if(isspace(*pch) || ispunct(*pch)){
                pch = strtok(NULL, " \n\t\"\'\\,.");
            }
            else{
                if(strcmp(pch, userInput) == 0)
                    cout << "Found one!" << endl;
                pch = strtok(NULL, " \n\t\"\'\\,.");
                //It seems that the pch pointer never goes back to the beginning of str...
            }
        }
        pch = strtok(str," \n\t\"\'\\,.");
    }while(cin.getline(userInput, MAX_SIZE));
    return 0;
}
/**********************************************************************/
/*THESE ARE MY FUNCTIONS */
/**********************************************************************/
void growArray(char *array)
{
    /*create a dynamic array that is bigger than passed in array; copies everything from passed in array to dynamic array; the effect is that the passed in array "grew"*/
    int size = sizeof array / sizeof array[0];
    char *pTemp = new char[size + 20];
    //transfer all of the object in the array to the objects in the temporary array
    for(int nCount = 0; nCount < size; nCount++)
        pTemp[nCount] = array[nCount];
    array = pTemp;
}
-
I haven't looked at all of it, but there are at least some errors in the growArray function.
When you use sizeof on a char* it returns the size of the pointer (4 on a 32-bit PC, 8 on a 64-bit one). It does not return the size of the array! You must pass in the number of elements.
Also, your function creates a new array, but the caller cannot access it, because "array = pTemp" only changes the function's local copy of the pointer. You would have to return the new array (and delete the old one, except that won't work here because str is an automatic array in your case...).
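To make those two points concrete, here is a minimal sketch. It assumes the buffer is allocated with new[] from the start rather than being an automatic array, and that the caller keeps track of the capacity itself. growBuffer and pointerSize are made-up names for illustration, not anything from the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical replacement for growArray. The caller passes the current
// capacity explicitly, because sizeof applied to a char* only gives the
// size of the pointer. Assumes oldBuf was allocated with new char[...].
char* growBuffer(char* oldBuf, std::size_t oldCapacity, std::size_t extra)
{
    assert(oldBuf != nullptr);
    // The trailing () zero-initialises the new buffer.
    char* newBuf = new char[oldCapacity + extra]();
    std::memcpy(newBuf, oldBuf, oldCapacity);
    delete[] oldBuf;   // only valid because oldBuf came from new[]
    return newBuf;     // the caller must store this pointer
}

// Illustration only: inside a function, the array argument has already
// decayed to a pointer, so sizeof reports the pointer size.
std::size_t pointerSize(char* p)
{
    return sizeof p;
}
```

In main this would be used along the lines of "str = growBuffer(str, capacity, 20); capacity += 20;", with str declared as a char* and capacity held in an ordinary variable.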
Also, why do you loop over argc? Why not just use argv[1]?
I think there are quite a few other problems (around strtok...). Correct the previous ones, then let us know!
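On the strtok side, one thing that is certain is that strtok overwrites every delimiter it finds with '\0'. After one full pass, a second strtok(str, ...) therefore only sees the first token, which would explain why pch looks stuck. Below is a rough sketch of one way around it: tokenize a fresh copy for each search, and get the index from the distance between the token pointer and the start of the copy. findWord is a hypothetical name, and the 1-based character numbering is an assumption that may not match the sample output in the question.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based character position of every
// token in text that is equal to word. strtok runs on a private copy,
// so the original text stays intact and can be searched again.
std::vector<std::size_t> findWord(const std::string& text,
                                  const std::string& word)
{
    static const char delims[] = " \n\t\"'\\,.";  // same set as the post
    std::vector<std::size_t> positions;

    // Writable, NUL-terminated copy for strtok to chop up.
    std::vector<char> copy(text.begin(), text.end());
    copy.push_back('\0');

    for (char* tok = std::strtok(copy.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        if (word == tok) {
            // Offset of the token from the start of the copy, plus one.
            positions.push_back(
                static_cast<std::size_t>(tok - copy.data()) + 1);
        }
    }
    return positions;
}
```

Calling findWord once per word typed by the user avoids having to "rewind" pch at all, at the cost of copying the text on each search.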