# Thread: Logic help with strings and arrays.

1. ## Logic help with strings and arrays.

Hi everyone.

I'm trying to write a program that uses scanf to take in any chunk of text, and I want to perform a word frequency count. From what I understand, I should copy my string into an array. The problem is: what if I write a sentence, or a whole paragraph? How is my array meant to adjust? If I set myself a limit, say I know the MAXIMUM I'll have is 50 unique words, the approach I have in mind goes like this.

Have two separate arrays. The first stores a word only if it's new. To do that I need the second array, which gets the word copied in at the same time; if the word is a repeat, the second array increments the word's frequency and the word doesn't get copied into the first array.
The thing is, my second array would then be storing not only the word but the frequency as well, which is exactly what I want. But how can I store and display both characters and integers in one array?!

Furthermore, I don't see how I can copy the string across a word at a time unless it's already IN an array.

P.S. I've looked into 2D arrays, but how will that help?

2. One thing I want to tell you: if you read the data using scanf with %s, it will not take anything after a space.
For example, given "my text" it will take only "my", so you have to call it again to get the next word.
The second thing: simply use char *strArray;
Once you allocate memory for it (for example with malloc), strArray can hold a string of whatever length you need.
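To illustrate the point about scanf stopping at whitespace, here is a minimal sketch. The helper name `count_tokens` and the 100-byte buffer are assumptions made for the example; it uses fscanf so it can read from any stream, and calling it on stdin behaves like scanf.

```c
#include <stdio.h>

/* Hypothetical helper: reads whitespace-separated words from a stream
   and returns how many it saw. Each fscanf call consumes exactly one
   word, which is why a loop is needed to get past the first space. */
int count_tokens(FILE *in)
{
    char word[100];   /* assumed maximum word length of 99 characters */
    int n = 0;

    /* %99s limits the read to the buffer size; the loop ends at EOF */
    while (fscanf(in, "%99s", word) == 1)
        ++n;

    return n;
}
```

Calling `count_tokens(stdin)` on the input "my text" would go round the loop twice, once for "my" and once for "text".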

3. First, let me say that this information is going to be stored in a simple array. If you were talking
about a large amount of data, a binary tree (tsearch) would offer much better performance at
the cost of a little more complexity.
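For reference, the tree-based alternative might look roughly like the sketch below. It assumes a POSIX system that provides tsearch and tfind in <search.h>; the names `tree_entry`, `compare_entries`, and `tree_count_word` are made up for the example and are not used in the rest of this post.

```c
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record stored in the tree: a word and its count */
struct tree_entry {
    char *word;
    int count;
};

/* Orders two entries alphabetically by their word */
static int compare_entries(const void *lhs, const void *rhs)
{
    const struct tree_entry *a = lhs;
    const struct tree_entry *b = rhs;
    return strcmp(a->word, b->word);
}

/* Counts one occurrence of word in the tree rooted at *root.
   Returns the updated count, or -1 if memory runs out. */
int tree_count_word(void **root, const char *word)
{
    struct tree_entry key = { (char *)word, 0 };   /* lookup key only */
    void *node = tfind(&key, root, compare_entries);
    struct tree_entry *fresh;

    if (node != NULL) {
        /* A node starts with a pointer to the stored entry */
        struct tree_entry *found = *(struct tree_entry **)node;
        return ++found->count;
    }

    fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    fresh->word = malloc(strlen(word) + 1);
    if (fresh->word == NULL) {
        free(fresh);
        return -1;
    }
    strcpy(fresh->word, word);
    fresh->count = 1;

    if (tsearch(fresh, root, compare_entries) == NULL) {
        free(fresh->word);
        free(fresh);
        return -1;
    }
    return 1;
}
```

The tree root starts out as `void *root = NULL;` and is passed as `&root`. Lookups then cost roughly the depth of the tree rather than a scan over every stored word.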

What you need to do is to create a structure that contains both a character pointer (for the word)
and an integer (for its count), like so:

typedef struct {
char *token;
int ct;
} array_element;
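To show how one array can hold both the text and the number, here is a minimal sketch that uses a fixed-size table instead of dynamic allocation. It assumes the 50-word limit from the question; `MAX_WORD_LEN`, `word_slot`, `tally_word`, and `print_table` are names invented for this example.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 50      /* limit taken from the original question */
#define MAX_WORD_LEN 32   /* assumed longest word, including the '\0' */

/* One array element holds the characters and the integer together */
struct word_slot {
    char text[MAX_WORD_LEN];
    int count;
};

/* Adds word to the table or bumps its count if it is already there.
   Returns the slot index, or -1 if the table is full or the word is
   too long to fit. */
int tally_word(struct word_slot table[], int *used, const char *word)
{
    int i;

    for (i = 0; i < *used; ++i) {
        if (strcmp(table[i].text, word) == 0) {
            ++table[i].count;
            return i;
        }
    }

    if (*used == MAX_WORDS || strlen(word) >= MAX_WORD_LEN)
        return -1;

    strcpy(table[*used].text, word);
    table[*used].count = 1;
    return (*used)++;
}

/* Displays both members of every used slot */
void print_table(const struct word_slot table[], int used)
{
    int i;

    for (i = 0; i < used; ++i)
        printf("%s appeared %d times\n", table[i].text, table[i].count);
}
```

A caller would declare `struct word_slot table[MAX_WORDS];` and `int used = 0;`, then pass each scanned word to `tally_word`. This trades flexibility for simplicity; the dynamic version avoids both fixed limits.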

The first thing you have to understand is that in any situation where the amount of data is
unknown, you are going to have to allocate it as needed. So, let's start to build the app.
I'm going to assume that this program simply scans tokens (words) off the standard input. So:
Code:
```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *word;
    int word_ct;
} ARRAY_ELEMENT;

#define TRUE 1
#define FALSE 0

/* Notice how this typedef occurs outside of main. This is to allow it to be visible to all the functions */

int fcheck(ARRAY_ELEMENT *a, int a_ct, const char *token); /* Defined after main */

int main(void)
{
    ARRAY_ELEMENT *a;   /* The initial pointer to the array of ARRAY_ELEMENTS */
    ARRAY_ELEMENT *tmp; /* Holds the result of realloc until we know it succeeded */
    int a_ct;           /* A counter to keep track of how many elements are in the array */
    char token[1024];   /* This is the buffer that receives the tokens scanned off standard input */
    int ct;

    /* Initialize the array */
    a = calloc(1, sizeof(ARRAY_ELEMENT));
    if (a == NULL)
        return 1;
    a_ct = -1; /* I always start my arrays with -1 - it means that the array has no elements */

    /* The input loop - the width in %1023s keeps a long token from overflowing the buffer;
       anything longer is read as several tokens */
    while (scanf("%1023s", token) == 1) {

        /* The problem is simple.
           If the token is not in the array add it, otherwise increment the counter */

        /* Step one - check to see if the word is already in the array. I am going to do this with a function - fcheck()
           This function has three arguments: the ARRAY_ELEMENT base (a), the array element count (a_ct), and the scanned in token */

        /* If the word is not in the array, we have to add it.
           This requires allocating memory in two ways.
           First we have to increase the size of the array by one to accommodate the new word.
           Second we have to allocate space for the word itself. */

        if (fcheck(a, a_ct, token) == FALSE) {

            /* This increases the memory allocated or pointed to by "a" by one element */
            tmp = realloc(a, (a_ct + 2) * sizeof(ARRAY_ELEMENT));
            if (tmp == NULL)
                break; /* Out of memory: stop reading and report what we have */
            a = tmp;

            /* This allocates memory for the actual word.
               Notice how the size is the length of the token plus 1.
               This is to allow for the terminating '\0' of the string. */
            a[a_ct + 1].word = calloc(strlen(token) + 1, sizeof(char));
            if (a[a_ct + 1].word == NULL)
                break;

            ++a_ct;                      /* Increase the number of array elements */
            strcpy(a[a_ct].word, token); /* Copy the token into the space we just allocated */
            a[a_ct].word_ct = 1;         /* This word has been seen once at this point */
        }
    }

    /* At this point you have your completed array. To view it you could do */

    for (ct = 0; ct <= a_ct; ++ct)
        printf("%s appeared %d times\n", a[ct].word, a[ct].word_ct);

    /* Give back everything we allocated */
    for (ct = 0; ct <= a_ct; ++ct)
        free(a[ct].word);
    free(a);

    return 0;
}

int fcheck(ARRAY_ELEMENT *a, int a_ct, const char *token)
{
    int ct;

    /*
       Loop through the array as it currently exists.
       Notice that by initializing my array count to -1, the first time this is called
       it will return FALSE because the loop never executes.
       If the token already exists in the array the function increments the counter and returns TRUE
    */

    for (ct = 0; ct <= a_ct; ++ct) {
        if (strcmp(token, a[ct].word) == 0) {
            ++a[ct].word_ct;
            return TRUE;
        }
    }

    return FALSE;
}
```
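
The program above grows the array by one element for every new word. A common variation on the same allocate-as-needed idea is to double the capacity whenever the array fills up, so realloc is called far less often. The sketch below is only one way to do it; `word_entry`, `word_list`, `word_list_add`, and `word_list_free` are hypothetical names, and the starting capacity of 2 is an arbitrary choice.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: one counted word, and a growable list of them */
typedef struct {
    char *word;
    int count;
} word_entry;

typedef struct {
    word_entry *items;
    size_t used;      /* elements currently stored */
    size_t capacity;  /* elements allocated */
} word_list;

/* Counts one occurrence of word. Returns 1 on success, 0 if memory runs out. */
int word_list_add(word_list *list, const char *word)
{
    size_t i;
    char *copy;

    for (i = 0; i < list->used; ++i) {
        if (strcmp(list->items[i].word, word) == 0) {
            ++list->items[i].count;
            return 1;
        }
    }

    if (list->used == list->capacity) {
        /* Double the capacity (or start at 2) instead of adding one slot */
        size_t new_capacity = list->capacity ? list->capacity * 2 : 2;
        word_entry *grown = realloc(list->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return 0;   /* the old block is still valid and still owned by list */
        list->items = grown;
        list->capacity = new_capacity;
    }

    copy = malloc(strlen(word) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, word);

    list->items[list->used].word = copy;
    list->items[list->used].count = 1;
    ++list->used;
    return 1;
}

/* Releases every word and the array itself */
void word_list_free(word_list *list)
{
    size_t i;

    for (i = 0; i < list->used; ++i)
        free(list->items[i].word);
    free(list->items);
    list->items = NULL;
    list->used = list->capacity = 0;
}
```

A list starts out as `word_list list = { NULL, 0, 0 };`. Keeping the realloc result in a temporary pointer means a failed allocation does not lose the existing array.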