Thread: I don't know how to count all matching words

  1. #1
    Registered User
    Join Date
    Apr 2011
    Posts
    308

    I don't know how to count all matching words

    I have a bunch of strings, each in its own array element.

    I put each array element into its own new array; the new array holds each word of the string.

    I take the array holding the individual words and compare it to the array of strings with multiple words.

    I want to count the number of times each word appears in the text file, in such a way that I can identify which array element (a string with multiple words) each match was counted in.
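
    The layout described above (one array element per paragraph, each split into its own array of words) could be built roughly like this. It is only a sketch: ParagraphSplitter and SplitIntoParagraphWords are hypothetical names, and it assumes paragraphs are separated by blank lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParagraphSplitter
{
    // Splits text into paragraphs on blank lines, then splits each paragraph into words.
    // result[i] holds the words of paragraph i, so the index identifies the paragraph.
    public static List<string[]> SplitIntoParagraphWords(string text)
    {
        var result = new List<string[]>();
        // A blank line is a newline, optional whitespace, then another newline.
        string[] paragraphs = Regex.Split(text, @"\r?\n\s*\n");
        char[] whitespace = { ' ', '\t', '\r', '\n' };
        foreach (string paragraph in paragraphs)
        {
            string[] words = paragraph.Split(whitespace, StringSplitOptions.RemoveEmptyEntries);
            if (words.Length > 0) // skip empty chunks, e.g. from leading blank lines
            {
                result.Add(words);
            }
        }
        return result;
    }
}
```

    Because the outer list keeps one entry per paragraph, any count made while looping over result[i] can be recorded against index i.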

    Here is the text file, article.txt:
    Code:
    The Lion and the Mouse
    
    Once when a lion, the king of the jungle, was asleep, a little mouse began running up and down on him. 
    This soon awakened the lion, who placed his huge paw on the mouse, and opened his big jaws to swallow him.
    
    "Pardon, O King!" cried the little mouse. 
    "Forgive me this time. 
    I shall never repeat it and I shall never forget your kindness. 
    And who knows, I may be able to do you a good turn one of these days!”
    
    The lion was so tickled by the idea of the mouse being able to help him that he lifted his paw and let him go.
    
    Sometime later, a few hunters captured the lion, and tied him to a tree. 
    After that they went in search of a wagon, to take him to the zoo.
    
    Just then the little mouse happened to pass by. 
    On seeing the lion’s plight, he ran up to him and gnawed away the ropes that bound him, the king of the jungle.
    
    "Was I not right?" said the little mouse, very happy to help the lion.
    
    MORAL: Small acts of kindness will be rewarded greatly.
    Here is my code so far:

    Code:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;
    
    namespace project
    {
        class Program
        {
            public void article_2_step_2(string[] sentence_words, int counter)
            {
                string record_sentence = "";
                int counter_2 = 0;
    
                StreamReader article = new StreamReader("word_list.txt");
    
                while ((record_sentence = article.ReadLine()) != null)
                {
                    if (record_sentence == "") break;
                    sentence_words[counter_2] = record_sentence;
                    counter_2++;
                }
                article.Close();
            }
    
            public void article_2_step_1(ref int counter)
            {
                string record_sentence = "";
    
                StreamReader article = new StreamReader("word_list.txt");
    
                while ((record_sentence = article.ReadLine()) != null)
                {
                    if (record_sentence == "") break;
    
                    counter++;
                }
                article.Close();
            }
    
            public void write_3(string some_word, string new_number, int counter)
            {
                int article_counter = 0;
                article_2_step_1(ref article_counter);
                string[] article_array = new string[article_counter];
                article_2_step_2(article_array, article_counter); // the lines in word_list.txt
    
                File.WriteAllText("word_list.txt", String.Empty);
    
                StreamWriter solution_1 = new StreamWriter("word_list.txt", true);
    
                for (int i = 0; i < article_counter; ++i)
                {
                    if(i == counter)
                    {
                        solution_1.WriteLine(some_word + " " + new_number);
                    }
                    else
                    {
                        solution_1.WriteLine(article_array[i]);
                    }
                }
    
                solution_1.Close();
            }
    
            public void compare_3(string some_word)
            {
                string record_sentence = "";
                string Text_3 = "";
                int counter = 0;
                StreamReader record = new StreamReader("word_list.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == null) continue;
                    if (record_sentence == "") continue;
                    string Text = record_sentence.Substring(0, record_sentence.IndexOf(" "));
                    if (Text == some_word)
                    {
                        string Text_2 = record_sentence.Split(' ')[1];
                        int new_value = Convert.ToInt32(Text_2);
                        new_value++;
                        Text_3 = Convert.ToString(new_value);
    
                        break;
                    }
                    counter++;
                }
                record.Close();
                write_3(some_word, Text_3, counter);
            }
    
            public void compare_2(string some_word, ref int found)
            {
                string record_sentence = "";
                
                StreamReader record = new StreamReader("word_list.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == null) continue;
                    if (record_sentence == "") continue;
                    string Text = record_sentence.Substring(0, record_sentence.IndexOf(" "));
                    if (Text == some_word) found = 1;
                }
                record.Close();
            }
            public void write_2(string some_word)
            {
                int found = 0;
    
                compare_2(some_word, ref found);
    
                if(found == 0)
                {
                    StreamWriter solution_1 = new StreamWriter("word_list.txt", true);
    
                    solution_1.WriteLine(some_word + " 1");
    
                    solution_1.Close();
                }
                if(found == 1)
                {
                    compare_3(some_word);
                }
            }
    
            public void write_1(int counter)
            {
                StreamWriter solution_1 = new StreamWriter("paragraph_match_count.txt", true);
    
                solution_1.WriteLine(counter);
    
                solution_1.Close();
            }
    
            public void compare(string [] long_line, int counter)
            {
                int counter_2 = 0;
                string temp = "";
                StringBuilder s = new StringBuilder();
    
                for (int i = 0; i < counter; i++) // go through all paragraphs
                {
                    counter_2 = 0;
                    string[] compare_1 = long_line[i].Split(' '); // putting each sentence word into an array element
    
                    for (int j = 0; j < compare_1.Length; j++) // go through entire paragraph i
                    {
                        for (int k = 0; k < counter; k++) // go through all paragraphs
                        {
                            if (i == k) break;
    
                            s.Append("\\b");
                            s.Append(compare_1[j]);
                            s.Append("\\b");
                            temp = s.ToString();
                            s.Clear();
                            bool result = Regex.IsMatch(long_line[k], temp);
                            if (result)
                            {
                                //Console.WriteLine(long_line[i] + "\r\n:\r\n");
                                //Console.WriteLine(long_line[j] + "\r\n:\r\n");
                                //Console.WriteLine(compare_1[k] + "\r\n/////\r\n");
                                write_2(compare_1[j]);
                                counter_2++;
                            }
                        }
    
    
                        //string[] compare_1 = long_line[i].Split(' '); // putting each sentence word into an array element
    
                        //for (int k = 0; k < compare_1.Length; k++) // go through entire paragraph i
                        //{
                        //    if (long_line[i].Contains(compare_1[k]))
                        //    {
                        //        //Console.WriteLine(long_line[i] + "\r\n:\r\n");
                        //        //Console.WriteLine(long_line[j] + "\r\n:\r\n");
                        //        //Console.WriteLine(compare_1[k] + "\r\n/////\r\n");
                        //        write_2(compare_1[k]);
                        //        counter_2++;
                        //    }
                        //}
                    }
                    write_1(counter_2);
                }
            }
    
            static public string RemoveDuplicateWords(string v)
            {
                // 1
                // Keep track of words found in this Dictionary.
                var d = new Dictionary<string, bool>();
    
                // 2
                // Build up string into this StringBuilder.
                StringBuilder b = new StringBuilder();
    
                // 3
                // Split the input and handle spaces and punctuation.
                string[] a = v.Split(new char[] { ' ', ',', ';', '.' },
                    StringSplitOptions.RemoveEmptyEntries);
    
                // 4
                // Loop over each word
                foreach (string current in a)
                {
                    // 5
                    // Lowercase each word
                    string lower = current.ToLower();
    
                    // 6
                    // If we haven't already encountered the word,
                    // append it to the result.
                    if (!d.ContainsKey(lower))
                    {
                        b.Append(current).Append(' ');
                        d.Add(lower, true);
                    }
                }
                // 7
                // Return the duplicate words removed
                return b.ToString().Trim();
            }
    
            public void remove_Unwanted_chars(ref string text)
            {
                //Console.WriteLine("part_10 : step_3");
    
                // remove these chars
    
                // numbers
                text = text.Replace("0", "");
                text = text.Replace("1", "");
                text = text.Replace("2", "");
                text = text.Replace("3", "");
                text = text.Replace("4", "");
                text = text.Replace("5", "");
                text = text.Replace("6", "");
                text = text.Replace("7", "");
                text = text.Replace("8", "");
                text = text.Replace("9", "");
                // brackets
                text = text.Replace("(", "");
                text = text.Replace(")", "");
    
                text = text.Replace("{", "");
                text = text.Replace("}", "");
    
                text = text.Replace("[", "");
                text = text.Replace("]", "");
    
                text = text.Replace("<", "");
                text = text.Replace(">", "");
    
                // slash and pipe
    
                text = text.Replace("/", "");
                text = text.Replace("\\", "");
                text = text.Replace("|", "");
    
                // commas quotes periods 
    
                text = text.Replace("\"", "");
                text = text.Replace("\'", "");
    
                text = text.Replace(".", "");
                text = text.Replace(",", "");
    
                text = text.Replace("!", "");
                text = text.Replace("?", "");
    
                text = text.Replace(";", "");
                text = text.Replace(":", "");
    
                text = text.Replace(",", "");
                text = text.Replace("’", "");
    
                text = text.Replace("“", "");
                text = text.Replace("”", "");
    
                text = text.Replace("`", "");
    
                // symbols
    
                text = text.Replace("~", "");
                text = text.Replace("!", "");
                text = text.Replace("@", "");
                text = text.Replace("#", "");
                text = text.Replace("$", "");
                text = text.Replace("%", "");
                text = text.Replace("^", "");
                text = text.Replace("&", "");
                text = text.Replace("*", "");
                text = text.Replace("-", "");
                text = text.Replace("_", "");
                text = text.Replace("+", "");
                text = text.Replace("=", "");
    
                // remove double space
    
                while (text.IndexOf("  ") != -1)
                {
                    text = text.Replace("  ", " ");
                }
            }
    
            public void make_long_line(string[] long_line)
            {
                string record_sentence = "";
                string temp = "";
                string lower = "";
                int counter = 0;
    
                StringBuilder s = new StringBuilder();
                StreamReader record = new StreamReader("article.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == "")
                    {
                        temp = s.ToString();
                        remove_Unwanted_chars(ref temp);
                        lower = temp.ToLower();
                        lower = RemoveDuplicateWords(lower);
                        long_line[counter] = lower;
                        //Console.WriteLine(long_line[counter] + "\r\n");
    
                        s.Clear();
                        counter++;
                    }
                    else
                    {
                        s.Append(record_sentence);
                        s.Append(" ");
                    }
                }
                record.Close();
    
                temp = s.ToString();
                remove_Unwanted_chars(ref temp);
                lower = temp.ToLower();
                lower = RemoveDuplicateWords(lower);
                long_line[counter] = lower;
                //Console.WriteLine(long_line[counter] + "\r\n");
    
                s.Clear();
            }
    
            public void count_paragraphs(ref int counter)
            {
                string record_sentence = "";
                StreamReader record = new StreamReader("article.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == "")
                    {
                        counter++;
                    }
                }
                record.Close();
    
                counter++;
            }
    
            static void Main(string[] args)
            {
                File.WriteAllText("paragraph_match_count.txt", String.Empty);
                File.WriteAllText("word_list.txt", String.Empty);
    
                Program paragraphs = new Program();
    
                int counter = 0;
                paragraphs.count_paragraphs(ref counter);
                string[] long_line = new string[counter];
    
                paragraphs.make_long_line(long_line);
    
                paragraphs.compare(long_line, counter);
            }
        }
    }
    The problem function is "public void compare(string [] long_line, int counter)". I'm lost as to how to get the proper count of the word "the": right now I count it 21 times, but it is in the text file 22 times.
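
    One possible way to get a count per paragraph, offered as a minimal sketch rather than a drop-in fix: count whole-word matches with Regex.Matches instead of testing Regex.IsMatch, which only reports whether there is at least one match. The names WholeWordCounter and CountPerParagraph are made up for illustration. Note that with \b word boundaries, words such as "then", "they" and "these" are not counted as "the", so the total can differ from a plain substring search.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordCounter
{
    // Counts whole-word, case-insensitive occurrences of `word` in each paragraph.
    // Returns one count per paragraph, so the index shows where the matches were found.
    public static int[] CountPerParagraph(string[] paragraphs, string word)
    {
        // Regex.Escape guards against words that contain regex metacharacters.
        var pattern = new Regex(@"\b" + Regex.Escape(word) + @"\b", RegexOptions.IgnoreCase);
        var counts = new int[paragraphs.Length];
        for (int i = 0; i < paragraphs.Length; i++)
        {
            // Matches returns every occurrence; IsMatch would only say "at least one".
            counts[i] = pattern.Matches(paragraphs[i]).Count;
        }
        return counts;
    }
}
```

    The returned array has one entry per paragraph, so counts[i] is the number of matches found in paragraphs[i], and summing the array gives the total for the whole file.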

  2. #2
    Registered User
    Join Date
    Apr 2011
    Posts
    308
    I found out what I was doing wrong. I shortened each string by removing duplicate words, but then I counted against the original, longer version, so the totals did not match what should have been counted.

    Working with the shortened version, I debugged with the Console.WriteLine trick to find out what was going on, and I fixed my code, so I don't need your help anymore.

    Here is the shortened version of the article:
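
    For comparison, if the goal is the number of times each word occurs rather than whether it occurs, the counting has to happen on text that has not been de-duplicated. A rough sketch of that idea using an in-memory Dictionary instead of rewriting word_list.txt follows; WordFrequency and Count are hypothetical names, and the separator list is an assumption loosely based on the one in RemoveDuplicateWords.

```csharp
using System;
using System.Collections.Generic;

public static class WordFrequency
{
    // Counts every occurrence of each word in `text`, ignoring case.
    // The text is NOT de-duplicated first, so repeated words are all counted.
    public static Dictionary<string, int> Count(string text)
    {
        var counts = new Dictionary<string, int>();
        // Assumed separator set; extend it for other punctuation as needed.
        char[] separators = { ' ', ',', ';', '.', '!', '?', ':', '"' };
        foreach (string raw in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string word = raw.ToLowerInvariant();
            counts.TryGetValue(word, out int current); // current is 0 when the word is new
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

    Calling WordFrequency.Count once per paragraph and keeping the resulting dictionaries in an array would tie each count to the paragraph it came from.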

    Code:
    the lion and mouse
    once when a lion the king of jungle was asleep little mouse began running up and down on him this soon awakened who placed his huge paw opened big jaws to swallow
    pardon o king cried the little mouse forgive me this time i shall never repeat it and forget your kindness who knows may be able to do you a good turn one of these days
    the lion was so tickled by idea of mouse being able to help him that he lifted his paw and let go
    sometime later a few hunters captured the lion and tied him to tree after that they went in search of wagon take zoo
    just then the little mouse happened to pass by on seeing lions plight he ran up him and gnawed away ropes that bound king of jungle
    was i not right said the little mouse very happy to help lion
    moral small acts of kindness will be rewarded greatly
    Here are my results now:

    Code:
    the 7
    lion 6
    and 6
    mouse 6
    once 1
    when 1
    a 8
    pardon 1
    o 8
    king 3
    cried 1
    sometime 1
    later 1
    just 1
    then 1
    was 3
    i 8
    not 1
    right 1
    said 1
    moral 1
    small 1
    acts 1
    of 6
    kindness 2
    will 1
    be 4
    rewarded 1
    greatly 1
    And here is my working code:

    Code:
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;
    
    namespace project
    {
        class Program
        {
            public void write_4(string shortened_version)
            {
                StreamWriter solution_1 = new StreamWriter("shortened_article.txt", true);
    
                solution_1.WriteLine(shortened_version);
    
                solution_1.Close();
            }
    
            public void article_2_step_2(string[] sentence_words, int counter)
            {
                string record_sentence = "";
                int counter_2 = 0;
    
                StreamReader article = new StreamReader("word_list.txt");
    
                while ((record_sentence = article.ReadLine()) != null)
                {
                    if (record_sentence == "") break;
                    sentence_words[counter_2] = record_sentence;
                    counter_2++;
                }
                article.Close();
            }
    
            public void article_2_step_1(ref int counter)
            {
                string record_sentence = "";
    
                StreamReader article = new StreamReader("word_list.txt");
    
                while ((record_sentence = article.ReadLine()) != null)
                {
                    if (record_sentence == "") break;
    
                    counter++;
                }
                article.Close();
            }
    
            public void write_3(string some_word, string new_number, int counter)
            {
                int article_counter = 0;
                article_2_step_1(ref article_counter);
                string[] article_array = new string[article_counter];
                article_2_step_2(article_array, article_counter); // the lines in word_list.txt
    
                File.WriteAllText("word_list.txt", String.Empty);
    
                StreamWriter solution_1 = new StreamWriter("word_list.txt", true);
    
                for (int i = 0; i < article_counter; ++i)
                {
                    if(i == counter)
                    {
                        solution_1.WriteLine(some_word + " " + new_number);
                    }
                    else
                    {
                        solution_1.WriteLine(article_array[i]);
                    }
                }
    
                solution_1.Close();
            }
    
            public void compare_3(string some_word)
            {
                string record_sentence = "";
                string Text_3 = "";
                int counter = 0;
                StreamReader record = new StreamReader("word_list.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == null) continue;
                    if (record_sentence == "") continue;
                    string Text = record_sentence.Split(' ')[0];
                    if (Text == some_word)
                    {
                        string Text_2 = record_sentence.Split(' ')[1];
                        int new_value = Convert.ToInt32(Text_2);
                        new_value++;
                        Text_3 = Convert.ToString(new_value);
    
                        break;
                    }
                    counter++;
                }
                record.Close();
                write_3(some_word, Text_3, counter);
            }
    
            public void compare_2(string some_word, ref int found)
            {
                string record_sentence = "";
                
                StreamReader record = new StreamReader("word_list.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == null) continue;
                    if (record_sentence == "") continue;
    
                    string Text = record_sentence.Split(' ')[0];
                    if (Text == some_word) found = 1;
                }
                record.Close();
            }
            public void write_2(string some_word)
            {
                int found = 0;
    
                compare_2(some_word, ref found);
    
                if(found == 0)
                {
                    StreamWriter solution_1 = new StreamWriter("word_list.txt", true);
    
                    solution_1.WriteLine(some_word + " 1");
    
                    solution_1.Close();
                }
                if(found == 1)
                {
                    compare_3(some_word);
                }
            }
    
            public void write_1(int counter)
            {
                StreamWriter solution_1 = new StreamWriter("paragraph_match_count.txt", true);
    
                solution_1.WriteLine(counter);
    
                solution_1.Close();
            }
    
            public void compare(string [] long_line, int counter)
            {
                int counter_2 = 0;
                int counter_3 = 0;
                int found = 0;
                string temp = "";
                StringBuilder s = new StringBuilder();
    
                for (int i = 0; i < counter; i++) // go through all paragraphs
                {
                    counter_2 = 0;
                    counter_3 = 0;
                    found = 0;
    
                    string[] compare_1 = long_line[i].Split(' '); // putting each sentence word into an array element
    
                    for (int j = 0; j < compare_1.Length; j++) // go through entire paragraph i
                    {
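                        // found is reset only once per paragraph and compare_2 never sets it back to 0,
                        // so once one word is already in word_list.txt the rest of this paragraph's words are skipped
                        compare_2(compare_1[counter_3], ref found); // each word is only counted through the file once
    
                        if(found == 0)
                        {
                            for (int k = 0; k < counter; k++) // go through all paragraphs
                            {
                                // this pattern has no \b word boundaries, so a word also matches inside longer words (e.g. "a" inside "and")
                                bool result = Regex.IsMatch(long_line[k], compare_1[counter_3]); // compare words in compare_1 array to each paragraph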
                                if (result)
                                {
                                    write_2(compare_1[counter_3]); // increment the number of times the word appears in the article
                                    counter_2++; // increment the number of matches each line in the article has
                                }
    
                            }
                        }
                       
                        counter_3++;
                    }
                    write_1(counter_2);
                }
            }
    
            static public string RemoveDuplicateWords(string v)
            {
                // 1
                // Keep track of words found in this Dictionary.
                var d = new Dictionary<string, bool>();
    
                // 2
                // Build up string into this StringBuilder.
                StringBuilder b = new StringBuilder();
    
                // 3
                // Split the input and handle spaces and punctuation.
                string[] a = v.Split(new char[] { ' ', ',', ';', '.' },
                    StringSplitOptions.RemoveEmptyEntries);
    
                // 4
                // Loop over each word
                foreach (string current in a)
                {
                    // 5
                    // Lowercase each word
                    string lower = current.ToLower();
    
                    // 6
                    // If we haven't already encountered the word,
                    // append it to the result.
                    if (!d.ContainsKey(lower))
                    {
                        b.Append(current).Append(' ');
                        d.Add(lower, true);
                    }
                }
                // 7
                // Return the duplicate words removed
                return b.ToString().Trim();
            }
    
            public void remove_Unwanted_chars(ref string text)
            {
                //Console.WriteLine("part_10 : step_3");
    
                // remove these chars
    
                // numbers
                text = text.Replace("0", "");
                text = text.Replace("1", "");
                text = text.Replace("2", "");
                text = text.Replace("3", "");
                text = text.Replace("4", "");
                text = text.Replace("5", "");
                text = text.Replace("6", "");
                text = text.Replace("7", "");
                text = text.Replace("8", "");
                text = text.Replace("9", "");
                // brackets
                text = text.Replace("(", "");
                text = text.Replace(")", "");
    
                text = text.Replace("{", "");
                text = text.Replace("}", "");
    
                text = text.Replace("[", "");
                text = text.Replace("]", "");
    
                text = text.Replace("<", "");
                text = text.Replace(">", "");
    
                // slash and pipe
    
                text = text.Replace("/", "");
                text = text.Replace("\\", "");
                text = text.Replace("|", "");
    
                // commas quotes periods 
    
                text = text.Replace("\"", "");
                text = text.Replace("\'", "");
    
                text = text.Replace(".", "");
                text = text.Replace(",", "");
    
                text = text.Replace("!", "");
                text = text.Replace("?", "");
    
                text = text.Replace(";", "");
                text = text.Replace(":", "");
    
                text = text.Replace(",", "");
                text = text.Replace("’", "");
    
                text = text.Replace("“", "");
                text = text.Replace("”", "");
    
                text = text.Replace("`", "");
    
                // symbols
    
                text = text.Replace("~", "");
                text = text.Replace("!", "");
                text = text.Replace("@", "");
                text = text.Replace("#", "");
                text = text.Replace("$", "");
                text = text.Replace("%", "");
                text = text.Replace("^", "");
                text = text.Replace("&", "");
                text = text.Replace("*", "");
                text = text.Replace("-", "");
                text = text.Replace("_", "");
                text = text.Replace("+", "");
                text = text.Replace("=", "");
    
                // remove double space
    
                while (text.IndexOf("  ") != -1)
                {
                    text = text.Replace("  ", " ");
                }
            }
    
            public void make_long_line(string[] long_line)
            {
                string record_sentence = "";
                string temp = "";
                string lower = "";
                int counter = 0;
    
                StringBuilder s = new StringBuilder();
                StreamReader record = new StreamReader("article.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == "")
                    {
                        temp = s.ToString();
                        remove_Unwanted_chars(ref temp);
                        lower = temp.ToLower();
                        lower = RemoveDuplicateWords(lower);
                        long_line[counter] = lower;
                        write_4(lower);
                        //Console.WriteLine(long_line[counter] + "\r\n");
    
                        s.Clear();
                        counter++;
                    }
                    else
                    {
                        s.Append(record_sentence);
                        s.Append(" ");
                    }
                }
                record.Close();
    
                temp = s.ToString();
                remove_Unwanted_chars(ref temp);
                lower = temp.ToLower();
                lower = RemoveDuplicateWords(lower);
                long_line[counter] = lower;
                write_4(lower);
                //Console.WriteLine(long_line[counter] + "\r\n");
    
                s.Clear();
            }
    
            public void count_paragraphs(ref int counter)
            {
                string record_sentence = "";
                StreamReader record = new StreamReader("article.txt");
    
                while ((record_sentence = record.ReadLine()) != null)
                {
                    if (record_sentence == "")
                    {
                        counter++;
                    }
                }
                record.Close();
    
                counter++;
            }
    
            static void Main(string[] args)
            {
                File.WriteAllText("paragraph_match_count.txt", String.Empty);
                File.WriteAllText("word_list.txt", String.Empty);
                File.WriteAllText("shortened_article.txt", String.Empty);
    
                Program paragraphs = new Program();
    
                int counter = 0;
                paragraphs.count_paragraphs(ref counter);
                string[] long_line = new string[counter];
    
                paragraphs.make_long_line(long_line);
    
                paragraphs.compare(long_line, counter);
            }
        }
    }
    So no more problems for now.
    Last edited by jeremy duncan; 08-03-2017 at 12:12 AM. Reason: edited code and returned results
