Thread: Text analysis

  1. #1
    Registered User
    Join Date
    May 2017
    Posts
    3

    Text analysis

    Hi, I have a college project to do in C that involves text editing, reading from files, word and character statistics, text correction and so on. I have a pretty clear picture of the logic the program will run on, but I'm facing a basic problem that stops me from starting. The teacher says the program must use fixed-size arrays, yet it must read words, one at a time, until the user enters a specific word (e.g. "end"). So how does that work with fixed-size arrays? What if the user keeps entering words until the limit is exceeded, since the arrays' sizes are fixed?

  2. #2
    Registered User rstanley's Avatar
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,113
    You have given us insufficient information to answer your question. You would need to show us exactly what the teacher gave you for instructions.

  3. #3
    Registered User
    Join Date
    May 2017
    Posts
    3
    Well, I'm trying my best to translate the instructions. The program's basic operation is text input, at first from the user and in later versions from a txt file. In the first version the user types the text, and each input is treated as a single word of the text. Input stops when the program reads a word whose last 3 characters are "end", so it could be "end" or something like "abcdefgend"; both count. The data structure the program uses is fixed-size arrays, so the size is predetermined. That's the instruction for the first version. It seems kind of vague to me: he asks for a fixed size while at the same time the input has to go on until the user enters "end".
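
    A minimal sketch of the stop condition described above, assuming the comparison is case-sensitive (the instructions as quoted do not say). The function name ends_with_end is hypothetical, not part of the assignment:

```c
#include <string.h>

/* Returns 1 if the last three characters of word are "end", otherwise 0.
 * Assumption: the match is case-sensitive, so "END" does not stop input. */
int ends_with_end(const char *word)
{
    size_t len = strlen(word);

    /* Words shorter than three characters cannot end in "end". */
    return len >= 3 && strcmp(word + len - 3, "end") == 0;
}
```

    With this check, both "end" and "abcdefgend" stop the input, while "ending" does not.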

  4. #4
    Registered User rstanley's Avatar
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,113
    Could you show a full line that the user might enter?

    What is stored in this fixed array? What is the data type of the array?

    How will this array be used once filled?

  5. #5
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    The best thing to do is to ask the teacher for the maximum size of input, both the maximum number of words and the maximum size of a word. If you can't do that, then you have to make the array very large:
    Code:
    #define MAX_WORDS       1000000
    #define MAX_WORD_SIZE        50
    
    // In main (perhaps):
        static char words[MAX_WORDS][MAX_WORD_SIZE];  // 50MB
    If the array is defined inside a function (usually the best option) then you should make it "static" so that it is not stored on the stack. Be aware, though, that this means the storage lasts for the entire run of the program and is never deallocated, which is probably okay in your case.

    You also need to ensure that you don't go beyond these bounds, unlikely as that may be (with normal, error free, non-malicious input).
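
    One way the bounds check might look, as a sketch only. The enum, the function name add_word and its parameters are made up for illustration; MAX_WORD_SIZE is the value from the snippet above, and the array capacity is passed in so the same function works for any fixed size:

```c
#include <string.h>

#define MAX_WORD_SIZE 50   /* per-word buffer size, including the '\0' */

/* Hypothetical result codes for the caller to act on. */
enum add_result { ADD_OK, ADD_TOO_LONG, ADD_FULL };

/* Copies word into the next free row of words[] if it fits.
 * capacity is the number of rows in the array; *count is the number
 * of rows already used and is incremented only on success. */
enum add_result add_word(char words[][MAX_WORD_SIZE], size_t *count,
                         size_t capacity, const char *word)
{
    if (*count >= capacity)
        return ADD_FULL;              /* no free row left */
    if (strlen(word) >= MAX_WORD_SIZE)
        return ADD_TOO_LONG;          /* no room for the word plus '\0' */

    strcpy(words[*count], word);      /* safe: length was checked above */
    (*count)++;
    return ADD_OK;
}
```

    The caller decides what to do with ADD_TOO_LONG and ADD_FULL, for example print a message and skip the word, or stop reading altogether.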

  6. #6
    Registered User
    Join Date
    May 2017
    Posts
    3
    Well, the user inputs simple text (e.g. The sky is blue). The instructions say the user should enter a single word at a time, so the input for this example line would be
    The
    sky
    is
    blue
    If the user inputs it all at once, I guess it will either be treated as a single word or we will be asked to break the sentence up later on. So the array stores these words, and the data type is char, obviously.

    As for how the array is used once the text input has ended: our dictionary is just another array of words, empty at first. After we are done with our text, there is the "correction" option. The program checks whether each word of our "text" exists in our "dictionary". If it does, nothing happens and it goes on to the next word. If it doesn't, there are a few options:
    1) Replace the word in our "text" with another one that the user inputs. The new word is then checked by the correction process as well.
    2) Add the word to the "dictionary".
    3) Do nothing and continue with the next word.
    4) Stop the correction process.
    In other words, the "correction" process notifies the user each time a word in our text doesn't exist in the "dictionary" (our word database) and asks what he wants to do. The dictionary is updated with new words by the user at will and is saved to a txt file every time the program ends.
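
    The dictionary check at the heart of the correction step could be sketched as a plain linear search over a fixed-size array. The name find_word and the MAX_WORD_SIZE value of 50 are assumptions for illustration, and the comparison is assumed to be case-sensitive:

```c
#include <string.h>

#define MAX_WORD_SIZE 50   /* assumed per-word buffer size */

/* Returns the index of word in dict[0..count-1], or -1 if it is absent.
 * Assumption: an exact, case-sensitive match is what "exists" means. */
int find_word(char dict[][MAX_WORD_SIZE], size_t count, const char *word)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(dict[i], word) == 0)
            return (int)i;
    }
    return -1;
}
```

    The correction loop would call this for each word of the text and show the four options only when the result is -1.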
    The program can also provide statistics about our text:
    1) how many words and characters are in our text
    2) how many different words and characters are in our text
    3) how many 1-letter, 2-letter, etc. words are in our text
    If the user chooses to save the text he enters on a given run, the program also saves the stats of that text to another file.
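
    A rough sketch of how the three statistics might be computed over a fixed-size word array. The struct, its field names and compute_stats are hypothetical. It assumes words are compared case-sensitively, that "characters" means the characters inside the stored words (no spaces), and that every row holds a terminated string:

```c
#include <limits.h>
#include <string.h>

#define MAX_WORD_SIZE 50   /* assumed per-word buffer size */

struct text_stats {
    size_t words;                     /* total number of words */
    size_t chars;                     /* total characters in all words */
    size_t distinct_words;            /* number of different words */
    size_t distinct_chars;            /* number of different characters */
    size_t by_length[MAX_WORD_SIZE];  /* by_length[n] = count of n-letter words */
};

struct text_stats compute_stats(char text[][MAX_WORD_SIZE], size_t count)
{
    struct text_stats s = {0};
    unsigned char seen[UCHAR_MAX + 1] = {0};   /* characters met so far */

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(text[i]);
        size_t j;

        s.words++;
        s.chars += len;
        s.by_length[len]++;   /* len < MAX_WORD_SIZE for a terminated row */

        for (size_t k = 0; k < len; k++) {
            unsigned char c = (unsigned char)text[i][k];
            if (!seen[c]) {
                seen[c] = 1;
                s.distinct_chars++;
            }
        }

        /* A word is new if no earlier row holds the same string. */
        for (j = 0; j < i; j++) {
            if (strcmp(text[j], text[i]) == 0)
                break;
        }
        if (j == i)
            s.distinct_words++;
    }
    return s;
}
```

    The distinct-word check compares every word against all earlier ones, which is quadratic but simple, and probably fine for a text that fits in a fixed-size array.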
    So that's it. The final version of the program will let the user choose any of these options, and it can also read the "text" from a txt file instead of reading words until the user types "end". All these operations are fairly easy to implement; the only problem is the one I described. He asks for a fixed size for every array, which seems kind of pointless since the user can input an "infinite" number of words and a txt file can contain any number of them, so what's the point of asking for a fixed size? It beats me.

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Well if the arrays are fixed sizes, then when a word is too long, or there are too many words, you either
    - complain and ignore
    - complain and halt

    Making things fixed sizes is a stepping stone in development. It's much easier to deal with
    char words[MAXWORDS][MAXLEN];
    than it is to deal with
    char **words;
    and all the attendant malloc / realloc / free calls needed to make it all work properly.

    Changing the former into the latter is an easy step when the rest of the program logic is in place.

    But to try and keep all the dynamic allocation in step with rapidly changing program logic can be a real PITA.
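
    For comparison, here is roughly what the char **words version involves. This is only a sketch to show the extra bookkeeping; the struct and function names are invented, and the doubling growth policy is an assumption, not something the assignment asks for:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of heap-allocated words. */
struct word_list {
    char **words;      /* array of pointers, one per stored word */
    size_t count;      /* number of words stored */
    size_t capacity;   /* number of pointers allocated */
};

/* Appends a copy of word. Returns 0 on success, -1 if allocation fails
 * (the list is left unchanged and still valid in that case). */
int word_list_add(struct word_list *list, const char *word)
{
    char *copy;

    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        char **grown = realloc(list->words, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->words = grown;
        list->capacity = new_cap;
    }

    copy = malloc(strlen(word) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, word);
    list->words[list->count++] = copy;
    return 0;
}

/* Releases every word and the pointer array itself. */
void word_list_free(struct word_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->words[i]);
    free(list->words);
    list->words = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

    A list starts out as struct word_list list = {0}; and every exit path of the program then has to reach word_list_free, which is exactly the kind of detail the fixed-size version lets you postpone.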
    A development process
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
