Thread: Word Wrap Program

  1. #1
    Registered User
    Join Date
    Jun 2004
    Posts
    14

    Word Wrap Program

    Hi, I'm new to the forums and just started learning the C programming language.

    I tried making my own program that wraps paragraphs to a certain column width and changes tabs to a certain number of spaces, and although it seems like it should work perfectly, it does not. The output gets a newline after the first letter, and then no more newlines are created. The tab feature does work correctly though.

    I've looked through the code many times, and I have no idea what could be wrong. Although it would probably be a pain in the ass to look at the code for the first time, I thought that I have nothing to lose by posting it.

    It works by measuring the distance between space characters and seeing if it is over the column width. Here's the code:

    Code:
    #include <stdio.h>
    
    #define TABSTOP 5
    #define COLUMNWIDTH 20
    #define MAXCHARACTERS 5000
    #define SPACE 32
    #define MAXSPACES 4000
    #define MAXLINES 4000
    
    main(){
    
    /* Definition of Variables */
    
        char characters[MAXCHARACTERS];
        char spaces[MAXSPACES];
        char enters[MAXLINES];
        int i, c, j, f, p, q, t, z;
        
        
    /* Gets characters & spaces and correlates each with a variable in an array, */
    /* including converting tabs to the right number of spaces */
    
        p = 1;
        spaces[p] = 0;
        
        for (i=0; (c=getchar())!=EOF; ++i) {
            if (c == '\t') {
                    i-=1;
                    for (j=0; j<=TABSTOP; ++j) {
                                    characters[i] = SPACE;
                                    ++i;
                    }
            } else {             
                    characters[i] = c; 
            }
            if (c == SPACE) {
                    ++p;
                    spaces[p] = i;
            }
        }
        if (c == EOF) {
            ++i;
            characters[i] = '\0';
            ++p;
            spaces[p] = i;
        }
        
        z = p;
    
        putchar('\n');
        
        
    /* Measures Distance between Spaces and finds newline points */
        
        p = 1;
        q = 2;
        t = 1;
    
        while (spaces[q]<=(z)) {
            while ((spaces[q]-spaces[p]) <= COLUMNWIDTH) {
                    ++q;
            }
            if ((spaces[q]-spaces[p]) > COLUMNWIDTH) {
                    enters[t] = spaces[q-1];
                    ++t;
                    p = q-1;
            }
        }
    
    /* Prints paragraph with newline points*/
    
        t = 2;
    
        for (i=0; characters[i]!='\0' && i<MAXCHARACTERS; ++i) {
            putchar(characters[i]);
            if (i == enters[t]) {
                    putchar('\n');
                    ++t;
            }
        }
    }
    Any idea why it may not be working or any ideas how I could organize it better (through functions maybe?), or even criticism, is greatly appreciated.

    Thanks,
    Edan.
    Last edited by EdanD; 06-27-2004 at 07:56 PM.

  2. #2
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    Don't you think an interactive read and print would be both easier and more flexible?
    Code:
    #include <stdio.h>
    
    #define TAB_STOP     5
    #define COLUMN_WIDTH 5
    
    int wrap(int column_index);
    
    int
    main(void)
    {
      int ch;
      int column_index = 0;
    
      while ((ch = getchar()) != EOF) {
        if (ch == '\t') { /* Replace tab character with spaces */
          int n;
          for (n = 0; n < TAB_STOP; n++) {
            putchar(' ');
            column_index = wrap(column_index + 1);
          }
        }
        else {            /* Print the character as is */
          putchar(ch);
          column_index = wrap(column_index + 1);
        }
      }
    
      return 0;
    }
    
    int
    wrap(int column_index)
    {
      if (column_index == COLUMN_WIDTH) { /* Wrap the word */
        putchar('\n');
        return 0;
      }
    
      return column_index;
    }
    Your code has a few problems that I looked at. The first is that the initial for loop lacks braces, while your indentation suggests that the body should be a compound statement. The second is that z is uninitialized, yet you still use it. Past that I didn't look too hard, because your choice of variable names is horrid and makes the code obtuse.
    Last edited by Prelude; 06-27-2004 at 06:54 PM.
    My best code is written with the delete key.

  3. #3
    Registered User
    Join Date
    Jun 2004
    Posts
    14
    Prelude,

    Thanks for your post and your help. You're right-- my code is messy, and I need to find a way to organize it better. There were a couple errors you touched on that I found and fixed, but I am still getting the same result in the output.

    Your code is simple and organized, in that it prints each character as it is read. However, the main reason my program was not written that way is that, to know where to print a new line, the code has to look at the characters ahead, because the next word may not fit on the line (new lines can only occur at spaces; words can't be split in half). Your code splits words in half if need be, which isn't what I wanted from my program. To look at future characters and decide whether a new line is necessary, I stored all the characters in an array, so that they could be assessed in the second half of the program.
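
    A minimal sketch of the look-ahead idea described above, not the code from this thread: before emitting each word, measure the distance to the next space and start a new line if the word will not fit. The name wrap_words and the string-to-string interface are assumptions made for illustration. Only the space character counts as a separator (tabs are assumed to be expanded already), and a word longer than the width is left on a line of its own rather than split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, breaking lines only at spaces
   so that no line exceeds `width` columns unless a single word is longer.
   Runs of spaces collapse to one separator, so `out` never needs more than
   strlen(in) + 1 bytes. */
void wrap_words(const char *in, char *out, size_t width)
{
    size_t col = 0;               /* characters already on the current line */
    size_t o = 0;                 /* next write position in out */

    assert(width > 0);

    while (*in != '\0') {
        size_t len;

        while (*in == ' ')        /* skip the spaces between words */
            in++;
        len = strcspn(in, " ");   /* look ahead: length of the next word */
        if (len == 0)
            break;                /* only trailing spaces were left */

        if (col > 0) {
            if (col + 1 + len > width) {
                out[o++] = '\n';  /* word does not fit: break at the space */
                col = 0;
            } else {
                out[o++] = ' ';   /* word fits: keep it on this line */
                col++;
            }
        }
        memcpy(out + o, in, len); /* emit the word itself */
        o += len;
        col += len;
        in += len;
    }
    out[o] = '\0';
}
```

    Calling wrap_words("the quick brown fox", out, 10) with a large enough out buffer leaves "the quick" and "brown fox" on separate lines.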

    Understand what I'm saying? If so, can you take a second look at my code and see where I may have gone wrong, or post a second version of your code but so that words cannot be split in half?

    Thanks.

  4. #4
    Quote Originally Posted by EdanD
    I tried making my own program that wraps paragraphs to a certain column width and changes "tabs" to a certain number of spaces,
    <...>
    I have had a look at your code. Well, it's hard to read, mainly due to a bad choice of identifiers. Also, the algorithm seems too complex for the job. There are also serious security holes, and defining big arrays in automatic memory can produce undefined behaviour.

    Actually, you have 2 problems to solve.

    1 - expanding the TABs
    2 - wrapping the text

    The input data is a character stream (getchar()). The output is also a character stream (putchar()). In between, you have the process. Because the output depends on both the input and some current status, it's a good candidate for a finite state machine (FSM).

    FSMs are extremely powerful devices that are able to solve any sequential problem. They also tend to make the design clear, because you have to define the elements:

    - Events (inputs)
    - Status
    - Actions (outputs)
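
    One way the three elements above could map onto word wrapping, as a rough sketch under stated assumptions rather than a definitive design: the events are the incoming characters, the states are "between words" and "in a word", and the actions are emitting a separator or a buffered word. All names here (Wrapper, feed, finish, WIDTH, OUT_MAX) are hypothetical. Output is collected in a buffer instead of going through putchar, every whitespace character is treated as a plain separator (tab expansion is left out), and a word longer than WIDTH is split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WIDTH = 20, OUT_MAX = 256 };      /* illustrative limits */

typedef enum { BETWEEN_WORDS, IN_WORD } State;

typedef struct {
    State  state;
    char   word[WIDTH + 1];   /* word being collected */
    size_t word_len;
    size_t col;               /* columns used on the current output line */
    char   out[OUT_MAX];      /* collected output, stands in for putchar */
    size_t out_len;
} Wrapper;

/* Action: append one character to the output, dropping it if out is full. */
static void emit(Wrapper *w, char c)
{
    if (w->out_len + 1 < OUT_MAX)
        w->out[w->out_len++] = c;
    w->out[w->out_len] = '\0';
}

/* Action: write the buffered word, preceded by a space or a newline. */
static void flush_word(Wrapper *w)
{
    size_t i;

    if (w->word_len == 0)
        return;
    if (w->col > 0) {
        if (w->col + 1 + w->word_len > WIDTH) {
            emit(w, '\n');               /* word does not fit on this line */
            w->col = 0;
        } else {
            emit(w, ' ');
            w->col++;
        }
    }
    for (i = 0; i < w->word_len; i++)
        emit(w, w->word[i]);
    w->col += w->word_len;
    w->word_len = 0;
}

/* Event: one input character arrives. */
void feed(Wrapper *w, int ch)
{
    int is_space = (ch == ' ' || ch == '\t' || ch == '\n');

    assert(w != NULL);
    switch (w->state) {
    case BETWEEN_WORDS:
        if (!is_space) {                 /* first letter of a new word */
            w->word[w->word_len++] = (char)ch;
            w->state = IN_WORD;
        }
        break;
    case IN_WORD:
        if (is_space) {                  /* the word is complete */
            flush_word(w);
            w->state = BETWEEN_WORDS;
        } else {
            if (w->word_len == WIDTH)    /* overlong word: split it here */
                flush_word(w);
            w->word[w->word_len++] = (char)ch;
        }
        break;
    }
}

/* Event: end of input. */
void finish(Wrapper *w)
{
    flush_word(w);
    w->state = BETWEEN_WORDS;
}
```

    A zero-initialized Wrapper starts in BETWEEN_WORDS; a caller would pass each character from getchar() to feed(), call finish() at EOF, and then print the out member.
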
    Emmanuel Delahaye

    "C is a sharp tool"

  5. #5
    Registered User
    Join Date
    Jun 2004
    Posts
    14
    Okay, how/where do I start learning about Finite State Machines and how to implement them in my code?
    Last edited by EdanD; 06-28-2004 at 08:23 AM.

  6. #6
    ATH0 quzah
    Join Date
    Oct 2001
    Posts
    14,826
    Google's a good start. Throw a few more keywords in if you feel like it...
    Code:
    while( state != StateExit )
    {
        switch( state )
        {
            case StateInitialize:
                ...load stuff...
                ...set state to something else...
            break;
    
            case StateNewUser:
                ...create new user...
                ...set state to something else...
            break;
    
            case StateLogin:
                ...login user...
                ...set state to something else...
            break;
    
            case StateDoSomething:
                ...do something...
                ...set state to something else...
            break;
    
            case StateShutdown:
                state = StateExit;
            break;
        }
    }
    There's a simple one. It's a poor example, but it will work for illustration. Personally, I don't see the need for an FSM in what you're doing. (See Prelude's example for the lack of need for an FSM.)

    Basically, something happens based on the current state of the machine. Somewhere along the line of your program doing things, the state gets changed, and it then switches focus to doing something else. The switch-in-a-loop is a quick and easy example of an FSM.

    Quzah.
    Hope is the first step on the road to disappointment.

  7. #7
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    >Understand what I'm saying?
    Yes. I specifically chose not to address that issue though because you didn't mention it in your question and your code was difficult to follow with the uninformative variable names. But now that I know, I'll give you an idea and pose a question.

    idea: Keep careful track of the characters you read with getchar, then when you hit a non-whitespace character, ungetc the one you have and then read a word. From there you can calculate the size of the word and how it fits into the row (or doesn't). Here's a quick example, bugs may exist (note to everyone else: yes, I know that scanf doesn't have its return value checked):
    Code:
    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>
    
    #define TAB_STOP     5
    #define COLUMN_WIDTH 10
    
    size_t wrap(size_t column_index);
    
    int
    main(void)
    {
      int    ch;
      size_t column_index = 0;
      char   buffer[10];
      size_t len;
    
      while ((ch = getchar()) != EOF) {
        if (ch == '\t') {                      /* Replace tab character with spaces */
          int n;
          for (n = 0; n < TAB_STOP; n++) {
            putchar('#');
            column_index = wrap(column_index + 1);
          }
        }
        else if (!isspace(ch)) {               /* Save non-space characters */
          ungetc(ch, stdin);
          scanf("%9s", buffer);
          len = strlen(buffer);
          if (column_index + len > COLUMN_WIDTH) {
            column_index = wrap(COLUMN_WIDTH); /* Force a wrap */
          }
          printf("%s", buffer);
          column_index = wrap(column_index + len);
        }
        else {                                 /* Handle other whitespace */
          putchar('*');
          column_index = wrap(column_index + 1);
        }
      }
    
      return 0;
    }
    
    size_t
    wrap(size_t column_index)
    {
      if (column_index == COLUMN_WIDTH) { /* Wrap the word */
        putchar('\n');
        return 0;
      }
    
      return column_index;
    }
    question: What happens if a word is entered that has more characters than a row?

    >how/where do I start learning about Finite State Machines and how to implement them in my code?
    An FSM is simpler than the name suggests. You can find plenty of information with a search on Google.
    My best code is written with the delete key.

  8. #8
    Registered User
    Join Date
    Jun 2004
    Posts
    14
    Aaah awesome. Thanks for your help guys... I'm going to dissect Prelude's code. I might have more questions.

    Prelude, when I put a word that was over the column width in your code, it split the word in half-- which is what I think the program should do.

    What bothered me was that with your code it didn't create a new line to enter another paragraph of text... but that should be easily fixed. I'll fix that and repost it too (I need the learning experience).

    Thanks!

  9. #9
    Registered User
    Join Date
    Jun 2004
    Posts
    14
    Okay, question number 1: What's up with "size_t" in your program, Prelude? You treat it like it's a variable definition.

    (sorry for the double post)

  10. #10
    ATH0 quzah
    Join Date
    Oct 2001
    Posts
    14,826
    size_t is defined in stddef.h. It is an unsigned integral type, not a variable. It is commonly defined as unsigned long, though it doesn't have to be, as long as it meets the unsigned requirement. Its maximum value, SIZE_MAX, is defined in stdint.h in C99.
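
    As a small illustration of how size_t tends to show up in practice (a sketch, not part of Prelude's program): strlen and strcspn both return size_t, so lengths and counts are naturally declared with it. The helper name longest_word is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX (C99) */
#include <string.h>   /* strcspn returns size_t */

/* Hypothetical helper: length of the longest space-separated word in s. */
size_t longest_word(const char *s)
{
    size_t best = 0;

    assert(s != NULL);

    while (*s != '\0') {
        size_t len = strcspn(s, " ");   /* characters up to the next space */

        if (len > best)
            best = len;
        s += len;
        if (*s == ' ')
            s++;                        /* step over the space */
    }
    return best;
}
```

    Because size_t is unsigned, converting -1 to it yields SIZE_MAX. In C99 a size_t value is printed with the %zu conversion.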

    Quzah.
    Hope is the first step on the road to disappointment.

  11. #11
    Registered User
    Join Date
    Jun 2004
    Posts
    14
    Thanks Quzah.

    If I'm right,
    Code:
          ungetc(ch, stdin);
          scanf("%9s", buffer);
          len = strlen(buffer);
    measures the length of the word. Can someone explain to me exactly what's happening here? What are "ungetc" (does it go backwards through the characters?), "stdin", and "strlen"?

    Thanks.
    Last edited by EdanD; 06-28-2004 at 01:13 PM.

  12. #12
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    >Can someone explain to me exactly what's happening here?
    At this point in the program, you have a non-whitespace character in ch. Naturally, that's something that you want to be a part of the word, but we would have to place it in the buffer manually because we've already extracted it from the stream. ungetc(ch, stdin) takes the character ch and pushes it back onto the stream stdin such that the value of ch will be the next character read from stdin (like with a call to getchar()). That way, scanf will read all of the word instead of the word with the first character chopped off.

    Now that the stream is in the right state, scanf is used to read all non-whitespace characters and place them in buffer. Because buffer is only 10 characters long and we need to consider the null character to terminate the string, scanf is forced to read at most 9 characters. This is a safety factor because scanf is very good at overflowing buffers.

    Once buffer contains a string, we use strlen to get the number of characters in that string and place the value in len for later use in the calculations.
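
    The three steps above can be packaged into one function to experiment with. This is only a sketch: the name read_word is an assumption, it takes any FILE * rather than stdin so it can be tried on a file, and unlike the quick example earlier in the thread it checks the return value of fscanf.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: skips whitespace on `in`, then reads one word of at
   most 9 characters into buf (which must hold 10 bytes). Returns the word
   length, or 0 at end of input. */
size_t read_word(FILE *in, char buf[10])
{
    int ch;

    assert(in != NULL);

    while ((ch = getc(in)) != EOF && isspace(ch))
        ;                          /* discard leading whitespace */
    if (ch == EOF)
        return 0;

    ungetc(ch, in);                /* push the first letter back on the stream */
    if (fscanf(in, "%9s", buf) != 1)
        return 0;                  /* nothing could be read */
    return strlen(buf);            /* number of characters before the '\0' */
}
```

    With the input "  wrap paragraphs", successive calls return 4 ("wrap"), 9 ("paragraph"), and 1 ("s"), because the %9s limit cuts the ten-letter word after nine characters and leaves the rest on the stream.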
    My best code is written with the delete key.
