Thread: Advices about an exercise.

  1. #1
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272

    Advices about an exercise.

    Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
    Okay, so i just did this one; with dynamic allocation and other nifty stuff which i find usefull (as detabing the line before folding).

    I think this program can be really usefull if you have like huge and long lines in notepad for example or text editor and you want to to shorten it.

    It is not limited by line size (uses dynamically allocated buffer).

    So i'd like your opinion if there's something i missed/did wrong in my code, or any suggestions of how i could improve the code . I'm still learning so any advice or critisicm is welcome. Thanks!

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define FOLD_COLUMN 20
    
    int last_word = 0;
    
    int next_line_length()
    {
        int c, i;
        
        i = 0;
        while((c = getchar()) != EOF && c != '\n')
           i++;
        
        if(i > 0 && c == EOF)
           last_word = 1;
        return (c == EOF && !i) ? EOF : i;
    }
    
    void getline(char *p)
    {
         int c;
         
         while((c = getchar()) != EOF && c != '\n')
            *p++ = c;
         *p = '\0';
    }
    
    void fold_line(char *line, int len)
    {
         int iterate_through_text(char *s, int pos);
         
         int i;
         int nl_pos = 0;    /* 1 position after the spot where we put newline */
         int no_blanks = 1; /* check if there are no blanks/tabs before
                               the specified column */
                               
         while(nl_pos + FOLD_COLUMN < len) { /* need not to check last FOLD_COLUMN characters */
             for(i = nl_pos + FOLD_COLUMN; i >= nl_pos; i--) {
                 if(line[i] == ' ' || line[i] == '\t') {
                     line[i] = '\n';
                     nl_pos = i + 1;
                     no_blanks = 0;
                     break;
                 }
             }
             if(no_blanks)
                 nl_pos = iterate_through_text(line, nl_pos + FOLD_COLUMN);
             no_blanks = 1;
         }
    }
    
    int iterate_through_text(char *s, int pos)
    {
        while(s[pos] != ' ' && s[pos] != '\t' && s[pos] != '\0')
           pos++;
        if(s[pos]) 
           s[pos] = '\n';
        return pos + 1;
    }
    
    #define TAB_STOP 8
    
    int detab_length(char *s, int len)
    {
         int i, column;
         int add_len;
         
         column = add_len = 0;
         for(i = 0; i < len; i++)
             if(s[i] == '\t') {
                  add_len += TAB_STOP - (column % TAB_STOP);
                  column = 0;
             } else
                   column = (s[i] == '\n') ? 0 : column + 1;
                   
         return add_len;
    }
    
    void detab(char *to, char *from, int len)
    {
         int i, j, k;
         int column;
         
         column = j = 0;
         for(i = 0; i < len; i++)
             if(from[i] == '\t') {
                  for(k = 0; k < TAB_STOP - (column % TAB_STOP); k++)
                     to[j++] = ' ';
                  column = 0;
             } else {
                   to[j++] = from[i];
                   column = (from[i] == '\n') ? 0 : column + 1;
             }
         to[j] = '\0';
    }
    int main()
    {
         int i, c;
         int alloc;            /* bytes needed to be allocated for the line */
         char *buf = NULL;     /* stores the line */
         int new_alloc;        /* bytes needed to be allocated for the detabed line */
         char *detabed = NULL; /* stores the line, but detabed */
         int pos = 0;          /* position in the file at which starts the next line */
         
         while((alloc = next_line_length()) != EOF) {
             buf = (char *) malloc(alloc + 1); /* alloc + 1 for the \0 character */
             fseek(stdin, pos, SEEK_SET);      /* go back to start + pos positions in the file */
             getline(buf);                     /* store the line */
             if((new_alloc = alloc + detab_length(buf, alloc)) != alloc) { /* detab it only if length changed */
                detabed = (char *) malloc(new_alloc + 1);
                detab(detabed, buf, alloc);
                fold_line(detabed, new_alloc);
                printf("%s", detabed);
             } else {
                    fold_line(buf, alloc);
                    printf("%s", buf);
             }
             pos += (alloc + 2);               /* increase pos by alloc + 2 (\n and next pos) */
             if(!last_word)
                putchar('\n');
         }
         return 0;
    }
    For example, do programmers usually check if fseek fails? Or allocation?

    Should i add that?
    Last edited by Tool; 06-08-2010 at 07:27 AM.

  2. #2
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Where are the frees? I see malloc, but I don't see free.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #3
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    You should always check that your memory allocation worked, and add an abort call if it fails. Otherwise you will end up trying to use memory that isn't allocated and it will crash anyway.
    for example, if you want an n-1 character string:
    Code:
    char *ptr;
    if ((ptr = malloc(n*sizeof(char))) == NULL)
    {
         printf("Memory allocation for ptr failed\n");
         abort();
    }
    Last edited by KBriggs; 06-08-2010 at 07:40 AM.

  4. #4
    Registered User
    Join Date
    May 2010
    Location
    Naypyidaw
    Posts
    1,314
    probably you may want to read 'fold' source code.
    man fold
    Last edited by Bayint Naung; 06-08-2010 at 07:50 AM.

  5. #5
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    There are other functions that are just as dangerous, too. Any function that writes into a buffer provided as a parameter MUST have a size argument, as well. Be sure to CHECK to make you don't write outside the buffer's length.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  6. #6
    Registered User
    Join Date
    May 2010
    Location
    Naypyidaw
    Posts
    1,314
    Reading twice to find out how many to allocate is not quite good.
    Here is a good implementation.

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Elysia View Post
    Any function that writes into a buffer provided as a parameter MUST have a size argument, as well.
    Not if that is already accounted for in the flow of execution, which it is here. I don't believe Tool was trying to write a set of general purpose library functions.

    Personally I would do that part (reading the line and allocating a buffer) somewhat differently, so as to not have to bother with FSEEK at all. However that would require strlen and strcpy so would probably be no more efficient.

    Quote Originally Posted by Bayint Naung View Post
    Reading twice to find out how many to allocate is not quite good.
    Here is a good implementation.
    That one is pretty lame IMO. It reallocs every 20 bytes, which is a complete waste of time. If you use a 1k stack buffer and fgets->strlen->malloc->strcpy you probably won't have to realloc at all, much less 3 or 4 times per line.
    Last edited by MK27; 06-08-2010 at 08:04 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by MK27 View Post
    Not if that is already accounted for in the flow of execution, which it is here. I don't believe Tool was trying to write a set of general purpose library functions.
    That is irrelevant. It's a huge security risk nevertheless. To be sure, pass the real size of the buffer to the function and perform a security check. Don't rely on your logic to make sure it's correct. That will lead to loads and loads of bugs and a security nightmare.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  9. #9
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    probably you may want to read 'fold' source code.
    Dont see a source code there.

    Well, im preety satisfied by this version of mine, i did just for learning purposes anyway, not really to use it for myself. If i want a folding program ill go to a unix shell and use fold(1) i guess.

    Where are the frees? I see malloc, but I don't see free.
    Okay, i added this.

    And the error checking. I also return -1 in case of alloc failure:

    Code:
    int main()
    {
         int i, c;
         int alloc;            /* bytes needed to be allocated for the line */
         char *buf = NULL;     /* stores the line */
         int new_alloc;        /* bytes needed to be allocated for the detabed line */
         char *detabed = NULL; /* stores the line, but detabed */
         int pos = 0;          /* position in the file at which starts the next line */
         
         while((alloc = next_line_length()) != EOF) {
             if((buf = (char *) malloc(alloc + 1)) == NULL) { /* alloc + 1 for the \0 character */
                 fprintf(stderr, "alloc failure\n");
                 return -1;
             }
             fseek(stdin, pos, SEEK_SET);      /* go back to start + pos positions in the file */
             getline(buf);                     /* store the line */
             if((new_alloc = alloc + detab_length(buf, alloc)) != alloc) { /* detab it only if length changed */
                if((detabed = (char *) malloc(new_alloc + 1)) == NULL) {
                   fprintf(stderr, "alloc failure\n");
                   return -1;
                }
                detab(detabed, buf, alloc);
                fold_line(detabed, new_alloc);
                printf("%s", detabed);
                free(detabed);
             } else {
                    fold_line(buf, alloc);
                    printf("%s", buf);
             }
             pos += (alloc + 2);               /* increase pos by alloc + 2 (\n and next pos) */
             if(!last_word)
                putchar('\n');
             free(buf);
         }
         return 0;
    }
    Reading twice to find out how many to allocate is not quite good.
    Here is a good implementation.
    Could you describe that one a bit? Im not preety much still sure what that code does exactly. Reallocs as it reads characters?

  10. #10
    Registered User
    Join Date
    May 2010
    Location
    Naypyidaw
    Posts
    1,314
    Quote Originally Posted by MK27 View Post
    Not if that is already accounted for in the flow of execution, which it is here. I don't believe Tool was trying to write a set of general purpose library functions.

    Personally I would do that part (reading the line and allocating a buffer) somewhat differently, so as to not have to bother with FSEEK at all. However that would require strlen and strcpy so would probably be no more efficient.



    That one is pretty lame IMO. It reallocs every 20 bytes, which is a complete waste of time. If you use a 1k stack buffer and fgets->strlen->malloc->strcpy you probably won't have to realloc at all, much less 3 or 4 times per line.
    (In production code, a line like nchmax += 20 can prove troublesome, as the function may do lots of reallocating. Many programmers favor multiplicative reallocation, e.g. nchmax *= 2, although it obviously isn't quite as self-starting, and can run into problems if it has to allocate a huge array but memory is limited.)
    Do you read all?

  11. #11
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    Well;

    i had another idea for allocating space for lines.

    We could read the whole file and determine the longest line. And then just allocate a buffer with the MAX space so any next line will have enough space to be stored into.

  12. #12
    Registered User
    Join Date
    Jul 2009
    Location
    Croatia
    Posts
    272
    That is irrelevant. It's a huge security risk nevertheless. To be sure, pass the real size of the buffer to the function and perform a security check. Don't rely on your logic to make sure it's correct. That will lead to loads and loads of bugs and a security nightmare.
    You mean, instead of len variable, i should pass sizeof(buf) instead?

    Yeah, good idea imo.

    & btw, someone said they wouldnt use fseek.

    Why?

    However that would require strlen and strcpy so would probably be no more efficient.
    Could you describe a bit more? Wouldnt this take much more code? Since you're using 2 more functions.

    Also, if i'd use fgets, well. Then id have to check if the n - 1 character was a \n. Well, imo it would take more code then this way.

    Is fseek "bad"?
    Last edited by Tool; 06-08-2010 at 08:17 AM.

  13. #13
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Elysia View Post
    That is irrelevant. It's a huge security risk nevertheless.
    Malarky:
    Code:
         int pos = 0;  
         while((alloc = next_line_length()) != EOF) {
             buf = (char *) malloc(alloc + 1); /* alloc + 1 for the \0 character */
             fseek(stdin, pos, SEEK_SET);      /* go back to start + pos positions in the file */
             getline(buf);                     /* store the line */
    [....]
             pos += (alloc + 2);               /* increase pos by alloc + 2 (\n and next pos) */
    That should be bulletproof. There is no security risk. If you find this confusing or uncertain that's your own flawed logic, not the code's.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  14. #14
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Tool View Post
    You mean, instead of len variable, i should pass sizeof(buf) instead?
    You can't pass sizeof(buf), it's a pointer. If you want the (irrational) double saftey pass alloc.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  15. #15
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by MK27 View Post
    Malarky:
    Code:
         int pos = 0;  
         while((alloc = next_line_length()) != EOF) {
             buf = (char *) malloc(alloc + 1); /* alloc + 1 for the \0 character */
             fseek(stdin, pos, SEEK_SET);      /* go back to start + pos positions in the file */
             getline(buf);                     /* store the line */
    [....]
             pos += (alloc + 2);               /* increase pos by alloc + 2 (\n and next pos) */
    That should be bulletproof. There is no security risk. If you find this confusing or uncertain that's your own flawed logic, not the code's.
    This is a big security risk, because getline doesn't check to make it doesn't overwrite the buffer. Pass the size along, alloc + 1.

    Quote Originally Posted by MK27 View Post
    You can't pass sizeof(buf), it's a pointer. If you want the (irrational) double saftey pass alloc.
    Better safe than sorry.

    @OP: You cannot use sizeof when you're using malloc. Furthermore, sizeof ONLY works in the function you created the original array in. Otherwise it will simply return the size of a pointer.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 2
    Last Post: 05-02-2010, 01:49 AM
  2. programming exercise
    By mashour06 in forum C Programming
    Replies: 1
    Last Post: 06-01-2009, 06:22 AM
  3. Line of data input method
    By larry_2k4 in forum C Programming
    Replies: 2
    Last Post: 04-28-2009, 11:34 PM
  4. Tutorial review
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 11
    Last Post: 03-22-2004, 09:40 PM
  5. Request for comments
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 15
    Last Post: 01-02-2004, 10:33 AM