Thread: codeform version 1.2.0 #2

  1. #1
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057

    codeform version 1.2.0 #2

    Codeform, a syntax highlighter [intended] for C and C++.

    Codeform's main page: http://dwks.theprogrammingsite.com/myprogs/codeform.htm
    Direct download link: http://dwks.theprogrammingsite.com/m...n/codeform.zip

    "clip" download link (Windows clipboard utilities): http://dwks.theprogrammingsite.com/m.../down/clip.zip

    Since the other threads were sadly lost when the board was (presumably) restored from a backup recently, I'm starting another codeform thread. With good reason: here is a list of the things I have done in the past two days:
    • (Incomplete) Rules files for highlighting other rules files and phpbb were added. From names.txt:
      Code:
      cfrule  codeform rules files
      ...
      phpbb  (PhpBB) BBCode (just like vbb but with [color=black] around everything)
    • The static variables in add_rule() were moved into struct rules_t (as struct prevrule_t pr). There are now no global or static variables in codeform, though there is a rather complex structure that is passed to most functions (at least in part). Oh well.
    • [bugfix] (related to the static variables) A rare memory leak occurred when the last rule already existed:
      Code:
      =keyword
      int:[:]
      int:[[:]]
      It was fixed by freeing prevrule_t from main() with a call to the new function free_prevrule().
    • Several spelling mistakes in the source file were fixed. Some undoubtedly remain.
    • [bugfix] Many keywords starting with the same characters would cause some to be ignored under certain conditions due to a problem with find_rule_new():
      Code:
      =keyword
      x:[:]
      do:*:*
      done:*:*
      double:*:*
      doubles:*:*
      doubled:*:*
      doubling:*:*
    • [bugfix] A variable directly following an undefined variable would be ignored, due to an oversight in find_var_replace():
      Code:
      =keyword
      int:$(formbb)$(keyworddarkc)$(formba):$(forma)
    • [bugfix] A memory leak of at least BUFSIZ characters per input file occurred. (The leak was the length of the longest line in the file, rounded up to a multiple of BUFSIZ. So if BUFSIZ was 512 and there was a line in the file 530 characters long, 1024 characters would have been leaked.) It was fixed by adding free(line) to read_file().
    • [bugfix] A memory leak occurred for every parameter passed to the program. free_argument() was calling free() instead of free_strings().
    • [bugfix] When reading rules files with DOS-style newlines under Linux, repeated sections ("*") were taken as literal "*"s. (This showed up with codeform online.) chomp_newline() was added, which chomps '\r's as well as '\n's.
    • [bugfix] One byte too many was being allocated for every variable name in add_var().
    • One call to strlen() was eliminated by making remove_escapes() cooperate with shrink_string().
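The DOS-newline fix in the list above could look roughly like the sketch below. It is not codeform's actual chomp_newline(): the signature and the returned length are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a chomp_newline()-style helper (not the real
 * codeform function): strips every trailing '\r' and '\n' from s in
 * place and returns the new length of the string. */
size_t chomp_newline(char *s) {
    size_t len;

    assert(s != NULL);
    len = strlen(s);

    /* Handles "\n" (Unix), "\r\n" (DOS) and a bare trailing "\r". */
    while (len > 0 && (s[len - 1] == '\n' || s[len - 1] == '\r')) {
        s[--len] = '\0';
    }

    return len;
}
```

With a helper like this, a rules line read from a DOS-format file ends in "*" rather than "*\r", so a comparison against "*" succeeds on Linux as well.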

    There are still some things to fix, however: Valgrind still shows a few hundred bytes leaked for very complex rules files (down from several thousand, thanks to all of the memory leak bug fixes). (It's the first time I've ever used Valgrind; it's amazing.) I'm pretty sure there's something up with the previous-rule code in add_rule() et al.

    HOWTO.txt is sort of out of date.

    I want to add Perl support (a rather large undertaking; don't expect it within days).

    Also see the TODO list for other things that I still need to do:
    Code:
    /*-------------------------------------------*\
     | TODO list for future versions of codeform |
    \*-------------------------------------------*/
    
    Short-term
    - ? Optimise rule searching by looking in the previous position first
    - Add support for "\n" and "\param" in specific comments ("" and /*! */)
    - Make a "function" rule: \w+\s+\(
    - ? Don't print closing tags for same-coloured nested comments
    - Provide a default rules directory that is searched first
    - Make "escaped newline" (\) character specifiable
    - Don't count #es in #defines as new comments;
        don't allow nestcoms to include themselves
    [done] - Allow multiple prev pointers!
    [done] - free static variables in add_rule()
    
    Long-term
    - Make input, output and styles rules files separate
    
    Distant long-term
    - Support indent-style code beautifying
    I haven't forgotten about the Win32 functions, don't worry, but the lists are gone now so it will take me a while . . . .

    Codeform online still isn't working. Any ideas? How can I run a program as a restricted user from a privileged Perl script?
    Last edited by dwks; 03-23-2007 at 05:55 PM.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  2. #2
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    Would you like me to put the win api list back up?

  3. #3
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Sure. Or you could create the codeform rules files yourself if you're feeling up to it.

    Perhaps something like this:
    Code:
    =keyword
    
    keyworddarkb=
    keyworddarka=
    
    auto:$(keyworddarkb):$(keyworddarka)
    bool:*:*
    break:*:*
    case:*:*
    ...
    In fact, I have a Perl script for you:
    Code:
    #!/usr/bin/perl
    
    print "=keyword\n\nkeyworddarkb=\nkeyworddarka=\n\n";
    chomp($l = <>);
    print "$l:\$(keyworddarkb):\$(keyworddarka)\n";
    
    while(<>) {
        chomp;
        print "$_:*:*\n";
    }
    Note: it's untested. [edit] Okay, now I've edited the code and tested it. It works pretty well. It assumes an input file with one keyword per line. [/edit]
    Last edited by dwks; 03-23-2007 at 06:18 PM.

  4. #4
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    Code:
    =keyword
    
    keyworddarkb=
    keyworddarka=
    
    auto:$(keyworddarkb):$(keyworddarka)
    bool:*:*
    break:*:*
    case:*:*
    ...
    i have no clue what that is or does.... <i know it is perl but....>

    Code:
    #!/usr/bin/perl
    
    print "=keyword\n\nkeyworddarkb=\nkeyworddarka=\n\n";
    chomp($l = <>);
    print "$l:\$(keyworddarkb):\$(keyworddarka)\n";
    
    while(<>) {
        chomp;
        print "$_:*:*\n";
    }
    have no perl compiler....

  5. #5
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I'll run it for you then. (It's a Perl interpreter BTW.) Post the list (as an attachment!) if you have it handy. If not, I'll start looking . . . .

  6. #6
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    put it all on one file.txt



    note to self.... dx9 and gdi need to add

  7. #7
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Here it is . . . generated by this:
    Code:
    #!/usr/bin/perl
    
    print "=keyword\n\nkeyworddarkb=\nkeyworddarka=\n\n";
    chomp($l = <>);
    $l =~ s/\s*\(?\s*$//;
    print "$l:\$(keyworddarkb):\$(keyworddarka)\n";
    
    while($l = <>) {
        chomp($l);
        $l =~ s/\(?\s*$//;
        if($l =~ /\/\*(.*)\*\// || $l =~ /(. Functions)/) {
            print "# $1\n";
        }
        elsif($l ne '') {
            print "$l:*:*\n";
        }
    }
    I call it rules/winapi on my machine. It's used like so:
    Code:
    $ ./codeform -f rules/cpp_vbb_1 -f rules/winapi ...
    It's untested BTW. Let me know if you have problems with it. The latest upload of codeform segfaults quite often for no apparent reason . . . not a good sign.

    [edit] Stupid attachments. You can't have a file with no extension . . . remove the .txt. [/edit]
    Last edited by dwks; 03-23-2007 at 07:00 PM.

  8. #8
    The superhaterodyne twomers's Avatar
    Join Date
    Dec 2005
    Location
    Ireland
    Posts
    2,273
    Might be worth putting in some documentation about the clipboard programs too so people can use your code formatter along with your clipboard thing.

  9. #9
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    tested it. it hangs. let it run all night just to see if it was slow or not working. 9pm to 7:30am is enough time to know that it crashed.

    first thought colons on math functions
    second thought two rule files
    third thought comments in file.
    fourth thought perl script messed up
    fifth thought multiple references of the same functions

    manually edit win api file retry with manually made rule file.

    other thoughts as to why it crashed?

    [edit]
    how large of .c file can it handle?
    [/edit]
    Last edited by kryptkat; 03-24-2007 at 05:26 AM.

  10. #10
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    care to try that again?

    here is the list sorted with the duplicates removed and the comments removed. it is ready for a perl script to turn it to a winapi rule file.

    note there is gdi in there hbitmap and bitblit i think all of it .

  11. #11
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    first thought colons on math functions
    second thought two rule files
    third thought comments in file.
    fourth thought perl script messed up
    fifth thought multiple references of the same functions
    Colons in function names would definitely mess codeform up -- they need to be escaped.
    Comments in file beginning with '#' are fine -- "/**/" comments are not. I think I got rid of them though.
    The Perl script may well have messed up.
    Multiple functions of the same name would cause the duplicates to be ignored (only the first one counts).

    [edit]
    how large of .c file can it handle?
    [/edit]
    At least 64 KB -- that's the largest I've tested -- but I know for a fact that its line length is theoretically unlimited (I've fed it 5,000,000-character lines on an old computer). If you find a larger source file, feel free to test it. There's no reason why a larger file would crash it, however.
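For anyone wondering how a line length can be "theoretically unlimited", here is a minimal sketch of the usual technique: grow the line buffer in BUFSIZ-sized steps until the newline shows up. read_long_line() is a hypothetical name, not a function from codeform, and codeform's read_file() may well be structured differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not codeform's read_file()): reads one line of
 * any length from fp, growing the buffer by BUFSIZ bytes at a time.
 * Returns a malloc()ed string (newline kept, if there was one) that the
 * caller must free(), or NULL at end of file or on allocation failure. */
char *read_long_line(FILE *fp) {
    char *line = NULL;
    size_t size = 0, len = 0;

    assert(fp != NULL);

    for (;;) {
        char *bigger = realloc(line, size + BUFSIZ);
        if (!bigger) {
            free(line);
            return NULL;
        }
        line = bigger;
        size += BUFSIZ;

        /* Append the next chunk after what has been read so far. */
        if (!fgets(line + len, (int)(size - len), fp)) {
            if (len == 0) {      /* nothing read at all: EOF or error */
                free(line);
                return NULL;
            }
            return line;         /* last line had no trailing newline */
        }

        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            return line;         /* complete line */
        }
    }
}
```

The buffer always ends up as a multiple of BUFSIZ, so forgetting the free() on the caller's side leaks the longest line's length rounded up to a multiple of BUFSIZ, which matches the shape of the leak described in the first post.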

    I expect the problems you encountered were due to a bug in the codeform I uploaded . . . I've re-uploaded it, having made the following changes last night (it was a late night...):
    • Changed: the line in add_var()
      Code:
      vars->data[pos]->from = malloc(eq - str);
      becomes
      Code:
      vars->data[pos]->from = malloc(eq - str + 1);
      Despite the reassuring comment, the previous version did have a buffer overflow and I suspect that was what was causing the segmentation faults.
    • free_onerule()'s free() goes inside the if statement -- might make codeform very slightly faster
    • Added
      Code:
      && x <= ruledata->data[y]->data.number
      to free_onerule()'s inner if statement -- it might do something on rare occasions.
    • (minor change) In add_rule, the return value of the call to add_allocated_rule() is saved and pr->freep is set to it.
    • Fixed yet another memory leak due to a bug in add_rule().
    • This line is added to free_dup_onerule() to get rid of a small memory leak when keywords of the same name were added:
      Code:
      free(rule->data.len);
    • Rewrote free_rulecdat() completely to get rid of a serious memory leak. It now looks like this:
      Code:
      void free_rulecdat(const struct rules_t *rules) {
          enum type_t x;
          size_t y;
      
          for(x = 0; x < TYPES; x ++) {
              if(rules->cdat[x]) {
                  for(y = 0; y < rules->cdat[x]->data.number; y ++) {
                      free(rules->cdat[x]->data.data[y]);
                  }
                  
                  free(rules->cdat[x]->data.data);
                  free(rules->cdat[x]->data.len);
                  free(rules->cdat[x]);
              }
          }
      }
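To spell out why the add_var() change at the top of this list matters: eq - str counts only the characters of the name before the '=', so a buffer of that size has no room for the terminating '\0'. The sketch below shows the same pattern in isolation. copy_var_name() is a hypothetical stand-in, not the real add_var(), and it assumes str and eq mean "start of the definition" and "position of the '='" as the snippet above suggests.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the allocation in add_var(): copies the
 * variable name that precedes '=' in a definition such as
 * "keyworddarkb=". Returns a malloc()ed string the caller must free(),
 * or NULL if there is no '=' or the allocation fails. */
char *copy_var_name(const char *str) {
    const char *eq;
    size_t n;
    char *from;

    assert(str != NULL);

    eq = strchr(str, '=');
    if (!eq) {
        return NULL;
    }

    n = (size_t)(eq - str);    /* characters in the name itself */
    from = malloc(n + 1);      /* + 1 for the terminating '\0' */
    if (!from) {
        return NULL;
    }

    memcpy(from, str, n);
    from[n] = '\0';            /* would write past the end with malloc(n) */
    return from;
}
```

With malloc(n) the final assignment writes one byte beyond the allocation, which is the kind of heap overflow that can sit unnoticed for a while and then produce seemingly random segmentation faults.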

    I did a few other things too, not worth mentioning here.

    Valgrind now reports no memory leaks for codeform! Even for rules/c_1_css. But it does report "invalid read"s, whatever that is. (Enlighten me?)

    I will indeed re-create the winapi file and test it this time (I'm on the right computer). I'll report back in a few minutes.

  12. #12
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Okay, download the latest version; rules/winapi is included in it. I tested it with this command line and it worked:
    Code:
    C:\>codeform -f rules\_html -f rules\c_1ext -f rules\winapi rules\winapi
    It increased the size of the .ZIP by a bit.

    [edit] For nicer colours try
    Code:
    C:\>codeform -f rules\_html -f rules\c_1 -f rules\winapi rules\winapi
    Here's a sample (with _vbb of course instead of _html):
    Code:
    =keyword
    
    keyworddarkb=
    keyworddarka=
    
    AbnormalTermination:$(keyworddarkb):$(keyworddarka)
    AbortDoc:*:*
    AbortPath:*:*
    AbortPrinter:*:*
    AbortProc:*:*
    ABORTPROC:*:*
    AbortSystemShutdown:*:*
    accept:*:*
    AcceptEx:*:*
    AccessNtmsLibraryDoor:*:*
    ACMDRIVERENUMCB:*:*
    ACMDRIVERPROC:*:*
    ACMFILTERCHOOSEHOOKPROC:*:*
    ACMFILTERENUMCB:*:*
    ACMFILTERTAGENUMCB:*:*
    ACMFORMATCHOOSEHOOKPROC:*:*
    ACMFORMATENUMCB:*:*
    ACMFORMATTAGENUMCB:*:*
    ...etc...
    [/edit]

    The executable included in the .ZIP is not compiled with my normal compiler but it should still work.

    The Linux executable included is ancient. If you're using Linux, just compile it yourself. Type "make clean" and then "make".
    Last edited by dwks; 03-24-2007 at 02:09 PM.

  13. #13
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
    If you find a larger source file, feel free to test it.
    <very big grin>

    was looking over your changes....good to know. i do not have linux.

    the winapi file is large and i hope it will add to the progs popularity. it has helped fix a few progs already. if i had realized you were going to run it through a prog to make the rule file i would have plucked cleaned stuffed and baked the file in the first place. thank you.

  14. #14
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I used another Perl program that basically takes codeform.c and copies it 100 times into another file. Here are the results:
    Code:
    timeit: 14285 ms
    
     Volume in drive [edited]
     Volume Serial Number is [edited]
    
     Directory of [edited]
    
    24/03/2007  03:59 PM         7,098,100 bigfile
    24/03/2007  03:59 PM        20,343,374 bigfile.htm
                   2 File(s)     27,441,474 bytes
                   0 Dir(s)      56,410,112 bytes free
    Not bad . . . A 7MB file was processed in 14.3 seconds (admittedly, it was in the background but this is a really fast computer).
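The test file itself is easy to reproduce without Perl. The following is a minimal sketch in C, not the program used for the numbers above; repeat_file() is a hypothetical helper that writes one stream to another a given number of times.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the entire contents of `in` to `out`
 * `times` times. Returns the total number of bytes written, or -1 on a
 * read or write error. `in` must be seekable, since rewind() is used. */
long repeat_file(FILE *in, FILE *out, int times) {
    char buf[BUFSIZ];
    long total = 0;
    int i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < times; i++) {
        size_t n;

        rewind(in);
        while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
            if (fwrite(buf, 1, n, out) != n) {
                return -1;
            }
            total += (long)n;
        }
        if (ferror(in)) {
            return -1;
        }
    }

    return total;
}
```

Opening codeform.c with fopen(..., "rb"), the output with fopen(..., "wb"), and calling repeat_file(in, out, 100) would produce a file like bigfile; the 7,098,100 bytes listed above are exactly 100 copies of a 70,981-byte source file.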

    <very big grin>

    was looking over your changes....good to know. i do not have linux.

    the winapi file is large and i hope it will add to the progs popularity. it has helped fix a few progs already. if i had realized you were going to run it through a prog to make the rule file i would have plucked cleaned stuffed and baked the file in the first place. thank you.
    Thank you for the list. All I did was write a program to parse it and tested and uploaded it . . . you did all of the work.

    Sorry that the change list is not very legible. I wrote it as I was coding and never intended to put it on the internet or anything.

  15. #15
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    DWKS, perhaps look into the <pre></pre> tags and keep tabs, rather than converting them to spaces?

    Think about it: it's an extra 3 bytes per tab, and it all adds up. Not to mention that keeping tabs makes the highlighted source easier to cut and paste with its original formatting intact.

    Anyway, just thought I'd comment.
