Thread: C Programmer, please help me!!!

  1. #1
    Registered User
    Join Date
    May 2003
    Posts
    4

    C Programmer, please help me!!!

    Hi!!!

    I need to change a "very simple" behavior of an open-source C program, but I have no experience with the C language.

    Ok man, let's go!

    The program is "HTML Tidy" and you can obtain the source code
    and reference material respectively at:

    - http://tidy.sourceforge.net/src/tidy_src.tgz

    - http://tidy.sourceforge.net

    This is a great program to tidy and format your dirty html.

    The problem is that it doesn't work if it encounters unknown elements (tags)
    in the HTML. You must declare these unknown
    elements (tags) for it to work properly, which isn't a good idea in our XML days.

    Note: there is a file called "parser.c" in the src dir; the solution might be there. A piece of code extracted from it:
    Code:
            /* ignore unknown start/end tags */
            if ( node->tag == NULL )
            {
                ReportWarning( doc, element, node, DISCARDING_UNEXPECTED );
                FreeNode( doc, node );
                continue;
            }
    So I need help suppressing this unwanted behavior, making it accept any
    unknown elements (tags).

    I believe it isn't hard to change this piece of code to accomplish that, but
    as I said, I'm ignorant of the C world.

    Can you understand me???


    Thanks in advance,

    marcoBR

    btw, sorry 4 my english...
    Last edited by marcoBR; 05-24-2003 at 12:28 PM.

  2. #2
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    >Thus I need help to suppress this stupid behavior, making it accept any
    unknown elements(tags).
    And what were you planning on doing with these unknown elements? There's a good reason for that 'stupid behavior': you can't reasonably expect the program to handle unknown entities intelligently, so you assume that they shouldn't be there and report an error. This is how compilers work as well: if the compiler finds something that it doesn't recognize, it doesn't just leave it there and keep going, as it's probably erroneous.
    My best code is written with the delete key.

  3. #3
    Registered User
    Join Date
    May 2003
    Posts
    4
    Hey, at least it could have an option to permit unknown elements... it would be very useful.

    Assume you don't know what the elements are, e.g. in a third-party XHTML or XML file. How would you register these elements then???

    For maintaining beautiful code with all elements registered it's a valid approach, but it lacks functionality, if you can understand me, I hope.

    The question is: would it be so hard to change it to accept any unknown elements??!?
    Last edited by marcoBR; 05-24-2003 at 01:49 PM.

  4. #4
    Been here, done that.
    Join Date
    May 2003
    Posts
    1,164
    My take is you wish to simply ignore all unknown tags and continue processing. But what the program does is for each unknown tag it displays a warning and removes the tag (and the data between I assume). Is this correct?

    Walt

  5. #5
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    >The case is: Would be so hard to change it to accept any unknown elements??!?
    It depends on what you consider hard. With little knowledge of C I would suggest finding another pretty printer that handles XML.

  6. #6
    Registered User
    Join Date
    May 2003
    Posts
    4
    Excellent WaltP, you understood me perfectly!

    >But what the program does is for each unknown tag it displays a warning and removes the tag (and the data between I assume)
    Worse than that, it displays an 'error message' and produces a blank file as output. If you use the config option --force-output, it removes the unknown tags and all their contents.

    It seems that HTML Tidy, at the moment, insists on beauty, wanting to format even the tags that are "unknown" to it.

    Is there any way to change the code, so that it will NOT
    discard tags it doesn't recognize, but it will still Tidy what it does
    recognize?

    Would it be so hard to change it to accept any unknown tags, leaving their beautification aside?!?!?
    Last edited by marcoBR; 05-24-2003 at 12:33 PM.

  7. #7
    Been here, done that.
    Join Date
    May 2003
    Posts
    1,164
    Originally posted by marcoBR
    Is there any way to change the code, so that it will NOT
    discard tags it doesn't recognize, but it will still Tidy what it does
    recognize?
    Of course.

    Not knowing the code, I would assume if it finds '<' it assumes a tag. It then tests the next few characters and if they are not in a list of accepted values, the error is displayed and the rest of the tag is read and tossed (or worse).

    What you need to do is after comparing the next few characters, if they don't match known tags, output what you just read and simply continue without going thru the "tidy" portion of the code.
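
    The idea above can be sketched roughly as follows. This is not HTML Tidy's code: known_tags, is_name_char, is_known_tag and filter_tags are hypothetical names made up for illustration, and the "tidy" step is reduced to lower-casing the tag name. Known tags get tidied; unknown tags are copied through untouched.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of "known" tags (an assumption for this sketch). */
static const char *const known_tags[] = { "p", "b", "i", "div", "span" };

/* Characters accepted as part of a tag name in this sketch. */
static int is_name_char(int c)
{
    return isalnum(c) || c == ':' || c == '-' || c == '_';
}

/* Case-insensitive lookup of name[0..len) in known_tags. */
static int is_known_tag(const char *name, size_t len)
{
    size_t i, j;
    for (i = 0; i < sizeof known_tags / sizeof known_tags[0]; i++) {
        if (strlen(known_tags[i]) != len)
            continue;
        for (j = 0; j < len; j++)
            if (tolower((unsigned char)name[j]) != known_tags[i][j])
                break;
        if (j == len)
            return 1;
    }
    return 0;
}

/*
 * Copy `in` to `out` (capacity `cap` bytes, including the terminator).
 * Known tags get a stand-in "tidy" step: their name is lower-cased.
 * Unknown tags and plain text are copied verbatim instead of discarded.
 * Returns 0 on success, -1 if `out` is too small.
 */
int filter_tags(const char *in, char *out, size_t cap)
{
    size_t o = 0;

    if (cap == 0)
        return -1;

    while (*in) {
        const char *end = (*in == '<') ? strchr(in, '>') : NULL;

        if (end != NULL) {
            const char *name = in + 1;
            size_t len = 0, k, taglen = (size_t)(end - in) + 1;
            int known;

            if (*name == '/')           /* closing tag: skip the slash */
                name++;
            while (is_name_char((unsigned char)name[len]))
                len++;
            known = len > 0 && is_known_tag(name, len);

            if (o + taglen >= cap)
                return -1;
            for (k = 0; k < taglen; k++) {
                char c = in[k];
                /* only the name of a known tag is rewritten */
                if (known && in + k >= name && in + k < name + len)
                    c = (char)tolower((unsigned char)c);
                out[o++] = c;
            }
            in = end + 1;
            continue;
        }

        if (o + 1 >= cap)
            return -1;
        out[o++] = *in++;               /* ordinary text */
    }
    out[o] = '\0';
    return 0;
}
```

    Calling filter_tags("<P>hi <foo>x</foo></P>", buf, sizeof buf) with a large enough buf leaves "<p>hi <foo>x</foo></p>" in buf: the known tag is tidied and the unknown one passes through. The real parser.c works on a node tree rather than raw characters, so the actual patch would look different.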
    Definition: Politics -- Latin, from
    poly meaning many and
    tics meaning blood sucking parasites
    -- Tom Smothers

  8. #8
    Pursuing knowledge confuted
    Join Date
    Jun 2002
    Posts
    1,916
    WaltP - He doesn't know C. He's looking for someone to write code for him...which probably isn't very likely.
    Away.

  9. #9
    Registered User
    Join Date
    May 2003
    Posts
    4
    I'm still needing this patch... any help?!??!

  10. #10
    Been here, done that.
    Join Date
    May 2003
    Posts
    1,164
    Originally posted by marcoBR
    I'm still needing this patch... any help?!??!
    We've tried to help, but you won't give us the information we need to help. I said in my last post:
    Not knowing the code, I would assume...
    Maybe you don't realize it, but no one can give you what you want without knowing what you have. You're asking us to rewrite your program without knowing the program. I gave you what help I could; it's now up to you.

    And as confuted implied, we're not going to write the program for you.

  11. #11
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Vacuous 5 month old post bump - closed
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
