Thread: Design + HTML

  1. #1
    Registered User
    Join Date
    Nov 2002
    Posts
    491

    Question Design + HTML

    Background:
    I'm creating a program that will download an HTML document from, theoretically, any location (via plugins) and convert the HTML to an HTML-like XML format.

    Question:
    Should I create my own HTML library for converting to the XML? In theory, this program should be pretty portable (just to various *NIX's right now). If not, is there a pretty good and portable HTML library out there which will help my conversion process?

    Also, in OpenBSD functions are given a _ in front of them; not so in Linux (the one I use, at least). I noticed that some configure scripts check whether functions start with a _ or not. How do I deal with a function starting with a _? Would it be something like

    #ifdef HAS_UNDERSCORE
    whatever = dlsym(handle, "_yadda");
    #else
    whatever = dlsym(handle, "yadda");
    #endif

    Or make a function that is a wrapper for dlsym, uses something like #ifdef HAS_UNDERSCORE to add an underscore, and then calls dlsym()?
    Or is there a better way to do this?
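
    A minimal sketch of the wrapper idea above, assuming a configure script defines HAS_UNDERSCORE on platforms that prefix symbols with an underscore. The names SYM_PREFIX, portable_symbol_name and portable_dlsym are made up for illustration; only dlsym() and snprintf() are real library calls.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* HAS_UNDERSCORE is the hypothetical macro from the post; the build
   system is assumed to define it where symbols carry a leading '_'. */
#ifdef HAS_UNDERSCORE
#define SYM_PREFIX "_"
#else
#define SYM_PREFIX ""
#endif

/* Write the platform-specific symbol name into buf.
   Returns 0 on success, -1 if buf is too small. */
int portable_symbol_name(char *buf, size_t size, const char *name)
{
    int n = snprintf(buf, size, "%s%s", SYM_PREFIX, name);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Drop-in replacement for dlsym() that adds the prefix when needed.
   Returns NULL if the name does not fit or the symbol is not found. */
void *portable_dlsym(void *handle, const char *name)
{
    char buf[256];

    if (portable_symbol_name(buf, sizeof buf, name) != 0)
        return NULL;
    return dlsym(handle, buf);
}
```

    With this in place the calling code always writes portable_dlsym(handle, "yadda") and the #ifdef lives in one spot instead of at every lookup.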

    Any help you can offer is greatly appreciated. I'm trying to get a good design before I start programming so that hopefully this project will go quickly.

    Thanks.

  2. #2
    Me no make sense of program design you are be do...???
    My Avatar says: "Stay in School"

    Rocco is the Boy!
    "SHUT YOUR LIPS..."

  3. #3
    Confused Magos's Avatar
    Join Date
    Sep 2001
    Location
    Sweden
    Posts
    3,145

    Re: Design + HTML

    Originally posted by orbitz
    Question:
    Should I create my own HTML library for converting to the XML?
    Sure!
    MagosX.com

    Give a man a fish and you feed him for a day.
    Teach a man to fish and you feed him for a lifetime.

  4. #4
    Registered User
    Join Date
    Nov 2002
    Posts
    491
    Originally posted by OneStiffRod
    Me no make sense of program design you are be do...???
    Ok..I'll try to spell it out for you
    Question 1)
    Do you know of a good UNIX HTML library that would be helpful in converting HTML to a form of XML?

    Question 2)
    When dealing with the dl* functions and .so's in UNIX, some OS's put a _ in front of the function names on compile/link. Do you know of a good way to deal with this to make my code easily portable?


    I hope that helps.

  5. #5
    1) Hmm... I understand better, thanks, but it still doesn't make sense to me - HTML and XML are pretty much similar - there is an XHTML out there already. I think you would need to incorporate an XML parser into your code for this to work.

    Try www.sourceforge.net; they host a lot of projects like this and you can probably find what you need.

    It would make more sense if you explained the reasoning behind why you want or need to do this - what is the end goal??

    2) I guess you could create a macro for this: call your macro in your code and #define UNIX or WIN32 or something. The macro can sort out which function to call based on the #defined OS.

  6. #6
    Registered User
    Join Date
    Nov 2002
    Posts
    491
    HTML doesn't necessarily equal XML. I need to convert HTML into a form of XML that another program understands. I think all I need is a way to take a stream and get the tags and attributes and whatnot so I can convert to the XML equivalent. Maybe I could just do all of this with a DOM XML parser??
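
    To make "take a stream and get the tags" concrete, here is a rough sketch of a tag scanner in plain C. The function name next_tag and its signature are invented for this example, and it is deliberately naive: it ignores attributes, does not treat comments or scripts specially, and is thrown off by a '>' inside a quoted attribute value. A real HTML parser library handles those cases.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Find the next tag at or after *pos in a NUL-terminated HTML string.
   Copies the lowercased tag name into name (at most size - 1 chars),
   sets *is_close to 1 for a closing tag such as </b>, and advances
   *pos past the tag. Returns 1 if a tag was found, 0 otherwise. */
int next_tag(const char **pos, char *name, size_t size, int *is_close)
{
    const char *p;

    if (size == 0)
        return 0;

    for (p = strchr(*pos, '<'); p != NULL; p = strchr(p + 1, '<')) {
        const char *q = p + 1;
        const char *end;
        size_t n = 0;

        *is_close = (*q == '/');
        if (*is_close)
            q++;
        if (!isalpha((unsigned char)*q))
            continue;               /* e.g. "<!" or a stray '<' */

        while (isalnum((unsigned char)*q) && n + 1 < size)
            name[n++] = (char)tolower((unsigned char)*q++);
        name[n] = '\0';

        /* Skip the rest of the tag (attributes are ignored here). */
        end = strchr(q, '>');
        *pos = end ? end + 1 : q + strlen(q);
        return 1;
    }
    return 0;
}
```

    Calling it in a loop yields one tag name per call, which is roughly the event-style view of the document, as opposed to building a full tree first.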

  7. #7
    Look up SAX - under www.w3c.org
    Also, look here:
    http://www.jezuk.co.uk/cgi-bin/view/arabica

    I think you still need the DOM style XML parser - you need some sort of parser that will break down the tags for you.

  8. #8
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Hmm, well, let me know if you don't find a converter, I'd be more than glad to $tamp one out for you.

    And as for the defs, just a little clarification. When you see functions defined like that, it just means that either the code behind one of them may not work on your machine (OS specific) or else it's just a downgraded function (the one without the underscore is considered "newer and better"). But the whole point is making it portable, so you shouldn't have to do anything but define something:

    #define WIN_32 /* you put this here if running windows */
    #define SOME_OTHER /* this if the other, etc... */

    // the compiler has the following...

    #ifdef WIN_32
    #define message_function _message_function
    #endif
    #ifdef SOME_OTHER
    #define message_function __message_function
    #endif

    But there could be other reasons too. Could you post the defines?
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  9. #9
    Registered User
    Join Date
    Nov 2002
    Posts
    491
    Originally posted by Sebastiani
    Hmm, well, let me know if you don't find a converter, I'd be more than glad to $tamp one out for you.

    And as for the defs, just a little clarification. When you see functions defined like that, it just means that either the code behind one of them may not work on your machine (OS specific) or else it's just a downgraded function (the one without the underscore is considered "newer and better"). But the whole point is making it portable, so you shouldn't have to do anything but define something:

    #define WIN_32 /* you put this here if running windows */
    #define SOME_OTHER /* this if the other, etc... */

    // the compiler has the following...

    #ifdef WIN_32
    #define message_function _message_function
    #endif
    #ifdef SOME_OTHER
    #define message_function __message_function
    #endif

    But there could be other reasons too. Could you post the defines?

    Well, the underscore thing comes from using the dl* functions in *NIX. In OpenBSD a function in the executable looks like _somefunction, and in Linux it looks like somefunction, so when I use dlsym() to access the function I need an efficient means of handling the OpenBSD and Linux versions of things.

    OpenBSD is not the only OS to put a _ in front (I'd assume it depends more on the version of gcc/ld ???).

    Also, I think I'm just going to use DOM to parse the HTML and then convert the tree to XML as I go along; that should work well, right? I hear DOM is more memory intensive, but a lot cleaner than SAX.
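
    As a rough illustration of the "convert the tree to XML" step, the sketch below walks an already-built tree and writes well-formed XML. The struct node type is a stand-in invented for this example, not the API of any real DOM library, and attributes are left out to keep it short.

```c
#include <stdio.h>

/* Hypothetical minimal DOM node; a real parser has its own types. */
struct node {
    const char *tag;           /* element name, or NULL for a text node */
    const char *text;          /* text content when tag is NULL */
    const struct node *child;  /* first child, or NULL */
    const struct node *next;   /* next sibling, or NULL */
};

/* Write text with the characters XML reserves in content escaped. */
static void put_escaped(FILE *out, const char *s)
{
    for (; *s != '\0'; s++) {
        switch (*s) {
        case '<': fputs("&lt;", out);  break;
        case '>': fputs("&gt;", out);  break;
        case '&': fputs("&amp;", out); break;
        default:  fputc(*s, out);      break;
        }
    }
}

/* Serialize n and its following siblings, recursing into children.
   Childless elements such as <br> become self-closing tags. */
void write_xml(FILE *out, const struct node *n)
{
    for (; n != NULL; n = n->next) {
        if (n->tag == NULL) {
            put_escaped(out, n->text);
        } else if (n->child == NULL) {
            fprintf(out, "<%s/>", n->tag);
        } else {
            fprintf(out, "<%s>", n->tag);
            write_xml(out, n->child);
            fprintf(out, "</%s>", n->tag);
        }
    }
}
```

    The memory cost mentioned above comes from the whole tree existing before write_xml() runs; a SAX-style design would emit output as each tag is seen instead.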
