Understanding the pseudocode/logic for this program


  1. #1
    Registered User
    Join Date
    Sep 2006
    Posts
    10

    Understanding the pseudocode/logic for this program

    Hello. I need to write a program that reads in an .html file and prints out only the HTML tags, in the order they appear. The catch is that the program must also check that the tags are balanced; if they aren't, an error message needs to be printed. I *must* use a stack and queues for this. I have a basic understanding of how the program would work, but I'm slightly thrown off, especially by the queue.

    Originally I thought the logic would be along the lines of: open the HTML file, scan through until a "<" is found, add that tag to the stack and continue until the matching closing tag is found; if found, remove it from the stack and print it to the output. However, it has to be printed as it would show on a webpage, i.e.:

    <head>
    <title>
    </title>
    </head>
    <body>
    <h1>
    </h1>
    </body>

    so the method I'd use wouldn't work, as the head and body tags are separated. I know that is where the queue plays its role, but I just can't think of how I'd incorporate it. Any help or suggestions would be greatly appreciated. (And please, no actual code, just pseudocode or general tips.)

  2. #2
    Mad OnionKnight's Avatar
    Join Date
    Jan 2005
    Location
    Umeå, Sweden
    Posts
    555
    so the method I'd use wouldn't work as the head and body tags are separated,
    I don't see why that would cause any problems. You don't have to terminate just because the stack became empty; quit when you've reached the end of the file instead.

    The stack would be like

    head
    head -> title
    head -> title -> /title -> /head
    The last element is the closing tag for the first element, so check that the first element matches the closing last element, that the second element matches the closing second-to-last element, and so on until you hit the middle of the stack. Then go through the stack and output all the tags, clear the stack, and start over until you reach the end of the file.

    [EDIT] Oops, screwed up pretty bad. I blame the clock for being 5 AM and the coffee making my stomach go bad.
    Last edited by OnionKnight; 12-07-2006 at 09:06 PM.

  3. #3
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Can you use two queues, one for the output and one to feed a stack for balance checking?

    So you would read the tags in, checking validity, and add each one to both queues.

    Then, as Onion suggested, check the balance with the stack: take tags from queue 1 and keep pushing them until you find a closing tag.
    When you do, the top of the stack should be the matching opening tag; pop it if it is, and if it isn't, there's an error.
    Keep going until queue 1 is exhausted. If the stack is empty as well, the tags are balanced.

    Take each tag out of queue 2 and print them as they come, and that's the order you got them in.
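
    For anyone reading along later, the stack half of this could be sketched in C roughly as follows. It is only a sketch under assumptions: the tags are assumed to be already reduced to bare names with a leading '/' on closing tags, the plain array stands in for queue 1, and tags_balanced and MAX_DEPTH are made-up names, not anything from the thread.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 64   /* assumed nesting limit for this sketch */

/* Returns 1 if every closing tag matches the most recent unclosed
 * opening tag and nothing is left open at the end, 0 otherwise.
 * tags[] plays the role of queue 1: names in file order, with a
 * leading '/' on closing tags (e.g. "head", "/head"). */
int tags_balanced(const char *tags[], size_t n)
{
    const char *stack[MAX_DEPTH];
    size_t top = 0;                      /* number of items on the stack */

    for (size_t i = 0; i < n; i++) {
        if (tags[i][0] != '/') {         /* opening tag: push */
            if (top == MAX_DEPTH)
                return 0;                /* too deep for this sketch */
            stack[top++] = tags[i];
        } else {                         /* closing tag: must match top */
            if (top == 0 || strcmp(stack[top - 1], tags[i] + 1) != 0)
                return 0;
            top--;                       /* pop */
        }
    }
    return top == 0;                     /* leftovers mean unclosed tags */
}
```

    Note that the stack going empty between </head> and <body> is not an error here; only a mismatched closing tag or leftovers at the end are. Printing the tags in order is then just a walk over the same sequence (queue 2).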

  4. #4
    CSharpener vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    HTML in general makes it possible to mix (interleave) tags,
    and everything should be inside <html></html> tags.

    For example:
    <html><head></head><body><H1></H1><b>Test <i>of</b> HTML</i></body></html>
    The first 90% of a project takes 90% of the time,
    the last 10% takes the other 90% of the time.

  5. #5
    CSharpener vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    P.S. Don't forget about attributes:
    <body backgroundcolor=0xFF></body>

    At the very least, mention in the release notes that you don't support them, if you don't want to handle them.
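
    One way to keep attributes from getting in the way of matching is to compare only the tag names. A minimal sketch, assuming the tag text has already been isolated from '<' to '>'; tag_name is a hypothetical helper, and lowercasing is added here because HTML tag names are case-insensitive:

```c
#include <ctype.h>
#include <stddef.h>

/* Copies the name of a tag into out, skipping '<' and an optional '/',
 * and stopping at the first whitespace, '/' or '>', so attributes are
 * dropped. Letters are lowercased so <H1> and </h1> compare equal.
 * Returns the length of the name (truncated to fit outsize). */
size_t tag_name(const char *tag, char *out, size_t outsize)
{
    size_t n = 0;

    if (outsize == 0)
        return 0;
    if (*tag == '<')
        tag++;
    if (*tag == '/')
        tag++;
    while (*tag != '\0' && *tag != '>' && *tag != '/' &&
           !isspace((unsigned char)*tag) && n + 1 < outsize) {
        out[n++] = (char)tolower((unsigned char)*tag);
        tag++;
    }
    out[n] = '\0';
    return n;
}
```

    With this, <body backgroundcolor=0xFF> and </body> both reduce to "body", so the closing tag can be compared against whatever is on top of the stack.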

  6. #6
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Also don't forget about tags which close at the same time they begin:
    Code:
    <img src="foo" />
    Like so.


    Quzah.
    Hope is the first step on the road to disappointment.

  7. #7
    CSharpener vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    That is the XML/XHTML style.
    Plain HTML allows such tags to be written like
    <br>

    without a closing tag and without a / in it.
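
    Both cases (the <img ... /> form and the bare <br> form) mean the tag should never be pushed on the stack. A rough sketch of telling the cases apart, with the caveat that void_tags is a deliberately partial list and classify_tag and the enum names are made up for illustration:

```c
#include <string.h>

enum tag_kind { TAG_OPEN, TAG_CLOSE, TAG_STANDALONE };

/* A partial list of HTML elements that never take a closing tag. */
static const char *const void_tags[] = {
    "br", "hr", "img", "input", "link", "meta"
};

/* tag:  full tag text from '<' to '>'
 * name: its lowercase name without attributes (assumed to be
 *       extracted elsewhere) */
enum tag_kind classify_tag(const char *tag, const char *name)
{
    size_t len = strlen(tag);

    if (len >= 2 && tag[1] == '/')
        return TAG_CLOSE;                   /* </name> */
    if (len >= 3 && tag[len - 2] == '/')
        return TAG_STANDALONE;              /* <name ... /> */
    for (size_t i = 0; i < sizeof void_tags / sizeof void_tags[0]; i++)
        if (strcmp(name, void_tags[i]) == 0)
            return TAG_STANDALONE;          /* <br>, <img ...> */
    return TAG_OPEN;
}
```

    Only TAG_OPEN gets pushed and only TAG_CLOSE triggers the match-and-pop check; TAG_STANDALONE tags go straight to the output.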

  8. #8
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Is it only <b> and <i> that may be intertwined? (I'm guessing there's a <u> as well.)
    I assume these are bold and italic, going by vart's earlier post.
    If they are, do they even have to be closed?
    If not, I wouldn't even read them like the other tags.

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    10
    Thanks for the reminder about the tags that don't close, or that close at the same time they open. Those hadn't even crossed my mind.

  10. #10
    Registered User
    Join Date
    Nov 2006
    Posts
    65
    I am not an HTML type of guy, but from what I know tags are consistent: a tag is opened as [tag] and closed with the same name as [/tag], the only difference being the /. So you can pair them with ease by scanning through the HTML code and looking for an opening [name and a closing [/name. (You will need a way to handle the same tag name being used more than once in a file; I have a few ideas in my head, but it doesn't help you if I just blurt them out.) If my knowledge is correct, HTML never uses / more than once in a tag. It's just an idea and maybe a place to start.
    You rant and rave about it, but at the end of the day, it doesn't matter if people use it as long as you don't see.
    People are free to read the arguments, but if the only way for you to discover gravity is by jumping off a cliff, then that is what you're going to have to experience for yourself.
    Eventually, this "fast and loose" approach of yours will bite you one too many times, then you'll figure out the correct way to do things. - Salem

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    10
    Alright, thanks for all the help with the logic, everyone. I appreciate it a lot.

    I have one last question and then I'll be on my way (our book gives some great examples of stacks and queues being used, so I should be good on that part). Opening the HTML file is easy; the next part is proving to be a pain in the ass. As Onion and Sl4nted pointed out, storing the tags in a queue and then using a stack for balance checking is a great way to go about it, but I'm having a few issues getting the tags into the queue in the first place. Is fgets the best way to go about scanning in the tags?

  12. #12
    CSharpener vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    Is fgets the best way to go about scanning in the tags?
    I don't think so... HTML does not have to contain newline characters. Everything can be on one line.
    Line breaks are treated as ordinary whitespace, except inside a <pre> tag.
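
    Since lines mean nothing here, one alternative to fgets is to read a character at a time with fgetc and cut the input at '<' and '>' instead. A sketch of that idea, assuming no '>' appears inside attribute values or comments; next_tag is a hypothetical helper name:

```c
#include <stdio.h>

/* Reads the next tag from fp into buf, including '<' and '>'.
 * Text outside tags is skipped. Returns 1 on success, 0 at end of
 * file (also 0 if the file ends in the middle of a tag). Tags longer
 * than the buffer are truncated but still consumed up to '>'. */
int next_tag(FILE *fp, char *buf, size_t bufsize)
{
    int c;
    size_t n = 0;

    if (bufsize < 3)
        return 0;
    while ((c = fgetc(fp)) != EOF && c != '<')
        ;                                   /* skip text between tags */
    if (c == EOF)
        return 0;
    buf[n++] = '<';
    while ((c = fgetc(fp)) != EOF && c != '>') {
        if (n + 2 < bufsize)                /* leave room for '>' and '\0' */
            buf[n++] = (char)c;
    }
    if (c == EOF)
        return 0;
    buf[n++] = '>';
    buf[n] = '\0';
    return 1;
}
```

    Calling it in a loop until it returns 0 yields the tags in file order, whether the file is one long line or many, and each result can be added to the queue as it comes.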
