Thread: HTML2ASCII Conversion

  1. #1
    Registered User
    Join Date
    Jul 2002
    Posts
    913

    HTML2ASCII Conversion

    I'm trying to write some functions that take valid HTML character codes and turn them into ASCII. Right now I have:

    Code:
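    #include <stdlib.h> /* atoi */
    #include <string.h> /* strrchr, strcmp */
    
    int htmlcode2char(char data[]) {
        char value[4] = { '\0' };
        int x = 0;
        char *spot = strrchr(data, ';'); /* end of the "&#NNN;" code */
        int y = 2; /* skip the leading "&#" */
        
        if(spot == NULL)
            return 0; /* no terminating ';' */
        
        /* Copy the digits between "&#" and ';' (three at most, so value[] stays terminated). */
        while(&data[y] < spot && x < 3) {
            value[x] = data[y];
            
            ++x;
            ++y;
        }
        
        return ( atoi(value) );
    }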
    
    int html2char(char data[]) {
        int key;
        
        if( strcmp(data, "<P>") == 0 )
            key = '\n';
        else if( strcmp(data, "&#32;&#32;&#32;&#32;&#32;") == 0 )
            key = '\t';
        else
            key = htmlcode2char(data);
            
        return key;
    }
    This works for single HTML characters, but what about strings? How could I take an HTML string and return its ASCII equivalent? Searching every bit seems slow (since I would have to find a &#, then a number, and then a ;, which could be there by coincidence). Anyone have a better idea?
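
    One possible way to handle whole strings is a single left-to-right pass that only treats "&#", digits, and ";" as a code when all three parts are present, and copies everything else through. The sketch below assumes that approach; html_decode and its parameters are made-up names, not anything from the code above, and it only accepts codes in the ASCII range.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): decode "&#NNN;" references
 * in src into dest. Anything that is not a well-formed reference is
 * copied through unchanged. Returns the number of characters written. */
size_t html_decode(const char *src, char *dest, size_t destsize)
{
    size_t out = 0;

    if (destsize == 0)
        return 0;

    while (*src != '\0' && out + 1 < destsize) {
        if (src[0] == '&' && src[1] == '#' && isdigit((unsigned char)src[2])) {
            char *end;
            long code = strtol(src + 2, &end, 10);

            /* Accept only a terminated reference in the ASCII range. */
            if (*end == ';' && code > 0 && code < 128) {
                dest[out++] = (char)code;
                src = end + 1;
                continue;
            }
        }
        dest[out++] = *src++; /* ordinary character, or a stray '&' */
    }
    dest[out] = '\0';
    return out;
}
```

    Each input character is looked at a bounded number of times, so the scan stays linear in the length of the string, and a "&#" that is only there by coincidence is simply copied.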

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Personally, I'd do it like so:

    scan for a <
    read into a buffer until you find a >
    compare the buffer contents with a table of valid strings.

    Additionally, you may want to consider parsing the first "word" off of the buffer, and then just compare that. That would likely be the best bet.
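
    A rough sketch of those steps, assuming a small lookup table: the table entries, the 16-byte buffer, and the name match_tag are all made up for illustration. It also cuts the buffer at the first space so only the first "word" of the tag is compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: tag text (without the angle brackets) and the
 * character it stands for. The entries are assumptions for illustration. */
struct tag_entry {
    const char *name;
    int ch;
};

static const struct tag_entry tag_table[] = {
    { "P",  '\n' },
    { "BR", '\n' },
};

/* Look at the tag starting at s (which must point at '<'). On a match,
 * store the character in *ch and return a pointer just past the '>'.
 * Return NULL if the tag is unterminated, too long, or not in the table. */
const char *match_tag(const char *s, int *ch)
{
    char buf[16];
    const char *close = strchr(s, '>');
    size_t len, i;

    if (*s != '<' || close == NULL)
        return NULL;

    len = (size_t)(close - s) - 1;      /* characters between '<' and '>' */
    if (len >= sizeof buf)
        return NULL;
    memcpy(buf, s + 1, len);
    buf[len] = '\0';
    buf[strcspn(buf, " \t")] = '\0';    /* keep only the first "word" */

    for (i = 0; i < sizeof tag_table / sizeof tag_table[0]; ++i) {
        if (strcmp(buf, tag_table[i].name) == 0) {
            *ch = tag_table[i].ch;
            return close + 1;
        }
    }
    return NULL;
}
```

    The comparison here is case-sensitive, so "<p>" would not match unless the buffer is upper-cased first or the table gets more entries.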

    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    It's not used a lot, but what about the "HTML ASCII" codes? I mean, they could have "A" or "&#65;". I might be a perfectionist here, but I can't come up with a good way to do it.

  4. #4
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    I started to work on the opposite function (I still think there must be a better way):

    Code:
    int char2html(int data) {
    	if(data > 31 && data < 128) { 
    		switch(data) {	          
    			/* Invalid data. */
    			case '\0':
    				return 0; 
    			
    			/* Escape characters. */
    			case '\n':
    				return "<P>";
    			case '\t':
    				return "&#32;&#32;&#32;&#32;&#32;";
    			
    			/* Normal ASCII. */
    			case ' ':
    				return "&#32;";
    			case '<':
    				return "&#60;";
    			case '>':
    				return "&#62;";
    		}
    	} else
    		return 0;
    
    	return data;
    }
    I get a lot of warnings on the returns, why? If I put a * in front of it, it compiles with no warnings. What is the problem? I can do int someVar = "<P>", so why can't I return the same kind of int?

  5. #5
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    "<" is not an integer. It is a string. Your function returns integers.

    Quzah.
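
    One way to make those returns well-typed, sketched under the assumption that the caller is happy to receive a string: declare the function as returning a pointer to char instead of int. The name html_escape_for and the use of NULL for "no replacement" are choices made for this example, not part of the code above.

```c
#include <stddef.h>

/* Hypothetical variant of char2html: the return type is a pointer to a
 * string, so returning string literals is well-typed. NULL means "no
 * special replacement; emit the character itself". */
const char *html_escape_for(int c)
{
    switch (c) {
    case '\n': return "<P>";
    case '\t': return "&#32;&#32;&#32;&#32;&#32;";
    case ' ':  return "&#32;";
    case '<':  return "&#60;";
    case '>':  return "&#62;";
    default:   return NULL;
    }
}
```

    Putting a * in front of the literal silences the warning only because *"<P>" is the single character '<', which does fit in an int; it no longer returns the whole string.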

  6. #6
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    I changed it a little so it doesn't have to return the data, but now it doesn't always work.

    Code:
    int char2html(int data, char *dest) {
    	if(data > 31 && data < 128) { /* Check for valid data. */
    		switch(data) {
    			/* Escape characters. */
    			case '\n':
    				strcpy(dest, "<P>");
    				break;
    			case '\t':
    				strcpy(dest, "&#32;&#32;&#32;&#32;&#32;");
    				break;
    				
    			/* Normal ASCII. */
    			case ' ': /* Attempt to keep formatting. */
    				strcpy(dest, "&#32;");
    				break;
    			case '<': /* Can cause problems. */
    				strcpy(dest, "&#60;");
    				break;
    			case '>': /* Can cause problems. */
    				strcpy(dest, "&#62;");
    				break;
    		}
    	} else 
    		return 1;
    
    	return data;
    }
    It works fine if I give it something like '<', but a '\n' does nothing. Why?

  7. #7
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You don't really need that if check. Just use a standard switch. Have a default case that returns 1. Additionally, you should explain what exactly isn't working. Just saying "It doesn't work some times." doesn't tell us what isn't working. It's too vague. If you learn one lesson in programming, it's to be specific.

    Quzah.

  8. #8
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    It works fine if I give it something like '<', but a '\n' does nothing. Why?
    If I run this I just get " Test". I should get "<P> Test". Why doesn't it print anything at all?

    Code:
    int main() {
        char temp[50] = { '\0' };
        char2html('\n', temp);
        
        printf("%s Test\n", temp);
    
    	return 0;
    }
    And I can't just have one default: there are normal letters that are identical, and then there are some ASCII chars that aren't in HTML. I have to attempt to find them.

  9. #9
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    I stand corrected. You were specific, I just didn't see it.

    Here's your problem:

    printf("%d", '\n' );

    Like I said, remove the if check and just use a default case.

    Quzah.

  10. #10
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    I don't see how printf("%d", '\n' ); helps me.

  11. #11
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Originally posted by mart_man00
    I don't see how printf("%d", '\n' ); helps me.
    If you had actually printed it, you would have seen. But alas, I must explicitly explain my every comment to everyone:

    printf("%d", '\n' );

    This displays the number 10.

    Look at your if check:

    if(data > 31 && data < 128) { /* Check for valid data. */

    Now, I don't know about you, but to me, 10 seems to be less than 31. To explain that further, the if check fails. To take that another step further, since the if check fails, the switch is never executed. Next, since the switch is never executed, the case statement never executes. Finally, since the case statement never executes, nothing is ever copied into the buffer. Now do you understand?
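
    For comparison, a sketch of the same function with the range check removed and a default case instead. The name char2html_switch and the 0/1 return convention are assumptions for this example; dest is assumed to have room for at least 26 bytes, the longest replacement plus its terminator.

```c
#include <string.h>

/* Sketch of char2html with the range check removed, as suggested above.
 * Returns 0 if dest was filled with a replacement, 1 otherwise.
 * The return convention is an assumption made for this example. */
int char2html_switch(int data, char *dest)
{
    switch (data) {
    case '\n': strcpy(dest, "<P>");   return 0;   /* '\n' is 10 in ASCII */
    case '\t': strcpy(dest, "&#32;&#32;&#32;&#32;&#32;"); return 0;
    case ' ':  strcpy(dest, "&#32;"); return 0;
    case '<':  strcpy(dest, "&#60;"); return 0;
    case '>':  strcpy(dest, "&#62;"); return 0;
    default:   return 1;              /* no replacement for this value */
    }
}
```

    With no outer if in the way, the '\n' case is reached and "<P>" ends up in the buffer.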

    Quzah.

  12. #12
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    Oh, sorry about that quzah, it was too obvious.

    I added some more code to my "header" and I can't get a warning to go away. Here's the file.

    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main() {
        char temp[] = "\ntesting";
        char temp2[50] = { '\0' };
        
        string2html(temp, temp2);
        printf("%s", temp2);
    
    	return 0;
    }
    
    int string2html(char *data, char *dest) {
        int x = 0;
        
        while(data[x] != '\0') {
            char2html(data[x], dest);
            
            ++x;
        }
        
        return 0;
    }
    
    int char2html(int data, char *dest) {
    	if( (data == 10 || data == 9) || (data > 31 && data < 128) ) { /* Check for valid data. */
    		switch(data) {
    			/* Escape characters. */
    			case '\n':
    				strcpy(dest, "<P>");
    				break;
    			case '\t':
    				strcat(dest, "&#32;&#32;&#32;&#32;&#32;");
    				break;
    				
    			/* Normal ASCII. */
    			case ' ': /* Attempt to keep formatting. */
    				strcat(dest, "&#32;");
    				break;
    			case '<': /* Can cause problems. */
    				strcat(dest, "&#60;");
    				break;
    			case '>': /* Can cause problems. */
    				strcat(dest, "&#62;");
    				break;
                default:
                    strcat(dest, data);
    		}
    	} else 
    		return 1; /* Error. */
    
    	return data;
    }
    The "strcat(dest, data);" part in my default case gives me a warning of "[Warning] passing arg 2 of `strcat' makes pointer from integer without a cast". How can I get it to go away? How are my functions going now?

    quzah, I'm sure you're not going to like the "if( (data == 10 || data == 9) || (data > 31 && data < 128) )" part, but I don't like the idea of not attempting to check everything.

  13. #13
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Originally posted by mart_man00
    quzah, I'm sure you're not going to like the "if( (data == 10 || data == 9) || (data > 31 && data < 128) )" part, but I don't like the idea of not attempting to check everything.
    There is no point in having your if check. You're just making your program do unneeded work. Consider the following:
    Code:
    if( x < 10 )
    {
        switch( x )
        {
            case 11: printf("Do something."); break;
        }
    }
    Ok, so we have two checks. We check to see if it's less than 10, and only if it is do we then check to see if it's 11. If we pass x as 11, it does one check and quits, so the case never runs.
    Code:
    switch( x )
    {
        case 11: printf("Do something."); break;
        default: printf("Invalid.");
    }
    There ya go. No need for the if check. You are checking it. You're checking it against each case, you don't need to do the additional checking, because the switch encompasses it.

    Either way works, there just is no need for the if check.

    The reason you get the warning is because you're using strcat wrong. Strcat does not take integers. It takes character pointers as arguments.
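
    A minimal sketch of one way to append a single character with strcat: put it in a two-element array with a terminator so that it really is a string. The helper name append_char is made up for this example, and the caller is assumed to have left room in dest.

```c
#include <string.h>

/* Hypothetical helper: append the single character c to the string in
 * dest. The caller must make sure dest has room for one more character. */
void append_char(char *dest, int c)
{
    char tmp[2];

    tmp[0] = (char)c;   /* the character itself */
    tmp[1] = '\0';      /* terminator, so tmp is a valid one-character string */
    strcat(dest, tmp);
}
```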

    Quzah.

  14. #14
    Registered User
    Join Date
    Jul 2002
    Posts
    913
    What can I do about my strcat error? I can't change the int data to a char data without more warnings, and then I'd have to use extra functions to compare. Why won't strcat let me get away with a character? Even a cast doesn't work.

  15. #15
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Originally posted by mart_man00
    What can I do about my strcat error? I can't change the int data to a char data without more warnings, and then I'd have to use extra functions to compare. Why won't strcat let me get away with a character? Even a cast doesn't work.
    Use the function correctly, that's what you do about it. You apparently don't know what strcat does. Let me illustrate:

    strcat( buffer, thisstring );

    This will append thisstring to the end of the buffer. The buffer must have enough room to actually append the string to it.

    You cannot just randomly assign different parameters to functions. You must give a function what it expects. An integer is not treated like a string. If you must pass a number to it, you should turn it into a string first, something like:

    sprintf( temp, "%d", number );
    strcat( buffer, temp );

    Which will basically turn the number into a string (temp being a char array big enough to hold the digits), and then pass that string as the argument to the function. Note that atoi goes the other way: it turns a string into a number.

    Again, you must give a function exactly what it needs for parameters. You cannot simply typecast a number as a string and expect it to work.

    Keep in mind that in C there are no true strings, so in effect what happens is that you, by typecasting, are saying that the number is a memory address which points to a "string".

    That's why it doesn't work.
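
    Pulling the points in this thread together, here is one possible shape for the whole-string conversion. It is only a sketch: the name string2html_safe, the destsize parameter, and the 0/1 return values are assumptions for this example. Ordinary characters are turned into one-character strings before being appended, and every append is checked against the size of the destination.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: escape src into dest (capacity destsize, including the
 * terminator). Returns 0 on success, 1 if dest is too small. The name,
 * signature and return values are assumptions for illustration. */
int string2html_safe(const char *src, char *dest, size_t destsize)
{
    size_t used = 0;

    if (destsize == 0)
        return 1;
    dest[0] = '\0';

    for (; *src != '\0'; ++src) {
        char one[2];
        const char *piece;
        size_t len;

        switch (*src) {
        case '\n': piece = "<P>"; break;
        case '\t': piece = "&#32;&#32;&#32;&#32;&#32;"; break;
        case ' ':  piece = "&#32;"; break;
        case '<':  piece = "&#60;"; break;
        case '>':  piece = "&#62;"; break;
        default:
            one[0] = *src;      /* ordinary character: build a 1-char string */
            one[1] = '\0';
            piece = one;
            break;
        }

        len = strlen(piece);
        if (used + len + 1 > destsize)
            return 1;           /* not enough room; dest holds what fit */
        memcpy(dest + used, piece, len + 1);  /* copies the terminator too */
        used += len;
    }
    return 0;
}
```

    Called on the test string from earlier in the thread, "\ntesting", with a 50-byte buffer, this leaves "<P>testing" in the buffer.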

    Quzah.
