Some odd characters showing on web page



DavidP
07-02-2007, 11:51 PM
I've been working on my web page...and there are some odd characters showing up on one of the pages (right below the main picture):

http://students.cs.byu.edu/~dpru/index.php?page=1

I was just wondering if anyone knows what might cause something like this. I looked at all the invisible characters/whitespace characters in the file while editing it...and I can't find any reason why these characters are displayed.

vart
07-03-2007, 12:27 AM
Do you have a screenshot? I do not see any problem in IE 6.0.

PS. Actually, I see it after clicking on the Programming tab... I cannot find a reason for it.

vart
07-03-2007, 12:39 AM
On second thought:
Here is the hex display of the source page:


00000005E0: 20 20 3C 21 2D 2D 20 4D │ 41 49 4E 20 50 41 47 45 <!-- MAIN PAGE
00000005F0: 20 41 52 45 41 20 48 45 │ 52 45 20 2D 2D 3E 0D 0A AREA HERE -->♪◙
0000000600: 20 20 20 20 0D 0A 09 09 │ EF BB BF 3C 62 72 3E 3C ♪◙○○я╗┐<br><
0000000610: 62 72 3E 0D 0A 3C 62 3E │ 50 72 6F 6A 65 63 74 73 br>♪◙<b>Projects
0000000620: 3C 2F 62 3E 0D 0A 0D 0A │ 3C 62 72 3E 3C 62 72 3E </b>♪◙♪◙<br><br>

See these 3 strange characters there?

hk_mp5kpdw
07-03-2007, 06:31 AM
I see some odd characters as well. You don't need to view the source in hex, however; it's right there, plain as day.


<!-- MAIN PAGE AREA HERE -->

ï»¿<br><br>
<b>Projects</b>

DavidP
07-03-2007, 10:01 AM
Yeah....the thing is...here is how the page is created:



<div id="main">

<!-- MAIN PAGE AREA HERE -->

<?php DisplayContent($_SESSION['CurrentPage']); ?>

</div>




function DisplayContent ( $pageNumber )
{
    switch ( $pageNumber )
    {
        case 0:
            include ( "main_content.php" );
            break;
        case 1:
            include ( "programming_content.php" );
            break;
        case 2:
            break;
        case 3:
            break;
        case 4:
            break;
        case 5:
            break;
    }
}


programming_content.php is just straight HTML...and it doesn't contain those weird characters before the script executes. index.php doesn't contain them either. For some reason they are inserted when the script executes.

BUT it doesn't happen for page 0 (the main page), just page 1 (the programming page).

kroiz
07-04-2007, 12:53 AM
I am not a web programmer, but isn't there a way to define the page's character set?
Like somehow declaring in the header that the page is UTF-8?

Salem
07-04-2007, 01:14 AM
> EF BB BF
http://en.wikipedia.org/wiki/Byte_Order_Mark
My guess is the file you include has the UTF-8 'BOM' in it.

If that file really does contain UTF-8, then the BOM needs to be propagated to the whole document (don't ask me, I only work here).

If there is nothing in the file which needs to be UTF-8 encoded, then edit the file with a regular text editor and see if you can't get rid of the redundant BOM marker at the start of the file.
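To test that guess without opening each file in a hex viewer, something like the following might do. It is only a sketch: has_utf8_bom is a made-up helper name, not anything from the site's code, and it simply compares the first three bytes of a file against EF BB BF.

```php
<?php
// Hypothetical helper: true if the file starts with the UTF-8 BOM (EF BB BF).
function has_utf8_bom($path)
{
    // Binary mode so the bytes are read exactly as stored on disk.
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: treat as "no BOM found"
    }
    $prefix = fread($handle, 3);
    fclose($handle);
    return $prefix === "\xEF\xBB\xBF";
}
```

Calling it on main_content.php and programming_content.php should show which of the included files carries the marker.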

kroiz
07-04-2007, 01:24 AM
A text editor on Windows won't show the BOM. You need a hex editor.
Here is one:
http://www.softcircuits.com/cygnus/fe/

Just remove the EF BB BF bytes from the beginning of the file.
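If a hex editor is not handy, the same removal could probably be scripted. A minimal sketch, assuming the file is small enough to read into memory; strip_utf8_bom is a hypothetical name used only for illustration.

```php
<?php
// Hypothetical helper: returns $text without a leading UTF-8 BOM, if one is present.
function strip_utf8_bom($text)
{
    if (strncmp($text, "\xEF\xBB\xBF", 3) === 0) {
        // The cast covers older PHP versions, where substr() returns false
        // when nothing is left after the BOM.
        return (string) substr($text, 3);
    }
    return $text;
}
```

Reading the file with file_get_contents(), passing the result through this function, and writing it back would clean it in place. Keep a backup copy first.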

coder8137
07-04-2007, 01:43 AM
You should be able to just leave the BOM, if you like, and tell the browser that you're sending UTF-8. (This is because UTF-8 is the same as 7 bit ASCII (for the first 128 chars).) As you're using some version of Apache, just changing this line in your httpd.conf should be enough I believe:


AddDefaultCharset utf-8


If you do not have access to the httpd.conf, try putting the following on the actual web page, inside the <head> section:


<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />


Lastly, you should have a document type at the top of your web page anyways (as the first line, before the <html> tag). Something like:


<!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

Just google DOCTYPE for info.

Anyways, here's the header your server is sending:


HTTP/1.1 200 OK
Date: Wed, 04 Jul 2007 07:23:33 GMT
Server: Apache
X-Powered-By: PHP/4.3.9
Set-Cookie: PHPSESSID=59d7c361ef9ee693c1b6ebb3f923f811; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Connection: close
Content-Type: text/html; charset=ISO-8859-1
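
Since the page is generated by PHP, a third option (not one suggested above, just an alternative) might be to send the charset from the script itself with header(). This is a sketch: send_charset_header is a hypothetical helper, and it has to be called before any output is written, otherwise the headers have already been sent.

```php
<?php
// Hypothetical helper: declares the charset from PHP instead of httpd.conf or <meta>.
function send_charset_header($charset = 'utf-8')
{
    $line = 'Content-Type: text/html; charset=' . $charset;
    // header() only has an effect while no output has been sent yet.
    if (!headers_sent()) {
        header($line);
    }
    // Return the line so the caller can log or inspect what was requested.
    return $line;
}
```

With that in place, the last line of the response header above would be expected to read charset=utf-8 rather than ISO-8859-1, assuming nothing else in the server configuration overrides it.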

kroiz
07-04-2007, 04:54 AM
coder8137, that might not be enough because the BOM should be at the beginning of the document, and in his case it is not.

coder8137
07-04-2007, 06:53 AM
> coder8137, that might not be enough because the BOM should be at the beginning of the document, and in his case it is not.

It may depend on the browser, but I'm not sure. In Firefox 2.0.0.4 (Linux) it works fine (you can change the encoding under the View menu and test it). I can't comment on the others at the moment. But the Wikipedia article Salem linked says this:


> If a BOM is misinterpreted as an actual character within the text then it will generally be invisible due to the fact it is a zero-width no-break space.
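
To make the difference concrete, here is a small sketch of how the same three bytes read under each encoding. Both function names are made up for illustration. Read as ISO-8859-1, every byte is its own character (239, 187, 191, which are ï, » and ¿); read as UTF-8, the three bytes combine into the single code point U+FEFF.

```php
<?php
// ISO-8859-1 view: one character per byte, so the code points are the byte values.
function latin1_code_points($bytes)
{
    $points = array();
    for ($i = 0; $i < strlen($bytes); $i++) {
        $points[] = ord($bytes[$i]);
    }
    return $points;
}

// UTF-8 view of a single three-byte sequence (1110xxxx 10xxxxxx 10xxxxxx):
// keep the payload bits of each byte and join them into one code point.
function utf8_three_byte_code_point($bytes)
{
    return ((ord($bytes[0]) & 0x0F) << 12)
         | ((ord($bytes[1]) & 0x3F) << 6)
         |  (ord($bytes[2]) & 0x3F);
}
```

That is consistent with the characters being visible while the server declares ISO-8859-1 and disappearing once the encoding is switched to UTF-8 in the View menu.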

DavidP
07-04-2007, 11:33 AM
Well...I didn't feel like going into a hex editor and taking out those hex codes, so I just copied and pasted the HTML into a brand new text document, and everything is fine now :-)

coder8137
07-04-2007, 10:49 PM
Good to see that you put in the DOCTYPE, but, considering that I brought it up, I should probably warn you about the choice you made. XHTML is the latest and greatest, but not actually supported by IE 6 or even 7. Right now, you should (probably) be ok, because you're serving the XHTML as "text/html" (as opposed to the other 3 options: "application/xhtml+xml", "application/xml" and "text/xml" (some/all of which will cause a download file box to appear in IE)). However, this means you need to be careful not to actually use some of the advanced features of XHTML.

Also, to be standards-compliant for XHTML, you need an xmlns declaration in the <html> tag:


<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">


Note that I'm not a web designer, so I can't tell you much more about some of the cross-browser issues. Most people seem to actually do browser matching at the server side and then send out the document as text/html or application/* as appropriate. Here's the standard, by the way: http://www.w3.org/TR/xhtml1/. Note that the following line (mentioned in the standard) is optional (and will cause problems with the various versions of IE):


<?xml version="1.0" encoding="UTF-8"?>

CornedBee
07-05-2007, 03:05 AM
> Most people seem to actually do browser matching at the server side and then send out the document as text/html or application/* as appropriate.
No, not really. A few people do that, but very few. It's not worth the effort.

Just serve properly formatted, clean HTML 4.01 Strict and you get the best possible support from all browsers. (AFAICR, Mozilla recently fixed the problem that real XHTML wasn't progressively rendered. But I'm not sure if it's in any official release yet.) Only use XHTML (but real XHTML, served as application/xhtml+xml) when you want to do something fancy, like mixing XHTML and other XML languages such as MathML or SVG. Of course, you'll suffer lack of support from IE in this case.

coder8137
07-05-2007, 03:45 AM
I forgot to mention that you need to replace all of these


<br>

with these:


<br />

if you're going to stick with XHTML.





> > Most people seem to actually do browser matching at the server side and then send out the document as text/html or application/* as appropriate.
>
> No, not really. A few people do that, but very few. It's not worth the effort.

I was referring to those people who actually want to send out valid XHTML and have it work in browsers. But you're right, unless you need the features, it's probably more trouble than it's worth.