HTML has been in use since 1991, but HTML 4.0 (December 1997) was the first standardized version where international characters were given reasonably complete treatment. When an HTML document includes special characters outside the range of seven-bit ASCII two goals are worth considering: the information's integrity, and universal browser display.
Welcome to CWAnswers
CWAnswers is your guide to the sprawling world wide web. The directory aims to provide a useful guide made by users. You can share your knowledge as well - simply sign up and edit your first entry. For questions just contact the team at support - at - cwanswers.com.
Weblinks for Html Codes
Top 10 for Html Codes
Things about Html Codes you find nowhere else.
Select content modules
HTML has been in use since 1991, but HTML 4.0 (December 1997) was the first standardized version where international characters were given reasonably complete treatment. When an HTML document includes special characters outside the range of seven-bit ASCII two goals are worth considering: the information's integrity, and universal browser display.
The document character encoding
When HTML documents are served there are three ways to tell the browser what specific character encoding is to be used for display to the reader. First, HTTP headers can be sent by the web server along with each web page (HTML document). A typical HTTP header looks like this:
For HTML (not usually XHTML), the other method is for the HTML document to include this information at its top, inside the HEAD element.
XHTML documents have a third option: to express the character encoding in the XML preamble, for example
These methods each advise the receiver that the file being sent uses the character encoding specified. The character encoding is often referred to as the "character set" and it indeed does limit the characters in the raw source text. However, the HTML standard states that the "charset" is to be treated as an encoding of Unicode characters and provides a way to specify characters that the "charset" does not cover. The term code page is also used similarly.
It is a bad idea to send incorrect information about the character encoding used by a document. For example, a server where multiple users may place files created on different machines cannot promise that all the files it sends will conform to the server's specification — some users may have machines with different character sets. For this reason, many servers simply do not send the information at all, thus avoiding making false promises. However, this may result in the equally bad situation where the user agent displays the document incorrectly because neither sending party has specified a character encoding.
The HTTP header specification supersedes all HTML (or XHTML) meta tag specifications, which can be a problem if the header is incorrect and one does not have the access or the knowledge to change them.
Browsers receiving a file with no character encoding information must make a blind assumption. For Western European languages, it is typical and fairly safe to assume windows-1252 (which is similar to ISO-8859-1 but has printable characters in place of some control codes that are forbidden in HTML anyway), but it is also common for browsers to assume the character set native to the machine on which they are running. The consequence of choosing incorrectly is that characters outside the printable ASCII range (32 to 126) usually appear incorrectly. This presents few problems for English-speaking users, but other languages regularly — in some cases, always — require characters outside that range. In CJK environments where there are several different multi-byte encodings in use, auto-detection is often employed.



























